We live in an era of big data. Major corporations around the world compete to collect data about their customers in order to serve them better and stay ahead of their rivals, and many of the most sought-after skills in information technology lie in data science and analytics. Despite this worldwide emphasis on data and its processing, tribology research still lacks access to reliable experimental data and standards. Findable, Accessible, Interoperable and Reusable (FAIR) data would greatly help the tribology research community collaborate effectively and produce quality research. The difficulty of generating FAIR data in tribology in particular lies in the breadth of the experiments involved. Most experiments are situation specific and strongly influenced by the setup used. Apart from the obvious differences between experiments, the scientific background of the individuals interpreting the data can also play a key role.
Generating FAIR data will not only ease collaboration between researchers but also simplify data filtering. Predefined standards in the database will give users data they can trust, spare them the painstaking process of data collection, and add value to their research by increasing data trustworthiness. This process can be sped up many times over by adding a machine learning algorithm. Previous research has reported a standard deviation as low as 14% across different experiments. The major influencing factors are the type of setup used, differences in specimen preparation, the operating conditions and the experience of the operator. Two probable solutions are to limit the scope of the ontology to narrow, well-defined boundaries so that experiments are better segregated, and to remove the influence of the experiment operator.
Fig. 1. Component diagram of FAIR data protection.
Importance of Ontology and ELN
Using electronic lab notebooks (ELNs) is arguably a solid proposal in this research for streamlining data handling. An ELN reduces human intervention in recording data and saves a considerable amount of time. Most importantly, it can link all associated metadata to the data, helping authors find useful references. It can also assign unique identifiers to objects for easy sorting and retrieval. Additionally, researchers can publish their observed data without any additional effort.
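To make the idea of identifiers and linked metadata concrete, here is a minimal Python sketch of what a single ELN entry might look like. It does not reflect the ELN used in the research: the class name ElnRecord, its fields and the sample values are all assumptions chosen for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ElnRecord:
    """Hypothetical ELN entry: one object (specimen, run, ...) plus its metadata."""

    kind: str
    description: str
    metadata: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # identifiers of related records
    # A random UUID gives every record a unique identifier without manual effort.
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))
    created: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def link_to(self, other: "ElnRecord") -> None:
        """Relate this record to another one by storing its identifier."""
        if other.identifier not in self.links:
            self.links.append(other.identifier)

    def to_json(self) -> str:
        """Serialise the record so it can be shared or published."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Made-up sample values, for illustration only.
specimen = ElnRecord("specimen", "disc specimen", {"material": "steel"})
run = ElnRecord("experiment", "pin-on-disc run", {"normal_load_N": 5.0})
run.link_to(specimen)
print(run.to_json())
```

Because each record carries its own identifier and the identifiers of the records it relates to, a run can always be traced back to the specimen it used, which is the kind of bookkeeping an ELN is meant to take off the researcher's hands.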
Building a reliable ontology to sort and map data is an important feature of any ELN. The data considered should be interpolable, so that accuracy improves even with few data points. This can be achieved by providing a scalable environment. Above all, the ontology should support building a knowledge graph from the input data. Building an ontology and its vocabulary requires linguistic experience and curation. Generality and specificity have to be balanced by picking the right terms, and attention to detail while building the class hierarchy is crucial. In a nutshell, an ontology for a particular type of experiment is an essential part of generating FAIR data, and building one requires experts’ help and experience.
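The relationship between a class hierarchy and a knowledge graph can be sketched in a few lines of Python. This is a toy illustration rather than the ontology from the research: the class names, the predicates is_a and uses_specimen, and the individuals run_001, disc_A and pin_B are all hypothetical.

```python
from collections import defaultdict

# Hypothetical class hierarchy, stored as child -> parent.
SUBCLASS_OF = {
    "TribologicalExperiment": "Experiment",
    "PinOnDiscExperiment": "TribologicalExperiment",
    "Disc": "Specimen",
    "Pin": "Specimen",
}


def ancestors(cls):
    """Return all superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


class KnowledgeGraph:
    """Minimal store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))
        self._by_subject[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """All objects linked to subject through predicate, sorted."""
        return sorted(o for p, o in self._by_subject[subject] if p == predicate)

    def instances_of(self, cls):
        """Individuals typed as cls or as any of its subclasses, sorted."""
        return sorted(
            s
            for s, p, o in self.triples
            if p == "is_a" and (o == cls or cls in ancestors(o))
        )


kg = KnowledgeGraph()
kg.add("run_001", "is_a", "PinOnDiscExperiment")
kg.add("disc_A", "is_a", "Disc")
kg.add("pin_B", "is_a", "Pin")
kg.add("run_001", "uses_specimen", "disc_A")
kg.add("run_001", "uses_specimen", "pin_B")

print(kg.instances_of("Specimen"))
print(kg.objects("run_001", "uses_specimen"))
```

The balance between generality and specificity shows up directly in the hierarchy: a query for the general class Specimen finds both the disc and the pin, while each individual keeps its more specific type.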
A showcase pin-on-disc experiment was conducted as part of this research to record FAIR data. The infrastructure used in this experiment significantly reduced the effort of data entry and documentation. Moreover, publishing the data is now easier, which opens up numerous possibilities for cross-collaboration and future investigations on similar setups. The level of detail required in descriptions, data and metadata is up to the domain experts. The data provided must be sufficient to be included in the database and handled there. Until hard rules, restrictions and regulations are in place, this responsibility rests on experienced experts in the particular field.
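One way to picture the sufficiency check described above is a small validation step that runs before a record is accepted into the database. The following Python sketch assumes a made-up list of required fields; in practice the domain experts would decide which fields and types are required, and none of the names below come from the research itself.

```python
# Hypothetical requirements per experiment type: field name -> accepted type(s).
REQUIRED_FIELDS = {
    "pin_on_disc": {
        "normal_load_N": (int, float),
        "sliding_speed_m_per_s": (int, float),
        "specimen_id": str,
    }
}


def check_record(kind, record):
    """Return a list of problems; an empty list means the record is acceptable."""
    problems = []
    for name, expected in REQUIRED_FIELDS[kind].items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


# Made-up records, for illustration only.
complete = {
    "normal_load_N": 5.0,
    "sliding_speed_m_per_s": 0.1,
    "specimen_id": "disc_A",
}
incomplete = {"normal_load_N": "5", "specimen_id": "disc_A"}

print(check_record("pin_on_disc", complete))
print(check_record("pin_on_disc", incomplete))
```

Keeping the requirements in a plain table like REQUIRED_FIELDS means experts can tighten or relax them over time without touching the checking logic.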
Fig. 2. Outline of TriboFAIR ontology
The proposed framework requires intense collaboration from both internal and external users. The showcase experiment is a useful example that lays out the important features of using FAIR data. The descriptions of ontology development and ELN use can serve as a reference for future users to build and use FAIR data effectively within the scientific community. Overall, FAIR data generation could be a major advantage for collaboration and referencing, which ultimately helps the advancement of science.
Generating FAIR research data in experimental tribology, Nikolay T. Garabedian, Paul J. Schreiber, Nico Brandt, Philipp Zschumme, Ines L. Blatter, Antje Dollmann, Christian Haug, Daniel Kümmel, Yulong Li, Franziska Meyer, Carina E. Morstein, Julia S. Rau, Manfred Weber, Johannes Schneider, Peter Gumbsch, Michael Selzer & Christian Greiner. https://doi.org/10.1038/s41597-022-01429-9