With the integration of automated workflows and the rapid expansion of AI in research, the acceleration of science has become heavily dependent on open and streamlined access to research data. This has made good, consistent data management imperative. As the objective of making data not only open but FAIR (i.e., findable, accessible, interoperable, and reusable) has gained prominence, the creation of a Data Management Plan (hereinafter “DMP”) has become an obligation. A well-prepared DMP is now one of the core prerequisites for raising awareness of research data, and its importance is underlined by the EC's requirement that a DMP be submitted within the first six months of European Research Council and Horizon Europe projects.

In the quest to make research data findable, accessible, interoperable, and reusable, a well-maintained DMP includes, among other things, information on the following:

  • the handling of research data during and after the end of the project;
  • what data is to be collected, processed and/or generated;
  • the methodology and standards to be applied;
  • whether data is to be shared/made open access;
  • the manner of curation and preservation of data, including after the end of the project.

More specifically, the FAIR Guiding Principles are set out below.

“To be Findable:

  • F1. (meta)data are assigned a globally unique and persistent identifier;
  • F2. data are described with rich metadata (defined by R1 below);
  • F3. metadata clearly and explicitly include the identifier of the data it describes;
  • F4. (meta)data are registered or indexed in a searchable resource.

To be Accessible:

  • A1. (meta)data are retrievable by their identifier using a standardized communications protocol;
  • A1.1 the protocol is open, free, and universally implementable;
  • A1.2 the protocol allows for an authentication and authorization procedure, where necessary;
  • A2. metadata are accessible, even when the data are no longer available.

To be Interoperable:

  • I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation;
  • I2. (meta)data use vocabularies that follow FAIR principles;
  • I3. (meta)data include qualified references to other (meta)data.

To be Reusable:

  • R1. meta(data) are richly described with a plurality of accurate and relevant attributes;
  • R1.1. (meta)data are released with a clear and accessible data usage license;
  • R1.2. (meta)data are associated with detailed provenance;
  • R1.3. (meta)data meet domain-relevant community standards.”
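To make the principles more tangible, the following is a minimal Python sketch of how a metadata record could be screened for obvious gaps against a handful of them. It is an illustration only and not an official FAIR assessment tool: the `MetadataRecord` fields, the `fair_gaps` function, the example identifier, and the choice of HTTP(S) as the "open" protocol are all assumptions made for this example, and a non-empty field says nothing about the quality of its content.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical, simplified metadata record for a single dataset."""
    identifier: str = ""        # persistent identifier, e.g. a DOI (F1)
    title: str = ""             # descriptive metadata (F2)
    description: str = ""       # descriptive metadata (F2)
    access_protocol: str = ""   # how the record is retrieved (A1)
    license: str = ""           # data usage license (R1.1)
    provenance: str = ""        # how the data came about (R1.2)
    indexed_in: list[str] = field(default_factory=list)           # F4
    related_identifiers: list[str] = field(default_factory=list)  # I3


# Assumption for this sketch: HTTP(S) counts as an open, standardized protocol.
OPEN_PROTOCOLS = {"http", "https"}


def fair_gaps(record: MetadataRecord) -> list[str]:
    """Return the labels of the principles for which a field is visibly missing."""
    checks = {
        "F1": bool(record.identifier),
        "F2": bool(record.title and record.description),
        "F4": bool(record.indexed_in),
        "A1": record.access_protocol.lower() in OPEN_PROTOCOLS,
        "I3": bool(record.related_identifiers),
        "R1.1": bool(record.license),
        "R1.2": bool(record.provenance),
    }
    return [label for label, passed in checks.items() if not passed]


# Usage: a record with an identifier and a license, but no index entry,
# no links to related (meta)data, and no provenance statement.
record = MetadataRecord(
    identifier="doi:10.1234/example",  # placeholder, not a real DOI
    title="Example dataset",
    description="Illustrative record used only for this sketch.",
    access_protocol="https",
    license="CC-BY-4.0",
)
print(fair_gaps(record))
```

A presence check of this kind can only flag what is missing. Whether the metadata are "rich", or whether they meet domain-relevant community standards (R1.3), still calls for human judgment.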

All in all, the core rationale behind the principles is to set out the characteristics that data resources, tools, vocabularies, and infrastructures should exhibit in order to allow discovery and reuse by third parties, i.e., to render data reusable for those who did not take part in generating it. The relevance of good data management is further reinforced by the fact that it allows not only for the proper storage of data related to published discoveries, but also for building further research knowledge on previously generated data.

It should be noted that although the FAIR principles are at the forefront of a good DMP, they are not the only aspect to consider when ensuring good data management. The document must also provide a specific data summary and address security as well as legal/ethics aspects, namely:

  • stating the purpose of the data collection/generation;
  • explaining the relation to the objectives of the project;
  • specifying the types and formats of data generated/collected;
  • specifying whether existing data is being re-used;
  • specifying the origin of the data;
  • stating the expected size of the data;
  • outlining the data utility, i.e., to whom it will be of use;
  • addressing data security, in terms of data recovery as well as secure storage and transfer of sensitive data;
  • covering legal/ethics aspects that may have an impact on data sharing, in the sense of an ethics review, reflecting the ethics section of the description of action of the given project as well as any deliverables/requirements dedicated to ethics considerations.

The relevance of maintaining a FAIR DMP within a Horizon Europe project, and beyond its term, has thus become obvious. After all, as argued by Wilkinson, Dumontier, and colleagues in their publication The FAIR Guiding Principles for scientific data management and stewardship:

“[G]ood data management is not a goal in itself, but rather is the key conduit leading to knowledge discovery and innovation, and to subsequent data and knowledge integration and reuse by the community after the data publication process”.

*The drafting of this blog post is funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the granting authority can be held responsible for them.