
This glossary provides clear definitions for the core terms and concepts used across the QUANTUM project, based on the EHDS Regulation and project documentation.

 


A

Accessibility

Accessibility refers to the dataset being accompanied by clear and transparent access and usage conditions.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 

Accuracy

Accuracy refers to the degree to which observations correctly describe what they were designed to measure.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 

Article 78 of the EHDS Regulation

Article 78 establishes a Union data quality and utility label that may be provided by data holders and is mandatory for certain publicly funded datasets made available for secondary use via Health Data Access Bodies. The label standardises structured information on datasets, covering documentation, technical data quality, data quality management processes, coverage, access conditions, and data enrichments. In the context of QUANTUM, this provision is key as it creates the regulatory basis for a harmonised description of data quality and utility, which QUANTUM operationalises through its label to support transparent and comparable assessment of datasets used for secondary use.
Source: EHDS Regulation, Article 78

 


C

Coherence 

Coherence is defined as the dimension that expresses how different parts of the dataset are uniform in their representation and meaning over time, such as formats, semantics (stability of the data models), and methods.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 

Consistency

Consistency refers to the degree to which data has attributes that are plausible and are uniform with other data and over time.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 

Completeness

Completeness refers to the degree to which all information that could be available is present in a particular dataset.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.
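
As an illustration only, completeness is sometimes approximated as the share of expected values that are actually present. The sketch below assumes that simplification; the function name `completeness`, the field names, and the sample records are hypothetical and are not taken from the QUANTUM deliverable.

```python
def completeness(records, expected_fields):
    """Share of expected field values that are present (not None or empty)."""
    total = len(records) * len(expected_fields)
    if total == 0:
        return 0.0
    present = sum(
        1
        for record in records
        for field in expected_fields
        if record.get(field) not in (None, "")
    )
    return present / total


# Hypothetical records: one missing birth year, one missing sex field.
records = [
    {"patient_id": "P1", "birth_year": 1980, "sex": "F"},
    {"patient_id": "P2", "birth_year": None, "sex": "M"},
    {"patient_id": "P3", "birth_year": 1975},
]
ratio = completeness(records, ["patient_id", "birth_year", "sex"])
print(f"Completeness: {ratio:.2f}")
```

A single ratio like this hides which variables are missing, so a per-field breakdown may be more informative in practice.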

 

Compliance

Compliance refers to the degree to which data has attributes that adhere to ethical standards, conventions, protocols, or regulations.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 

 


D

Data provenance

Data provenance means a description of the source of the data, including context, purpose, method and technology of data generation, documenting agents involved in the provenance of data, data validation routines, source data verification, traceability of changes, and quality control of data.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 

Data quality:

Data quality means the degree to which the elements of electronic health data are suitable for their intended primary use and secondary use.
Source: EHDS Regulation, Article 2(2) point (z)

 

Data quality and utility label:

Data quality and utility label means a graphic diagram, including a scale, describing the data quality and conditions of use of a dataset.
Source: EHDS Regulation, Article 2(2) point (aa)

 

Dataset:

A structured collection of electronic health data. (Source: EHDS Regulation, Article 2(2) point (w))

A dataset is a structured collection of (electronic health) data, published or curated by a single source, and available for access or download in one or more formats. In the context of the EHDS Regulation, access to such datasets must adhere to the principles of data minimisation and purpose limitation, ensuring that only the data relevant and necessary for the specified processing purpose are provided, in either anonymised or pseudonymised format depending on the feasibility of achieving the processing objectives.

Examples of datasets:

  • A simple dataset: tabular data, e.g. a single database or a CSV file, organised in rows and columns, where each row represents a single record or entity and each column represents a specific attribute or variable. This structure is commonly found in spreadsheets and relational databases, making it easy to store, query, and analyse. Tabular data is often used for structured datasets where relationships between variables are well defined.
  • A complex dataset: a set of three databases related to a specific cohort of cancer patients:
    • a relational database of the patients’ medical history;
    • a set of the patients’ cancerous cell images;
    • the logs of the microscope used to generate these images at the moment of the examination.
    These three databases can be used together via shared identifiers: the patient identifier appears in both the medical history and the cell images, and each image belongs to a batch whose identifier appears in both the images and the microscope logs.
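
The linkage in the complex dataset example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the field names (`patient_id`, `image_id`, `batch_id`), the function `link_records`, and all sample values are hypothetical and do not come from the EHDS Regulation or QUANTUM documentation.

```python
# Hypothetical records; all field names and values are illustrative assumptions.
medical_history = [
    {"patient_id": "P1", "diagnosis": "example diagnosis A"},
    {"patient_id": "P2", "diagnosis": "example diagnosis B"},
]
cell_images = [
    {"image_id": "IMG1", "patient_id": "P1", "batch_id": "B1"},
    {"image_id": "IMG2", "patient_id": "P2", "batch_id": "B1"},
    {"image_id": "IMG3", "patient_id": "P1", "batch_id": "B2"},
]
microscope_logs = [
    {"batch_id": "B1", "magnification": 40},
    {"batch_id": "B2", "magnification": 100},
]


def link_records(history, images, logs):
    """Attach the patient's history and the batch's microscope log to each image."""
    history_by_patient = {row["patient_id"]: row for row in history}
    log_by_batch = {row["batch_id"]: row for row in logs}
    linked = []
    for image in images:
        linked.append({
            "image_id": image["image_id"],
            # .get returns None when no matching record exists.
            "history": history_by_patient.get(image["patient_id"]),
            "log": log_by_batch.get(image["batch_id"]),
        })
    return linked


linked = link_records(medical_history, cell_images, microscope_logs)
for row in linked:
    print(row["image_id"], row["history"]["patient_id"], row["log"]["batch_id"])
```

The images act as the bridge here because they are the only component carrying both identifiers; the medical history and the microscope logs have no key in common.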

 

Dataset description:

A description in the form of metadata of the available datasets and their characteristics.
Source: EHDS Regulation, Article 77(1)

 


E

European Health Data Space (EHDS):

An EU‑wide framework established by Regulation (EU) 2025/327 that sets common rules, standards, and infrastructure for the secure access, sharing, and reuse of electronic health data across Member States. It supports both primary use (healthcare delivery and patient access) and secondary use (research, innovation, policy‑making, and regulation), while ensuring strong data protection, interoperability, and governance.
Source: EHDS Regulation

 


F

Fitness-for-purpose (F4P):

Fitness-for-purpose refers to the actual utility of an accessed dataset according to the objective of the health data request. The information necessary to evaluate fit-for-purpose depends on the user’s experience. This information needs to be added to the description of a dataset as fit-for-purpose metadata.

Source: QUANTUM deliverable 1.2 Specification for the assessment of data holders' maturity

 

Fitness-for-use (F4U):

Fitness-for-use is equivalent to a dataset's “potential” utility. It is typically conveyed by information that describes the dataset in a standard manner, such as source, provenance or lineage, content, and distribution. The actual utility can only be informed by the user experience; in QUANTUM, the actual utility is referred to as fitness-for-purpose (see above).

Source: QUANTUM deliverable 1.2 Specification for the assessment of data holders' maturity

 


H

Health Data Access Body (HDAB):

An HDAB is a Member State-designated authority that facilitates the secondary use of electronic health data. HDABs assess the information provided by the health data applicant and decide on health data requests and access applications, authorise and issue data permits, obtain data from data holders, and make data available in secure processing environments. HDABs systematically track the data requests and data access applications received and the data permits issued.
Source: EHDS Regulation, Article 55 and Recital 52

 

Health Data Holder:

Any person, organisation, or public body that has the right to process electronic health data in the context of healthcare, care services, health‑related products, wellness applications, or health research. Health data holders may process data for healthcare provision, public health, reimbursement, research, policy‑making, official statistics, or patient safety. This includes, for example, hospitals, insurers, research institutes, and EU institutions.
Source: EHDS Regulation, Article 2(2)(t)

 

Health Data User:

Any natural or legal person that has been granted access to electronic health data for secondary use under the EHDS. Health data users may include researchers, innovators, public authorities, regulators, statistical bodies, or industry actors. They must submit a data access application to the relevant Health Data Access Body (HDAB), demonstrate a permitted purpose, and process the data only within secure processing environments in accordance with the EHDS Regulation.
Source: EHDS Regulation, Article 2(2)(u)

 

HealthData@EU:

HealthData@EU is the EU-level infrastructure established under the European Health Data Space to enable the cross-border discovery, access, and reuse of health data for secondary use. It connects national Health Data Access Bodies through a federated architecture, providing common services, standards, and procedures to support secure data sharing for research, innovation, and policy-making. HealthData@EU requires consistent and transparent information on data quality across countries. QUANTUM contributes to this aspect by developing a harmonised Data Quality and Utility Label, enabling data users to understand key characteristics and limitations of datasets accessed via HealthData@EU, and supporting data holders in aligning with common quality expectations.

 


M

Maturity (of Health Data Holders):

In QUANTUM, maturity is a feature at the data holder level. It refers to the level of automation of data and data quality management procedures.

 

Metadata:

A structured description of the contents or the use of data facilitating the discovery or use of that data.
Source: Data Act, Article 2(2)

 

Metadata scope

Metadata scope refers to the availability, comprehensiveness, and level of detail of the metadata and data dictionary that help users understand the data being used.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 


P

Population coverage

Population coverage refers to the degree to which a dataset includes the potential eligible population.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.
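
One simple way to express this degree, assumed here for illustration and not prescribed by the QUANTUM deliverable, is the fraction of the eligible population that appears in the dataset. The function name `population_coverage` and the identifiers below are hypothetical.

```python
def population_coverage(included_ids, eligible_ids):
    """Fraction of the eligible population that is present in the dataset."""
    eligible = set(eligible_ids)
    if not eligible:
        raise ValueError("eligible population is empty")
    # Only count included individuals who are actually eligible.
    covered = eligible & set(included_ids)
    return len(covered) / len(eligible)


# Hypothetical identifiers: P9 is in the dataset but not in the eligible population.
coverage = population_coverage(
    included_ids=["P1", "P2", "P9"],
    eligible_ids=["P1", "P2", "P3", "P4"],
)
print(f"Population coverage: {coverage:.0%}")
```

A high value does not by itself imply representativity, since the individuals who are covered may still differ systematically from those who are not.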

 

Population representativity

Population representativity refers to the degree to which the data adequately represent the population in question.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 

Precision

Precision refers to the degree of approximation by which data can represent reality.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.

 


S

Secondary use of health data:

Processing of electronic health data for the purposes set out in Chapter IV of the EHDS Regulation, other than the initial purposes for which they were collected or produced.
Source: EHDS Regulation, Article 2(2)(e)

In the context of the European Health Data Space, secondary use refers to the processing of health data for purposes other than the direct care of an individual patient, such as research, innovation, public health, policy-making, regulatory activities, and health system planning, under defined legal and governance conditions. Because these data were not originally collected for such uses, their characteristics, limitations, and level of quality can vary significantly; QUANTUM focuses specifically on this context by developing a harmonised approach to describing and assessing data quality and utility for secondary use, supporting transparent and informed reuse across settings.

 


U

Utility:

Utility refers to how well the data supports its intended use, such as syntactical testing, analytical tasks, decision-making, or machine learning model performance. In the context of anonymised and synthetic data, high utility means that insights, predictions, or outcomes derived from the data closely match those obtained using the original data.

 


V

Validity

Validity refers to the degree to which representations of data in a dataset conform to the specification of a data model.
Source: QUANTUM deliverable 1.1 Specification of the data sets' quality and utility label.