Data can be defined as any information gathered for future reference or analysis and can exist in various forms such as numbers, text, audio, pictures, or graphs. Data can be presented as raw data, summarized or compiled data, graphical data, or metadata. Regardless of how data is collected, stored, or presented, it must follow the ALCOA+ principles of Data Integrity to ensure that data is a true and accurate representation of the products created and processes followed by Life Sciences companies.
A – Attributable – must be traceable back to the person or system generating the data
L – Legible – must be readable and permanent
C – Contemporaneous – must be collected and recorded at the time generated
O – Original – must be the primary data collected or recorded for the first time
A – Accurate – must be error free, truthful, and reflective of the observation
Plus (+) – These attributes were added to the ALCOA framework to further define Data Integrity requirements
Complete – must contain all data observed or recorded and nothing should be removed, deleted or modified
Consistent – must be presented in a logical manner (chronological, sequential, etc.) and should include time and date stamps for each entry or set of entries
Enduring – must be stored in a manner that ensures the data is accurately reproducible for the period of time defined by regulatory requirements (Predicate Rule)
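To make these attributes more concrete, here is a minimal sketch of how a computerized system might capture several of them in an append-only record log. It is an illustration under stated assumptions, not a validated or regulatory-approved design: the names Entry, AuditLog, record, and verify are hypothetical, and the hash chain is just one possible way to detect changes to stored entries.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    seq: int                 # Consistent: entries are strictly sequential
    recorded_by: str         # Attributable: person or system generating the data
    recorded_at: str         # Contemporaneous: UTC timestamp taken at write time
    value: str               # Original: the value as first recorded
    corrects: Optional[int]  # Complete: a correction points at the entry it corrects
    prev_hash: str           # hash of the previous entry (empty for the first one)
    entry_hash: str = ""     # hash of all the fields above


def _digest(fields: dict) -> str:
    """SHA-256 over a stable JSON rendering of the entry fields."""
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AuditLog:
    """Hypothetical append-only log: entries are added, never updated or deleted."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def record(self, user: str, value: str, corrects: Optional[int] = None) -> Entry:
        if not user:
            raise ValueError("every entry must be attributable to a user or system")
        if corrects is not None and not (0 <= corrects < len(self._entries)):
            raise ValueError("a correction must reference an existing entry")
        fields = {
            "seq": len(self._entries),
            "recorded_by": user,
            "recorded_at": datetime.now(timezone.utc).isoformat(),
            "value": value,
            "corrects": corrects,
            "prev_hash": self._entries[-1].entry_hash if self._entries else "",
        }
        entry = Entry(**fields, entry_hash=_digest(fields))
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return False if any entry was altered, reordered, or removed mid-chain."""
        prev = ""
        for i, e in enumerate(self._entries):
            fields = asdict(e)
            stored = fields.pop("entry_hash")
            if e.seq != i or e.prev_hash != prev or _digest(fields) != stored:
                return False
            prev = stored
        return True


log = AuditLog()
first = log.record("analyst1", "pH 7.2")
# A correction is a new entry; the original reading stays in the log.
fix = log.record("analyst1", "pH 7.4", corrects=first.seq)
print(log.verify())
```

In this sketch a mistake is never overwritten. The corrected value is recorded as a new entry that references the original, so the full history remains available for review. Legibility and long-term retention (Enduring) depend on storage and archiving choices that this example does not cover.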
Physical – Is the data protected during collection, storage, summarization, and retrieval? If stored electronically, are the servers in a secure location? If stored physically (on paper), are those files protected from water, fire, or theft?
Entity – Is the database architecture designed to ensure that data is stored and used in a consistent manner? Do table links, primary keys, unique identifiers, and unique values ensure data is unique, consistent, and complete?
Domain – Do the properties of an individual table affect the values captured in that table? Are there constraints on the amount, length, type, or format of the values that can affect the accuracy of the data?
Referential – Is the data contained in a database or set of tables used in a logical and uniform manner? Do the rules defining table structure, linking, and retrieval ensure the meaning and intent of the data is not changed as a result of the database structure?
User Defined – Do the rules and restrictions defined by users align with business and regulatory requirements? When Referential and Entity controls cannot enforce specific requirements, user defined requirements may be implemented to ensure that data remains consistent and reliable.
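The Entity, Domain, Referential, and User Defined questions above can be illustrated with a small database sketch. The example below uses Python's built-in sqlite3 module. The table names (batch, test_result), the column names, and the "results are append-only" business rule are all assumptions made up for illustration, and a real system would define its own rules from its business and regulatory requirements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE batch (
    batch_id   INTEGER PRIMARY KEY,          -- Entity: each row is uniquely identified
    lot_number TEXT NOT NULL UNIQUE          -- Entity: no duplicate lot numbers
);
CREATE TABLE test_result (
    result_id   INTEGER PRIMARY KEY,
    batch_id    INTEGER NOT NULL
                REFERENCES batch(batch_id),  -- Referential: must point at a real batch
    ph          REAL NOT NULL
                CHECK (ph BETWEEN 0 AND 14), -- Domain: only valid pH values
    recorded_by TEXT NOT NULL
                CHECK (length(recorded_by) > 0)
);
-- User Defined: a hypothetical business rule that results are never edited or deleted.
CREATE TRIGGER result_no_update BEFORE UPDATE ON test_result
BEGIN
    SELECT RAISE(ABORT, 'results are append-only');
END;
CREATE TRIGGER result_no_delete BEFORE DELETE ON test_result
BEGIN
    SELECT RAISE(ABORT, 'results are append-only');
END;
""")


def rejected(sql: str) -> bool:
    """Run one statement; return True if the database refused it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


# Valid data is accepted.
with conn:
    conn.execute("INSERT INTO batch (batch_id, lot_number) VALUES (1, 'LOT-001')")
    conn.execute(
        "INSERT INTO test_result (batch_id, ph, recorded_by) VALUES (1, 7.2, 'analyst1')"
    )

# Each statement below violates one kind of rule and should be refused.
checks = {
    "entity": rejected("INSERT INTO batch (lot_number) VALUES ('LOT-001')"),
    "domain": rejected(
        "INSERT INTO test_result (batch_id, ph, recorded_by) VALUES (1, 15.0, 'analyst1')"
    ),
    "referential": rejected(
        "INSERT INTO test_result (batch_id, ph, recorded_by) VALUES (99, 7.0, 'analyst1')"
    ),
    "user defined": rejected("UPDATE test_result SET ph = 7.0 WHERE result_id = 1"),
}
for rule, was_rejected in checks.items():
    print(rule, "violation rejected:", was_rejected)
```

Putting these rules in the database itself, rather than only in application code, means they apply no matter which program or user writes the data. Physical integrity (secure servers, backups, protection of paper records) is outside what a schema can enforce.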
Here are a couple of definitions from regulatory bodies:
FDA Publication, “Data Integrity and Compliance With Drug CGMP”
Data integrity refers to the completeness, consistency, and accuracy of data. Complete, consistent, and accurate data should be attributable, legible, contemporaneously recorded, original or a true copy, and accurate (ALCOA).
MHRA GxP Data Integrity Definitions and Guidance for Industry
Data integrity is the degree to which data is complete, consistent, accurate, trustworthy, and reliable, and that these characteristics of the data are maintained throughout the data life cycle. The data should be collected and maintained in a secure manner, so that they are attributable, legible, contemporaneously recorded, original (or a true copy) and accurate. Assuring Data Integrity requires appropriate quality and risk management systems, including adherence to sound scientific principles and good documentation practices.
The primary reason to be concerned with Data Integrity is that the products and processes that define the Life Sciences industry are based on the interpretation of data. If that data does not meet the ALCOA+ principles listed above, there is no assurance those products will meet the quality requirements of the consumer.
As reliance on computerized systems has grown, so too has the need for closer scrutiny of these systems. As a result, regulatory citations related to Data Integrity have increased significantly. Here are a few examples of recent 483 citations and Warning Letters issued by the FDA.
In our next post, we will talk about the benefits of maintaining Data Integrity and the main causes of Data Integrity failures.