A Wish List for ICH Q2(R2), the Revision of the Guideline on Method Validation
The ICH guideline on method validation, Q2(R1), is currently being updated, and I thought it might be interesting to put together a wish list of the changes that I would like to see in the new version. Rather than a wide-ranging treatment of all issues related to analytical method validation, such as how it fits into the lifecycle management of the analytical method, or a discussion of analytical quality by design or measurement uncertainty, I have confined myself in this article to changes in the current text that would result in a more up-to-date, and scientifically and statistically sound, guideline.
These suggestions are based on experience of using the guideline from when it was first published. Part I (Parent Guideline) was published in 1994 and was then followed up with Part II (Methodology) in 1996. During my years working as an analytical chemist in the pharmaceutical industry I completed many method validation studies, and for the last 13 years I have worked as a trainer and consultant in the area of chemical analysis of pharmaceuticals and biopharmaceuticals. This has involved delivering numerous training courses on the topic of method validation, and providing advice to customers on how to approach method validation studies.
At the outset it is important to make clear that, although I am suggesting changes to the guideline, in my opinion it is an excellent resource. The original working party achieved great success in finding agreement on a difficult topic, resulting in a document which has standardised the way we approach method validation across our industry and throughout the world. Many of my suggested changes are due to the age of the guideline, Part I now being 26 years old, and reflect changes both in the terminology used in analytical chemistry and in the current requirements of GMP in quality control laboratories. I haven’t tried to prioritise the list, so the order of the items does not reflect their importance.
1. Rename ‘validation characteristics’ as ‘method performance characteristics’ in line with current practice, thus recognising that the characteristics apply to analytical methodology from the beginning of method development and throughout the method lifecycle.
2. Change the characteristic heading, ‘specificity’, to ‘selectivity’ but retain definitions of both terms in the text. As per the note in USP <1225>, ‘Other reputable international authorities (IUPAC, AOAC-I) have preferred the term “selectivity”, reserving “specificity” for those procedures that are completely selective.’
Very few analytical methods are specific and, although specificity is desirable for identification methods, for quantitative methods, appropriate selectivity is sufficient (i.e., any interferences present are not considered to be significant).
3. Insert information regarding validation of stability indicating methods. For example, this is stated well in the FDA Guidance for Industry, Analytical Procedures and Methods Validation for Drugs and Biologics: ‘If a procedure is a validated quantitative analytical procedure that can detect changes in a quality attribute(s) of the drug substance and drug product during storage, it is considered a stability-indicating test. To demonstrate specificity of a stability-indicating test, a combination of challenges should be performed. Some challenges include the use of samples spiked with target analytes and all known interferences; samples that have undergone various laboratory stress conditions; and actual product samples (produced by the final manufacturing process) that are either aged or have been stored under accelerated temperature and humidity conditions.’
The stress conditions provided in ICH Q2 are routinely used for the purposes of forced degradation studies performed to support the validation of stability indicating methods. However, this is not the purpose for which they are provided. They are provided to allow comparison with a well characterised second method as a means of demonstrating specificity. A second method is rarely available. In my experience, I have never come across a situation where one was available for an impurities method.
In the example provided for comparison to a second well characterised method, i.e., a pharmacopoeia method, I don’t believe that it would be appropriate to use this approach either under specificity, or accuracy where it is also mentioned. This is because the acceptance criteria that were applied, and the results that were obtained for the validation of a pharmacopoeia method are not available and therefore there is not enough information to assess the capability of the method being validated.
4. Change the sentence: ‘Peak purity tests may be useful to show that the analyte chromatographic peak is not attributable to more than one component (e.g., diode array, mass spectrometry)’ to: ‘Peak purity tests may be useful to investigate whether the analyte chromatographic peak is attributable to more than one component.’ The current text implies that peak purity tests can show that peaks are pure which is not the case.
5. Change the characteristic heading ‘linearity’ to ‘calibration’ and include more information on assessing the suitability of a calibration approach. Using the term ‘linearity’ as a header implies that it is the only important attribute of the method calibration. Although linearity is important, and most methods are developed using known linear relationships, it is the suitability of the calibration that should be evaluated, not just whether it is linear. In addition, methods related to biopharmaceuticals often do not have linear relationships.
6. Remove the requirement to submit the residual sum of squares. In isolation, this parameter does not provide particularly useful information.
7. Move the sentence: ‘an analysis of the deviation of the actual data points from the regression line may also be helpful for evaluating linearity’ to the previous paragraph where the evaluation of linearity is being discussed. Placing it after the requirements for submission makes it read like an afterthought, rather than one of the best ways of evaluating whether a relationship is linear.
8. Insert a recommendation regarding the evaluation of the significance of the intercept for single point calibration methods (which assume that the calibration is linear through zero).
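As a rough sketch of the checks described in points 7 and 8, the Python below fits an ordinary least-squares line, returns the residuals for inspection, and builds a confidence interval for the intercept. If the interval contains zero, the intercept is not statistically distinguishable from zero at that confidence level. The function names, the example data and the choice of a 95% interval are illustrative assumptions rather than anything taken from the guideline. The critical t value is passed in by the caller (3.182 for a two-sided 95% interval with 3 degrees of freedom) so that the example needs no external libraries.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit; returns slope, intercept and residuals."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals

def intercept_interval(x, y, t_crit):
    """Confidence interval for the intercept, given a critical t value
    for n - 2 degrees of freedom (supplied by the caller)."""
    n = len(x)
    _, intercept, residuals = fit_line(x, y)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    # Residual standard deviation with n - 2 degrees of freedom
    s_res = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    se_intercept = s_res * math.sqrt(1.0 / n + x_mean ** 2 / sxx)
    return intercept - t_crit * se_intercept, intercept + t_crit * se_intercept

# Hypothetical five-level calibration (concentration as % of nominal)
conc = [50, 75, 100, 125, 150]
response = [100.5, 149.2, 200.4, 249.6, 300.3]

slope, intercept, residuals = fit_line(conc, response)
low, high = intercept_interval(conc, response, t_crit=3.182)
print("residuals:", [round(r, 2) for r in residuals])
print("intercept interval contains zero:", low < 0 < high)
```

A residual pattern that changes sign systematically along the concentration axis would suggest curvature even when the correlation coefficient looks acceptable. Statistical significance is also only part of the picture; the size of the intercept relative to the response at the target concentration matters too.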
9. Revise the example provided for the range of a dissolution method. The suggestion of validating over the range 0-110% of the label claim is not practical since zero cannot be validated as the extreme of a range for an analytical method. The alternative would be to validate from the QL, which is unnecessarily complicated for a dissolution method that reports integer values of % label claim. It would make more sense to start at 1% or 5% for the example given.
10. Clarify that the range required for immediate release dissolution tests should cover all possible in-specification values as defined by stage 2 and 3 testing and then add ±20% to allow for the quantification of out of specification results.
11. In the range for the determination of an impurity, change the term ‘reporting level’ to ‘reporting threshold’ in line with ICH Q3.
12. Provide additional guidance for the situation described in ICH Q2 as: ‘if assay and purity are performed together as one test and only a 100% standard is used, linearity should cover the range from the reporting level of the impurities to 120% of the assay specification.’ Although it is true that this linear range should be evaluated, it is also true that regression analysis may ignore the low values in order to fit the higher values and distributing only five concentrations across this range may not be the best statistical approach.
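The concern in point 12 can be illustrated with a small sketch. The data below are invented: five levels spanning a hypothetical reporting threshold of 0.05% up to 120%, with response errors that grow with concentration. The code compares an unweighted least-squares fit with a 1/x² weighted fit by back-calculating the lowest standard. The weighting scheme is my own choice of comparison for illustration and is not something the guideline prescribes.

```python
def weighted_fit(x, y, weights):
    """Weighted least-squares line; equal weights give the ordinary fit."""
    w_sum = sum(weights)
    x_mean = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_mean = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    sxx = sum(w * (xi - x_mean) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_mean) * (yi - y_mean)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

def back_calc_error(nominal, observed, slope, intercept):
    """Relative error (as a fraction) of the back-calculated concentration."""
    found = (observed - intercept) / slope
    return abs(found - nominal) / nominal

# Invented data: concentration as % of the assay level, detector response
levels = [0.05, 0.5, 10.0, 80.0, 120.0]
responses = [52.0, 498.0, 10100.0, 79200.0, 121500.0]

unweighted = weighted_fit(levels, responses, [1.0] * len(levels))
weighted = weighted_fit(levels, responses, [1.0 / x ** 2 for x in levels])

err_unweighted = back_calc_error(levels[0], responses[0], *unweighted)
err_weighted = back_calc_error(levels[0], responses[0], *weighted)
print(f"lowest level, unweighted fit: {100 * err_unweighted:.0f}% error")
print(f"lowest level, 1/x^2 weighted fit: {100 * err_weighted:.1f}% error")
```

With these numbers the unweighted line is driven almost entirely by the two highest levels, so the lowest standard is recovered very poorly even though the overall fit would look excellent on a plot.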
13. Define accuracy using the ISO definition and introduce the ISO definition for ‘trueness’. USP <1225> has a note on this terminology: ‘The definition of accuracy in 〈1225〉 and ICH Q2 corresponds to unbiasedness only. In the International Vocabulary of Metrology (VIM) and documents of the International Organization for Standardization (ISO), “accuracy” has a different meaning. In ISO, accuracy combines the concepts of unbiasedness (termed “trueness”) and precision.’
If the ICH Q2 working party want to continue to use accuracy as a term for unbiasedness, then they will not be recognising that there is a difference between individual replicates (subject to random and systematic errors) and the mean of those replicates (an expression of the systematic error) in an accuracy study. The ISO definition allows a more complete understanding of the sources of error in the analytical method and thus contributes to the ideals of analytical quality by design (AQbD).
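To make the distinction concrete, here is a minimal sketch that separates the two components for a set of recovery results: the bias of the mean (trueness) and the relative standard deviation of the replicates (precision). The recovery values and the function name are hypothetical, and a true value of 100% recovery is assumed.

```python
import statistics

def trueness_and_precision(recoveries, true_value=100.0):
    """Return (bias of the mean, %RSD, error of each individual replicate)."""
    mean = statistics.mean(recoveries)
    bias = mean - true_value                        # systematic error
    rsd = 100.0 * statistics.stdev(recoveries) / mean  # random error
    individual_errors = [r - true_value for r in recoveries]
    return bias, rsd, individual_errors

# Hypothetical % recovery results from six replicate preparations
recoveries = [98.9, 99.4, 100.2, 99.1, 99.8, 99.6]

bias, rsd, individual_errors = trueness_and_precision(recoveries)
print(f"bias of mean: {bias:+.2f}%  RSD: {rsd:.2f}%")
print("largest individual error:", max(abs(e) for e in individual_errors))
```

The largest individual error exceeds the bias of the mean because each replicate carries both systematic and random error, which is the combination that the ISO sense of accuracy describes.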
14. Introduce guidance on experiments to be performed for intermediate precision in terms of minimum numbers of runs and replicates. I suspect that the requirements for intermediate precision experiments are not defined in the guideline because of a difference of opinion in the working party on the most appropriate approach, and it continues to be difficult to achieve agreement on what should be done. As a consultant, I see a wide variety of approaches taken. I think that a risk-based approach would be the most scientific way to define the experiments, but I fear that it would be interpreted by many as an opportunity to do as little as possible. Unfortunately, if too few intermediate precision experiments are performed, then the actual variability of the method is not known by the users and may not be fully understood until many OOS investigations later, when it is discovered that the method is less precise than originally thought.
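One way to see why the number of runs matters is to split the variability into within-run and between-run components. The sketch below does this with a one-way analysis of variance for a balanced design (the same number of replicates in every run). The design of three runs with three replicates and the results themselves are invented for illustration; they are not a recommended minimum.

```python
import math
import statistics

def precision_components(runs):
    """Repeatability SD and intermediate precision SD from a balanced
    design, using one-way ANOVA variance components."""
    n = len(runs[0])
    if any(len(run) != n for run in runs):
        raise ValueError("this sketch assumes equal replicates per run")
    run_means = [statistics.mean(run) for run in runs]
    # Within-run mean square: pooled variance of the replicates
    ms_within = statistics.mean([statistics.variance(run) for run in runs])
    # Between-run mean square: n times the variance of the run means
    ms_between = n * statistics.variance(run_means)
    # Between-run variance component, truncated at zero if negative
    var_between = max(0.0, (ms_between - ms_within) / n)
    repeatability_sd = math.sqrt(ms_within)
    intermediate_sd = math.sqrt(ms_within + var_between)
    return repeatability_sd, intermediate_sd

# Invented assay results (% label claim): three runs, three replicates each
runs = [
    [99.8, 100.2, 100.0],
    [101.0, 101.4, 101.2],
    [99.0, 99.4, 99.2],
]

repeatability_sd, intermediate_sd = precision_components(runs)
print(f"repeatability SD: {repeatability_sd:.2f}")
print(f"intermediate precision SD: {intermediate_sd:.2f}")
```

In this example each run looks very precise on its own, yet the run-to-run differences make the intermediate precision roughly five times worse than the repeatability. With only three runs the between-run estimate is itself very uncertain, which is the practical risk of performing too few experiments.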
15. Reconsider the characteristics defined for limit tests for impurities in table 3. Currently these are detection limit and specificity. However, most limit tests are based on quantitation limit and not detection limit.
16. Add the footnote ‘(3) may be needed in some cases’ to the ‘+’ in table 3 that is currently indicating that quantitation limit should be evaluated for impurities methods. As it stands, the unqualified ‘+’ contradicts the information in the range section where it is stated that the range investigated is as follows: ‘for the determination of an impurity: from the reporting level of an impurity to 120% of the specification’. If the low point of a range is defined by the reporting level and it is shown that the method is accurate, precise and linear over the range, then evaluation of the QL is not necessary. Interestingly, this tends to be assumed for residual solvents and elemental impurities methods, where the low point of the range is usually defined based on the control limit (e.g., 50% of the control limit), but in related substances methods the QL is often investigated even though no value below the reporting threshold is ever reported.
17. Change ‘Several approaches for determining the quantitation limit are possible’ to ‘Several approaches for estimating the quantitation limit are possible’. The definition of QL includes the requirement that it has: ‘suitable precision and accuracy’. The three approaches suggested in the guideline provide an estimate of the QL, which should be confirmed using accuracy/precision experiments, therefore it cannot be ‘determined’ using these techniques.
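The two-step logic in point 17 can be sketched as follows. The first function uses the guideline's formula based on the standard deviation of the response and the slope (QL = 10σ/S) to produce an estimate. The second checks replicate results at that level against precision and recovery criteria. The acceptance limits, the function names and the data are hypothetical placeholders chosen for illustration only.

```python
import statistics

def estimate_ql(sigma, slope):
    """Estimate of the quantitation limit: QL = 10 * sigma / S."""
    return 10.0 * sigma / slope

def confirm_ql(measured, nominal, max_rsd=10.0, recovery_limits=(80.0, 120.0)):
    """Check replicates prepared at the estimated QL against
    hypothetical precision (%RSD) and accuracy (% recovery) criteria."""
    mean = statistics.mean(measured)
    recovery = 100.0 * mean / nominal
    rsd = 100.0 * statistics.stdev(measured) / mean
    low, high = recovery_limits
    return rsd <= max_rsd and low <= recovery <= high

# Invented inputs: residual SD of the response and calibration slope
ql_estimate = estimate_ql(sigma=0.5, slope=100.0)   # 0.05 in concentration units

# Invented replicate results obtained at the estimated QL
good_replicates = [0.048, 0.052, 0.050, 0.047, 0.053, 0.051]
noisy_replicates = [0.030, 0.070, 0.045, 0.062, 0.038, 0.055]

print("estimate:", ql_estimate)
print("confirmed with good data:", confirm_ql(good_replicates, ql_estimate))
print("confirmed with noisy data:", confirm_ql(noisy_replicates, ql_estimate))
```

The same estimate is confirmed by one data set and rejected by the other, which is why the calculated figure is better described as an estimate until the precision and accuracy experiments have been run.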
18. Update the definition of robustness to include other factors which are not ‘deliberate variations in method parameters’ but are important to consider during robustness investigations. An example is batch to batch differences in chromatography columns.
19. Add robustness to the table on page 3. The reason for not including it originally was given as: ‘It should be noted that robustness is not listed in the table but should be considered at an appropriate stage in the development of the analytical procedure.’ This could equally be applied to the other characteristics since they also should be considered during method development in a lifecycle management, quality by design, approach to analytical methods, therefore it would be less confusing to include robustness in the table with the other characteristics.
It appears that the working party did not anticipate that robustness would be investigated in a validation study using formally defined experiments with predefined acceptance criteria but this has evolved as the most common approach for how robustness is evaluated. The ‘tick-box’ approach of lifting only the factors listed in the guideline and altering them by ±10%, without fully evaluating what other factors are important in the method, is unfortunately all too common but I don’t think that the guideline can be said to be at fault for the way people are implementing it.
I haven’t added using the terminology of ‘method’ rather than ‘procedure’ to my wish list, even though I have used the term ‘method’ throughout this article, since I realise that it was a considered decision of the original working party. However, I have yet to come across a company that routinely uses the term ‘procedures’ to describe analytical methods.
The list has ended up much longer than I had originally anticipated! What do you think? Would you disagree or add anything to my wish list? Please let me know; your comments are very welcome.
ICH Q2(R1): Validation of Analytical Procedures: Text and Methodology, 2005, www.ich.org
USP <1225> Validation of Compendial Methods, www.usp.org
FDA Guidance for Industry: Analytical Procedures and Methods Validation for Drugs and Biologics, 2015, www.fda.gov