Dear ccp4 community, I am currently working on some low resolution datasets (around 4.5A). The space group seems to be P21, as suggested by XDS and pointless. I have collected many datasets of these crystals, both native as well as SeMet-labeled. Using MR-SAD I have been able to obtain a clearly interpretable electron density map for all features I expect and heavy atom sites that make sense for both the model used in MR and the yet unmodeled components. So far, so good.
While routinely analyzing my datasets with Phenix Xtriage, I noticed that the intensity statistics for all of these datasets look unusual. In fact, when the data are processed in P21, Xtriage complains with the message: "The intensity statistics look unusual, but twinning is not indicated or possible in the given space group". Whether this message occurs depends somewhat on the type of input file I use for the same dataset, as well as on the input parameters (high-resolution cut-off). If I use XDSCONV to convert the intensities to amplitudes for Phenix, the message appears. If I use the output of XSCALE directly as intensities, the message does not appear, yet the actual statistics are somewhat similar. I have attached the log file output for four scenarios at the end of this message (P21 intensities, P21 amplitudes, P1 intensities, P1 amplitudes). These results got me questioning whether the true space group is really P21, or whether it could be P1 with some twinning issue. Since the Xtriage verdict on the "normality" of the intensity statistics varies with the input format, I assume that this case may be somewhat borderline. Since I have very little experience with either low-resolution crystals or twinning, I am a bit unsure how to proceed with these data. How can I distinguish between a partially twinned P1 crystal and an untwinned P21 crystal? It is my impression from previous discussions here that distinguishing twinned from untwinned data simply by comparing refinement results with and without twin laws is not always conclusive, as the R-factors are not directly comparable. If the crystal is truly P21, could these issues arise from intensity-to-amplitude conversion problems? (Xtriage also suggests this as a possibility.) If so, can these be overcome? Or could the deviation from ideal intensities simply originate from the low quality (= resolution) of the data and be within the range of tolerance for such a dataset?
Could this be some type of pseudosymmetry issue? Finally, I would be very grateful for any advice on how to proceed with these data!

Kind regards,
Hauke

Data processed as P21, intensities as input:

=============== Diagnostic tests for twinning and pseudosymmetry ==============

Using data between 10.00 to 3.50 Angstrom.

----------Patterson analyses----------

 Largest Patterson peak with length larger than 15 Angstrom:
 Frac. coord.              :    0.164    0.000   -0.021
 Distance to origin        :   17.720
 Height relative to origin :    3.072 %
 p_value(height)           :    1.000e+00

Explanation
 The p-value, the probability that a peak of the specified height or larger is found in a Patterson function of a macromolecule that does not have any translational pseudo-symmetry, is equal to 1.000e+00. p_values smaller than 0.05 might indicate weak translational pseudo symmetry, or the self vector of a large anomalous scatterer such as Hg, whereas values smaller than 1e-3 are a very strong indication for the presence of translational pseudo symmetry.

----------Wilson ratio and moments----------

Acentric reflections:
   <I^2>/<I>^2    :1.935   (untwinned: 2.000; perfect twin 1.500)
   <F>^2/<F^2>    :0.805   (untwinned: 0.785; perfect twin 0.885)
   <|E^2 - 1|>    :0.696   (untwinned: 0.736; perfect twin 0.541)

Centric reflections:
   <I^2>/<I>^2    :2.431   (untwinned: 3.000; perfect twin 2.000)
   <F>^2/<F^2>    :0.733   (untwinned: 0.637; perfect twin 0.785)
   <|E^2 - 1|>    :0.812   (untwinned: 0.968; perfect twin 0.736)

----------NZ test for twinning and TNCS----------

The NZ test is diagnostic for both twinning and translational NCS. Note however that if both are present, the effects may cancel each other out, therefore the results of the Patterson analysis and L-test also need to be considered.
  Maximum deviation acentric      :  0.028
  Maximum deviation centric       :  0.103

  <NZ(obs)-NZ(twinned)>_acentric  : -0.009
  <NZ(obs)-NZ(twinned)>_centric   : -0.061

----------L test for acentric data----------

Using difference vectors (dh,dk,dl) of the form:
    (2hp, 2kp, 2lp)
where hp, kp, and lp are random signed integers such that
    2 <= |dh| + |dk| + |dl| <= 8

  Mean |L|   :0.471  (untwinned: 0.500; perfect twin: 0.375)
  Mean  L^2  :0.301  (untwinned: 0.333; perfect twin: 0.200)

The distribution of |L| values indicates a twin fraction of 0.00. Note that this estimate is not as reliable as obtained via a Britton plot or H-test if twin laws are available.

Reference:
  J. Padilla & T. O. Yeates. A statistic for local intensity differences: robustness to anisotropy and pseudo-centering and utility for detecting twinning. Acta Crystallogr. D59, 1124-30, 2003.

================================== Twin laws ==================================

----------Twin law identification----------

No twin laws are possible for this crystal lattice.

================== Twinning and intensity statistics summary ==================

----------Final verdict----------

The largest off-origin peak in the Patterson function is 3.07% of the height of the origin peak. No significant pseudotranslation is detected.

The results of the L-test indicate that the intensity statistics behave as expected. No twinning is suspected.

----------Statistics independent of twin laws----------

  <I^2>/<I>^2 : 1.935  (untwinned: 2.0, perfect twin: 1.5)
  <F>^2/<F^2> : 0.805  (untwinned: 0.785, perfect twin: 0.885)
  <|E^2-1|>   : 0.696  (untwinned: 0.736, perfect twin: 0.541)
  <|L|>       : 0.471  (untwinned: 0.500; perfect twin: 0.375)
  <L^2>       : 0.301  (untwinned: 0.333; perfect twin: 0.200)
  Multivariate Z score L-test: 1.292

The multivariate Z score is a quality measure of the given spread in intensities. Good to reasonable data are expected to have a Z score lower than 3.5. Large values can indicate twinning, but small values do not necessarily exclude it.
Note that the expected values for perfect twinning are for merohedrally twinned structures, and deviations from untwinned will be larger for perfect higher-order twinning.

No (pseudo)merohedral twin laws were found.

Data processed as P21, amplitudes as input:

=============== Diagnostic tests for twinning and pseudosymmetry ==============

Using data between 10.00 to 3.50 Angstrom.

----------Patterson analyses----------

 Largest Patterson peak with length larger than 15 Angstrom:
 Frac. coord.              :    0.162    0.000   -0.020
 Distance to origin        :   17.554
 Height relative to origin :    2.975 %
 p_value(height)           :    1.000e+00

Explanation
 The p-value, the probability that a peak of the specified height or larger is found in a Patterson function of a macromolecule that does not have any translational pseudo-symmetry, is equal to 1.000e+00. p_values smaller than 0.05 might indicate weak translational pseudo symmetry, or the self vector of a large anomalous scatterer such as Hg, whereas values smaller than 1e-3 are a very strong indication for the presence of translational pseudo symmetry.

----------Wilson ratio and moments----------

Acentric reflections:
   <I^2>/<I>^2    :1.974   (untwinned: 2.000; perfect twin 1.500)
   <F>^2/<F^2>    :0.816   (untwinned: 0.785; perfect twin 0.885)
   <|E^2 - 1|>    :0.689   (untwinned: 0.736; perfect twin 0.541)

Centric reflections:
   <I^2>/<I>^2    :2.817   (untwinned: 3.000; perfect twin 2.000)
   <F>^2/<F^2>    :0.691   (untwinned: 0.637; perfect twin 0.785)
   <|E^2 - 1|>    :0.832   (untwinned: 0.968; perfect twin 0.736)

----------NZ test for twinning and TNCS----------

The NZ test is diagnostic for both twinning and translational NCS. Note however that if both are present, the effects may cancel each other out, therefore the results of the Patterson analysis and L-test also need to be considered.
  Maximum deviation acentric      :  0.061
  Maximum deviation centric       :  0.060

  <NZ(obs)-NZ(twinned)>_acentric  : -0.011
  <NZ(obs)-NZ(twinned)>_centric   : +0.017

----------L test for acentric data----------

Using difference vectors (dh,dk,dl) of the form:
    (2hp, 2kp, 2lp)
where hp, kp, and lp are random signed integers such that
    2 <= |dh| + |dk| + |dl| <= 8

  Mean |L|   :0.435  (untwinned: 0.500; perfect twin: 0.375)
  Mean  L^2  :0.262  (untwinned: 0.333; perfect twin: 0.200)

The distribution of |L| values indicates a twin fraction of 0.00. Note that this estimate is not as reliable as obtained via a Britton plot or H-test if twin laws are available.

Reference:
  J. Padilla & T. O. Yeates. A statistic for local intensity differences: robustness to anisotropy and pseudo-centering and utility for detecting twinning. Acta Crystallogr. D59, 1124-30, 2003.

================================== Twin laws ==================================

----------Twin law identification----------

No twin laws are possible for this crystal lattice.

================== Twinning and intensity statistics summary ==================

----------Final verdict----------

The largest off-origin peak in the Patterson function is 2.98% of the height of the origin peak. No significant pseudotranslation is detected.

The results of the L-test indicate that the intensity statistics are significantly different than is expected from good to reasonable, untwinned data.

As there are no twin laws possible given the crystal symmetry, there could be a number of reasons for the departure of the intensity statistics from normality. Overmerging pseudo-symmetric or twinned data, intensity to amplitude conversion problems as well as bad data quality might be possible reasons. It could be worthwhile considering reprocessing the data.
----------Statistics independent of twin laws----------

  <I^2>/<I>^2 : 1.974  (untwinned: 2.0, perfect twin: 1.5)
  <F>^2/<F^2> : 0.816  (untwinned: 0.785, perfect twin: 0.885)
  <|E^2-1|>   : 0.689  (untwinned: 0.736, perfect twin: 0.541)
  <|L|>       : 0.435  (untwinned: 0.500; perfect twin: 0.375)
  <L^2>       : 0.262  (untwinned: 0.333; perfect twin: 0.200)
  Multivariate Z score L-test: 4.774

The multivariate Z score is a quality measure of the given spread in intensities. Good to reasonable data are expected to have a Z score lower than 3.5. Large values can indicate twinning, but small values do not necessarily exclude it. Note that the expected values for perfect twinning are for merohedrally twinned structures, and deviations from untwinned will be larger for perfect higher-order twinning.

No (pseudo)merohedral twin laws were found.

Data processed as P1, intensities as input:

=============== Diagnostic tests for twinning and pseudosymmetry ==============

Using data between 10.00 to 3.50 Angstrom.

----------Patterson analyses----------

 Largest Patterson peak with length larger than 15 Angstrom:
 Frac. coord.              :    0.109   -0.092    0.018
 Distance to origin        :   18.636
 Height relative to origin :    3.518 %
 p_value(height)           :    9.999e-01

Explanation
 The p-value, the probability that a peak of the specified height or larger is found in a Patterson function of a macromolecule that does not have any translational pseudo-symmetry, is equal to 9.999e-01. p_values smaller than 0.05 might indicate weak translational pseudo symmetry, or the self vector of a large anomalous scatterer such as Hg, whereas values smaller than 1e-3 are a very strong indication for the presence of translational pseudo symmetry.
----------Wilson ratio and moments----------

Acentric reflections:
   <I^2>/<I>^2    :1.916   (untwinned: 2.000; perfect twin 1.500)
   <F>^2/<F^2>    :0.809   (untwinned: 0.785; perfect twin 0.885)
   <|E^2 - 1|>    :0.704   (untwinned: 0.736; perfect twin 0.541)

----------NZ test for twinning and TNCS----------

The NZ test is diagnostic for both twinning and translational NCS. Note however that if both are present, the effects may cancel each other out, therefore the results of the Patterson analysis and L-test also need to be considered.

  Maximum deviation acentric      :  0.043
  Maximum deviation centric       :  0.683

  <NZ(obs)-NZ(twinned)>_acentric  : -0.026
  <NZ(obs)-NZ(twinned)>_centric   : -0.467

----------L test for acentric data----------

Using difference vectors (dh,dk,dl) of the form:
    (2hp, 2kp, 2lp)
where hp, kp, and lp are random signed integers such that
    2 <= |dh| + |dk| + |dl| <= 8

  Mean |L|   :0.466  (untwinned: 0.500; perfect twin: 0.375)
  Mean  L^2  :0.296  (untwinned: 0.333; perfect twin: 0.200)

The distribution of |L| values indicates a twin fraction of 0.00. Note that this estimate is not as reliable as obtained via a Britton plot or H-test if twin laws are available.

Reference:
  J. Padilla & T. O. Yeates. A statistic for local intensity differences: robustness to anisotropy and pseudo-centering and utility for detecting twinning. Acta Crystallogr. D59, 1124-30, 2003.
================================== Twin laws ==================================

----------Twin law identification----------

Possible twin laws:
-------------------------------------------------------------------------------
| Type | Axis   | R metric (%) | delta (le Page) | delta (Lebedev) | Twin law |
-------------------------------------------------------------------------------
|  PM  | 2-fold | 0.053        | 0.035           | 0.000           | -h,-k,l  |
-------------------------------------------------------------------------------

0 merohedral twin operators found
1 pseudo-merohedral twin operators found
In total, 1 twin operators were found

Please note that the possibility of twin laws only means that the lattice symmetry permits twinning; it does not mean that the data are actually twinned. You should only treat the data as twinned if the intensity statistics are abnormal.

----------Twin law-specific tests----------

The following tests analyze the input data with each of the possible twin laws applied. If twinning is present, the most appropriate twin law will usually have a low R_abs_twin value and a consistent estimate of the twin fraction (significantly above 0) from each test. The results are also compiled in the summary section.

WARNING: please remember that the possibility of twin laws, and the results of the specific tests, does not guarantee that twinning is actually present in the data. Only the presence of abnormal intensity statistics (as judged by the Wilson moments, NZ-test, and L-test) is diagnostic for twinning.

----------Analysis of twin law -h,-k,l----------

H-test on acentric data
Only 50.0 % of the strongest twin pairs were used.

  mean |H| : 0.239  (0.50: untwinned; 0.0: 50% twinned)
  mean H^2 : 0.116  (0.33: untwinned; 0.0: 50% twinned)

Estimation of twin fraction via mean |H|: 0.261
Estimation of twin fraction via cum. dist. of H: 0.278

Britton analyses

  Extrapolation performed on 0.45 < alpha < 0.495
  Estimated twin fraction: 0.337
  Correlation: 0.9951

R vs R statistics
  R_abs_twin = <|I1-I2|>/<|I1+I2|>
    (Lebedev, Vagin, Murshudov. Acta Cryst. (2006). D62, 83-95)
  R_abs_twin observed data   : 0.236
  R_sq_twin = <(I1-I2)^2>/<(I1+I2)^2>
  R_sq_twin observed data    : 0.096
  No calculated data available.
  R_twin for calculated data not determined.

======================= Exploring higher metric symmetry ======================

The point group of data as dictated by the space group is P 1
The point group in the niggli setting is P 1
The point group of the lattice is P 1 1 2
A summary of R values for various possible point groups follow.

----------------------------------------------------------------------------------------------
| Point group | mean R_used | max R_used | mean R_unused | min R_unused | BIC       | choice |
----------------------------------------------------------------------------------------------
| P 1         | None        | None       | 0.236         | 0.236        | 5.792e+05 |        |
| P 1 1 2     | 0.236       | 0.236      | None          | None         | 3.867e+05 |        |
----------------------------------------------------------------------------------------------

R_used: mean and maximum R value for symmetry operators *used* in this point group
R_unused: mean and minimum R value for symmetry operators *not used* in this point group

An automated point group suggestion is made on the basis of the BIC (Bayesian information criterion).

The likely point group of the data is: P 1 1 2

Possible space groups in this point group are:
   Unit cell: (103.91, 197.01, 137.2, 90, 99.873, 90)
   Space group: P 1 2 1 (No. 3)

   Unit cell: (103.91, 197.01, 137.2, 90, 99.873, 90)
   Space group: P 1 21 1 (No. 4)

Note that this analysis does not take into account the effects of twinning. If the data are (almost) perfectly twinned, the symmetry will appear to be higher than it actually is.
================== Twinning and intensity statistics summary ==================

----------Final verdict----------

The largest off-origin peak in the Patterson function is 3.52% of the height of the origin peak. No significant pseudotranslation is detected.

The results of the L-test indicate that the intensity statistics behave as expected. No twinning is suspected.

The symmetry of the lattice and intensity however suggests that the input input space group is too low. See the relevant sections of the log file for more details on your choice of space groups. As the symmetry is suspected to be incorrect, it is advisable to reconsider data processing.

----------Statistics independent of twin laws----------

  <I^2>/<I>^2 : 1.916  (untwinned: 2.0, perfect twin: 1.5)
  <F>^2/<F^2> : 0.809  (untwinned: 0.785, perfect twin: 0.885)
  <|E^2-1|>   : 0.704  (untwinned: 0.736, perfect twin: 0.541)
  <|L|>       : 0.466  (untwinned: 0.500; perfect twin: 0.375)
  <L^2>       : 0.296  (untwinned: 0.333; perfect twin: 0.200)
  Multivariate Z score L-test: 1.670

The multivariate Z score is a quality measure of the given spread in intensities. Good to reasonable data are expected to have a Z score lower than 3.5. Large values can indicate twinning, but small values do not necessarily exclude it. Note that the expected values for perfect twinning are for merohedrally twinned structures, and deviations from untwinned will be larger for perfect higher-order twinning.

----------Statistics depending on twin laws----------

-----------------------------------------------------------------
| Operator | type | R obs. | Britton alpha | H alpha | ML alpha |
-----------------------------------------------------------------
| -h,-k,l  |  PM  | 0.236  | 0.337         | 0.278   | 0.348    |
-----------------------------------------------------------------

Data processed as P1, amplitudes as input:

=============== Diagnostic tests for twinning and pseudosymmetry ==============

Using data between 10.00 to 3.50 Angstrom.
----------Patterson analyses----------

 Largest Patterson peak with length larger than 15 Angstrom:
 Frac. coord.              :    0.109   -0.091    0.018
 Distance to origin        :   18.517
 Height relative to origin :    3.198 %
 p_value(height)           :    1.000e+00

Explanation
 The p-value, the probability that a peak of the specified height or larger is found in a Patterson function of a macromolecule that does not have any translational pseudo-symmetry, is equal to 1.000e+00. p_values smaller than 0.05 might indicate weak translational pseudo symmetry, or the self vector of a large anomalous scatterer such as Hg, whereas values smaller than 1e-3 are a very strong indication for the presence of translational pseudo symmetry.

----------Wilson ratio and moments----------

Acentric reflections:
   <I^2>/<I>^2    :1.967   (untwinned: 2.000; perfect twin 1.500)
   <F>^2/<F^2>    :0.821   (untwinned: 0.785; perfect twin 0.885)
   <|E^2 - 1|>    :0.690   (untwinned: 0.736; perfect twin 0.541)

----------NZ test for twinning and TNCS----------

The NZ test is diagnostic for both twinning and translational NCS. Note however that if both are present, the effects may cancel each other out, therefore the results of the Patterson analysis and L-test also need to be considered.

  Maximum deviation acentric      :  0.077
  Maximum deviation centric       :  0.683

  <NZ(obs)-NZ(twinned)>_acentric  : -0.022
  <NZ(obs)-NZ(twinned)>_centric   : -0.467

----------L test for acentric data----------

Using difference vectors (dh,dk,dl) of the form:
    (2hp, 2kp, 2lp)
where hp, kp, and lp are random signed integers such that
    2 <= |dh| + |dk| + |dl| <= 8

  Mean |L|   :0.427  (untwinned: 0.500; perfect twin: 0.375)
  Mean  L^2  :0.254  (untwinned: 0.333; perfect twin: 0.200)

The distribution of |L| values indicates a twin fraction of 0.00. Note that this estimate is not as reliable as obtained via a Britton plot or H-test if twin laws are available.

Reference:
  J. Padilla & T. O. Yeates. A statistic for local intensity differences: robustness to anisotropy and pseudo-centering and utility for detecting twinning. Acta Crystallogr. D59, 1124-30, 2003.

================================== Twin laws ==================================

----------Twin law identification----------

Possible twin laws:
-------------------------------------------------------------------------------
| Type | Axis   | R metric (%) | delta (le Page) | delta (Lebedev) | Twin law |
-------------------------------------------------------------------------------
|  PM  | 2-fold | 0.053        | 0.035           | 0.000           | -h,-k,l  |
-------------------------------------------------------------------------------

0 merohedral twin operators found
1 pseudo-merohedral twin operators found
In total, 1 twin operators were found

Please note that the possibility of twin laws only means that the lattice symmetry permits twinning; it does not mean that the data are actually twinned. You should only treat the data as twinned if the intensity statistics are abnormal.

----------Twin law-specific tests----------

The following tests analyze the input data with each of the possible twin laws applied. If twinning is present, the most appropriate twin law will usually have a low R_abs_twin value and a consistent estimate of the twin fraction (significantly above 0) from each test. The results are also compiled in the summary section.

WARNING: please remember that the possibility of twin laws, and the results of the specific tests, does not guarantee that twinning is actually present in the data. Only the presence of abnormal intensity statistics (as judged by the Wilson moments, NZ-test, and L-test) is diagnostic for twinning.

----------Analysis of twin law -h,-k,l----------

H-test on acentric data
Only 50.0 % of the strongest twin pairs were used.
  mean |H| : 0.213  (0.50: untwinned; 0.0: 50% twinned)
  mean H^2 : 0.082  (0.33: untwinned; 0.0: 50% twinned)

Estimation of twin fraction via mean |H|: 0.287
Estimation of twin fraction via cum. dist. of H: 0.288

Britton analyses

  Extrapolation performed on 0.44 < alpha < 0.495
  Estimated twin fraction: 0.337
  Correlation: 0.9956

R vs R statistics
  R_abs_twin = <|I1-I2|>/<|I1+I2|>
    (Lebedev, Vagin, Murshudov. Acta Cryst. (2006). D62, 83-95)
  R_abs_twin observed data   : 0.219
  R_sq_twin = <(I1-I2)^2>/<(I1+I2)^2>
  R_sq_twin observed data    : 0.093
  No calculated data available.
  R_twin for calculated data not determined.

======================= Exploring higher metric symmetry ======================

The point group of data as dictated by the space group is P 1
The point group in the niggli setting is P 1
The point group of the lattice is P 1 1 2
A summary of R values for various possible point groups follow.

----------------------------------------------------------------------------------------------
| Point group | mean R_used | max R_used | mean R_unused | min R_unused | BIC       | choice |
----------------------------------------------------------------------------------------------
| P 1         | None        | None       | 0.219         | 0.219        | 3.436e+05 |        |
| P 1 1 2     | 0.219       | 0.219      | None          | None         | 2.188e+05 |        |
----------------------------------------------------------------------------------------------

R_used: mean and maximum R value for symmetry operators *used* in this point group
R_unused: mean and minimum R value for symmetry operators *not used* in this point group

An automated point group suggestion is made on the basis of the BIC (Bayesian information criterion).

The likely point group of the data is: P 1 1 2

Possible space groups in this point group are:
   Unit cell: (103.91, 197.01, 137.2, 90, 99.873, 90)
   Space group: P 1 2 1 (No. 3)

   Unit cell: (103.91, 197.01, 137.2, 90, 99.873, 90)
   Space group: P 1 21 1 (No. 4)

Note that this analysis does not take into account the effects of twinning.
If the data are (almost) perfectly twinned, the symmetry will appear to be higher than it actually is.

================== Twinning and intensity statistics summary ==================

----------Final verdict----------

The largest off-origin peak in the Patterson function is 3.20% of the height of the origin peak. No significant pseudotranslation is detected.

The results of the L-test indicate that the intensity statistics are significantly different than is expected from good to reasonable, untwinned data.

As there are twin laws possible given the crystal symmetry, twinning could be the reason for the departure of the intensity statistics from normality. It might be worthwhile carrying out refinement with a twin specific target function.

Please note however that R-factors from twinned refinement cannot be directly compared to R-factors without twinning, as they will always be lower when a twin law is used. You should also use caution when interpreting the maps from refinement, as they will have significantly more model bias.

Note that the symmetry of the intensities suggest that the assumed space group is too low. As twinning is however suspected, it is not immediately clear if this is the case. Careful reprocessing and (twin)refinement for all cases might resolve this question.

----------Statistics independent of twin laws----------

  <I^2>/<I>^2 : 1.967  (untwinned: 2.0, perfect twin: 1.5)
  <F>^2/<F^2> : 0.821  (untwinned: 0.785, perfect twin: 0.885)
  <|E^2-1|>   : 0.690  (untwinned: 0.736, perfect twin: 0.541)
  <|L|>       : 0.427  (untwinned: 0.500; perfect twin: 0.375)
  <L^2>       : 0.254  (untwinned: 0.333; perfect twin: 0.200)
  Multivariate Z score L-test: 5.697

The multivariate Z score is a quality measure of the given spread in intensities. Good to reasonable data are expected to have a Z score lower than 3.5. Large values can indicate twinning, but small values do not necessarily exclude it.
Note that the expected values for perfect twinning are for merohedrally twinned structures, and deviations from untwinned will be larger for perfect higher-order twinning.

----------Statistics depending on twin laws----------

-----------------------------------------------------------------
| Operator | type | R obs. | Britton alpha | H alpha | ML alpha |
-----------------------------------------------------------------
| -h,-k,l  |  PM  | 0.219  | 0.337         | 0.288   | 0.307    |
-----------------------------------------------------------------
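
For readers who want to cross-check the H-test numbers in the two P1 logs above, here is a minimal sketch in Python. It assumes the standard relations for acentric twin-related pairs with twin fraction alpha, namely <|H|> = 1/2 - alpha and <H^2> = (1 - 2*alpha)^2 / 3; these are consistent with the untwinned reference values (0.50 and 0.33) printed in the logs. The function names and the dictionary are hypothetical and are not part of Xtriage. Note also that a 2-fold NCS axis parallel to the candidate twin axis can lower <|H|> in the same way as twinning, so these estimates alone cannot separate a twinned P1 crystal from an untwinned P21 crystal.

```python
import math


def twin_fraction_from_mean_abs_h(mean_abs_h: float) -> float:
    """Estimate alpha from <|H|>, assuming <|H|> = 1/2 - alpha (acentric data)."""
    return 0.5 - mean_abs_h


def twin_fraction_from_mean_h_sq(mean_h_sq: float) -> float:
    """Estimate alpha from <H^2>, assuming <H^2> = (1 - 2*alpha)^2 / 3."""
    return 0.5 * (1.0 - math.sqrt(3.0 * mean_h_sq))


# (mean |H|, mean H^2) copied from the two P1 logs in this message.
H_TEST_VALUES = {
    "P1, intensities": (0.239, 0.116),
    "P1, amplitudes": (0.213, 0.082),
}

if __name__ == "__main__":
    for label, (mean_abs_h, mean_h_sq) in H_TEST_VALUES.items():
        alpha_h = twin_fraction_from_mean_abs_h(mean_abs_h)
        alpha_h_sq = twin_fraction_from_mean_h_sq(mean_h_sq)
        print(f"{label}: alpha from <|H|> = {alpha_h:.3f}, "
              f"alpha from <H^2> = {alpha_h_sq:.3f}")
```

The first estimate reproduces the "Estimation of twin fraction via mean |H|" lines in the logs (0.261 and 0.287). The second is an extra consistency check that Xtriage does not print.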