Discarding weak data was not the way macromolecular refinement was done prior to 1990. Discarding data to lower your R-value is a bad practice now and was a bad practice back then. It is my recollection that some people using X-PLOR adopted this practice, along with discarding all low-resolution data, but outside of that community these methods were frowned upon.
I agree that looking at the agreement between model and data for subsets of your data is a useful tool for identifying pathologies, but discarding data in refinement simply because they disagree with your model is deception. I know that James is not recommending this, but that is what some people in that bad period in the 1990s were doing. Most of us were not!

Dale Tronrud

On 10/16/2017 8:02 AM, James Holton wrote:
>
> If you suspect that weak data (such as all the spot-free hkls beyond
> your anisotropic resolution limits) are driving up your Rwork/Rfree,
> then a good sanity check is to compute "R1". Most macromolecular
> crystallographers don't know what "R1" is, but it is not only
> commonplace but required in small-molecule crystallography. All you do
> is throw out all the weak data, say everything with I/sigma < 2 or 3,
> and then re-compute your R factors. That is, use something like
> "sftools" to select only clearly "observed" reflections, and feed that
> data file back into your refinement program. In fact, refining only
> against data with I/sigma>3 is the way macromolecular refinement was
> done up until about 1990. These days, for clarity, you may want to call
> the resulting Rwork/Rfree R1work and R1free.
>
> If you do this, and your R1work/R1free are still just as bad as
> Rwork/Rfree, then weak data are not your problem. You'd be surprised
> how often this is the case. Next on the list are things like wrong
> symmetry choice, such as twinning masquerading as a symmetry operator,
> or disorder, as in large regions of the molecule that are too fluttery
> to peak above 1 sigma. The list goes on, but doing the weak-data
> rejection test really helps narrow it down.
>
> -James Holton
> MAD Scientist
>
>
> On 10/16/2017 3:55 AM, herman.schreu...@sanofi.com wrote:
>>
>> Dear Michael,
>>
>> Did you ask Phaser to check for all possible space groups? There are
>> still I422 and I4 you did not mention. If the space group that came
>> out of Phaser is different from the space group used for processing,
>> subsequent refinement programs may use the wrong space group from the
>> processing. This should be easy to check.
>>
>> The other suggestion I have is to try a different processing program.
>> Although XDS is excellent, I find that sometimes it has difficulties
>> with ice rings, which reveal themselves not in the processing, but in
>> the subsequent refinement. You may want to try Mosflm or some other
>> processing program.
>>
>> Best,
>>
>> Herman
>>
>> *From:* CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] *On Behalf
>> Of* Michael Jarva
>> *Sent:* Sunday, 15 October 2017 03:09
>> *To:* CCP4BB@JISCMAIL.AC.UK
>> *Subject:* [EXTERNAL] [ccp4bb] Another troublesome dataset (High Rfree
>> after MR)
>>
>> To add to the current anisotropic discussion I recently got a dataset
>> I’m unable to refine and I’m hoping I could get some help on figuring
>> out if there’s anything I can do.
>>
>> I get a clear cut solution with Phaser using the same protein as
>> search model and got a TFZ of >16, LLG >200, and a packing that makes
>> sense, so I don’t doubt the solution. However, the maps look terrible,
>> more like something I would expect from a 3.65Å dataset rather than
>> the 2.65Å it supposedly is.
>>
>> The dataset merges well in I4122 to 2.65Å with an overall Rmerge of 5%
>> and a CC1/2 of >0.5 in the outer shell (see the bottom for full
>> summary). There is some minor radiation damage but I could cut out
>> most of it due to the high symmetry.
>>
>> Xtriage reports no indication of twinning, but does say that the data
>> is moderately anisotropic, so I ran the unmerged data through the
>> StarAniso server, which reported the ellipsoidal resolution limits to
>> be 2.304, 2.893, and 3.039. Refining with the anisotropically
>> truncated data improves the maps somewhat, but I am still unable to
>> get the Rfree below 38%. I tried using both phenix.refine and buster
>> with similar results.
>>
>> I’ve considered the choice of space group and tried I41, F222, I212121,
>> and C2, but with the same results, and Zanuda tells me the same thing.
>>
>> Lastly, there are some minor ice rings, so my last try was to exclude
>> the ice ring resolutions, but this made little to no difference.
>>
>> Normally I would just write this off as the data being bad but this
>> time all the statistics tell me this should be doable so I’m curious
>> what has gone wrong.
>>
>> Cheers
>>
>> Michael Jarva
>>
>>
>> Summary data for Project: XDSproject Crystal: XDScrystal
>> Dataset: XDSdataset
>>
>>                                            Overall  InnerShell  OuterShell
>> Low resolution limit                         34.87       34.87        2.78
>> High resolution limit                         2.65        8.79        2.65
>>
>> Rmerge (within I+/I-)                        0.052       0.026       1.595
>> Rmerge (all I+ and I-)                       0.057       0.030       1.805
>> Rmeas (within I+/I-)                         0.062       0.031       1.924
>> Rmeas (all I+ & I-)                          0.063       0.033       1.993
>> Rpim (within I+/I-)                          0.032       0.017       1.042
>> Rpim (all I+ & I-)                           0.025       0.014       0.817
>> Rmerge in top intensity bin                  0.030           -           -
>> Total number of observations                 19931         566        2681
>> Total number unique                           3597         114         471
>> Mean((I)/sd(I))                               11.3        42.3         0.8
>> Mn(I) half-set correlation CC(1/2)           0.999       0.999       0.575
>> Completeness                                  97.9        93.1        99.6
>> Multiplicity                                   5.5         5.0         5.7
>>
>> Anomalous completeness                        92.4        92.1        96.8
>> Anomalous multiplicity                         3.0         3.0         3.0
>> DelAnom correlation between half-sets        0.176       0.258       0.051
>> Mid-Slope of Anom Normal Probability         1.078           -           -
>>
>>
>> Average unit cell: 82.39 82.39 69.73 90.00 90.00 90.00
>>
>> Space group: I 41 2 2
>> Average mosaicity: 0.10
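
For anyone who wants to try the R1 check discussed in this thread as a pure diagnostic, here is a minimal sketch in Python. It only recomputes R factors for an existing model on the subset of reflections above an I/sigma cutoff; it does not re-refine against the reduced data, which is the practice Dale objects to. The names Reflection, r_factor and r1_report are hypothetical and made up for illustration. The sketch assumes that |Fobs| and |Fcalc| are already on a common scale and that the free-set flags have been read from the reflection file by some other means.

```python
from typing import Dict, NamedTuple, Sequence


class Reflection(NamedTuple):
    """One reflection; a hypothetical record layout used only for this sketch."""
    f_obs: float         # observed amplitude |Fobs|
    f_calc: float        # model amplitude |Fcalc|, assumed already scaled to f_obs
    i_over_sigma: float  # I/sigma(I) of the measured intensity
    is_free: bool        # True if the reflection belongs to the Rfree test set


def r_factor(refls: Sequence[Reflection]) -> float:
    """Conventional R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    denominator = sum(r.f_obs for r in refls)
    if denominator == 0:
        return float("nan")  # empty selection or all-zero amplitudes
    return sum(abs(r.f_obs - r.f_calc) for r in refls) / denominator


def r1_report(refls: Sequence[Reflection], cutoff: float = 3.0) -> Dict[str, float]:
    """Return Rwork/Rfree on all data and R1work/R1free on data with I/sigma >= cutoff."""
    work = [r for r in refls if not r.is_free]
    free = [r for r in refls if r.is_free]

    def strong(selection):
        # Keep only the clearly "observed" reflections.
        return [r for r in selection if r.i_over_sigma >= cutoff]

    return {
        "Rwork": r_factor(work),
        "Rfree": r_factor(free),
        "R1work": r_factor(strong(work)),
        "R1free": r_factor(strong(free)),
    }
```

If R1work/R1free come out about as bad as Rwork/Rfree, weak data are probably not what is driving the statistics, and the other suspects raised above (symmetry choice, twinning, disorder) are worth a closer look.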