Re: [ccp4bb] AW: Another troublesome dataset (High Rfree after MR)

Dale Tronrud Mon, 16 Oct 2017 09:12:06 -0700

   Discarding weak data was not the way macromolecular refinement was
done prior to 1990.  Discarding data to lower your R-value is a bad
practice now and was a bad practice back then.  It is my recollection
that some people using X-PLOR adopted this practice, along with
discarding all low-resolution data, but outside of that community these
methods were frowned upon.


   I agree that looking at the agreement between model and data for
subsets of your data is a useful tool for identifying pathologies, but
discarding data in refinement simply because they disagree with your
model is deception.  I know that James is not recommending this, but
that is what some people in that bad period in the 1990s were doing.
Most of us were not!

Dale Tronrud
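
A minimal sketch of the subset comparison discussed in this thread: the R value over all reflections, next to the R value over only the strong ones (the "R1" of the quoted message below) and over only the weak ones. The reflection tuples and the function names are made up for illustration, Fcalc is assumed to be already scaled to Fobs, and the subsets are used for reporting only, not as input to refinement.

```python
def r_factor(reflections):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections.

    Each reflection is a tuple (f_obs, f_calc, i_over_sigma); Fcalc is
    assumed to be on the same scale as Fobs already.
    """
    if not reflections:
        raise ValueError("no reflections in this subset")
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc, _ in reflections)
    den = sum(abs(fo) for fo, _, _ in reflections)
    return num / den


def split_by_signal(reflections, cutoff=3.0):
    """Split reflections into (strong, weak) lists at I/sigma(I) >= cutoff."""
    strong = [r for r in reflections if r[2] >= cutoff]
    weak = [r for r in reflections if r[2] < cutoff]
    return strong, weak


# Hypothetical toy data: (Fobs, Fcalc, I/sigma). Not from any real dataset.
reflections = [
    (100.0, 95.0, 25.0),
    (80.0, 84.0, 18.0),
    (50.0, 47.0, 9.0),
    (20.0, 23.0, 3.5),
    (6.0, 2.0, 1.2),
    (4.0, 9.0, 0.6),
]

strong, weak = split_by_signal(reflections, cutoff=3.0)
r_all = r_factor(reflections)
r_strong = r_factor(strong)  # the "R1"-style value
r_weak = r_factor(weak)

print(f"R(all)    = {r_all:.3f}")
print(f"R(strong) = {r_strong:.3f}")
print(f"R(weak)   = {r_weak:.3f}")
```

If R(strong) is about as bad as R(all), weak data are not the explanation. Either way the full dataset stays in refinement; only the subset used for the reported statistic changes.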

On 10/16/2017 8:02 AM, James Holton wrote:
> 
> If you suspect that weak data (such as all the spot-free hkls beyond
> your anisotropic resolution limits) are driving up your Rwork/Rfree,
> then a good sanity check is to compute "R1".  Most macromolecular
> crystallographers don't know what "R1" is, but it is not only
> commonplace but required in small-molecule crystallography.  All you do
> is throw out all the weak data, say everything with I/sigma < 2 or 3,
> and then re-compute your R factors.  That is, use something like
> "sftools" to select only clearly "observed" reflections, and feed that
> data file back into your refinement program.  In fact, refining only
> against data with I/sigma>3 is the way macromolecular refinement was
> done up until about 1990.  These days, for clarity, you may want to call
> the resulting Rwork/Rfree R1work and R1free.
> 
> If you do this, and your R1work/R1free are still just as bad as
> Rwork/Rfree, then weak data are not your problem.  You'd be surprised
> how often this is the case.  Next on the list are things like wrong
> symmetry choice, such as twinning masquerading as a symmetry operator,
> or disorder, as in large regions of the molecule that are too fluttery
> to peak above 1 sigma.  The list goes on, but doing the weak-data
> rejection test really helps narrow it down.
> 
> -James Holton
> MAD Scientist
> 
> 
> On 10/16/2017 3:55 AM, herman.schreu...@sanofi.com wrote:
>>
>> Dear Michael,
>>
>>  
>>
>> Did you ask Phaser to check for all possible space groups? There are
>> still I422 and I4 you did not mention. If the space group that came
>> out of Phaser is different from the space group used for processing,
>> subsequent refinement programs may use the wrong space group from the
>> processing. This should be easy to check.
>>
>>  
>>
>> The other suggestion I have is to try a different processing program.
>> Although XDS is excellent, I find that sometimes it has difficulties
>> with ice rings, which reveal themselves not in the processing, but in
>> the subsequent refinement. You may want to try Mosflm or some other
>> processing program.
>>
>>  
>>
>> Best,
>>
>> Herman
>>
>>  
>>
>> *From:* CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] *On behalf
>> of* Michael Jarva
>> *Sent:* Sunday, 15 October 2017 03:09
>> *To:* CCP4BB@JISCMAIL.AC.UK
>> *Subject:* [EXTERNAL] [ccp4bb] Another troublesome dataset (High Rfree
>> after MR)
>>
>>  
>>
>> To add to the current anisotropic discussion I recently got a dataset
>> I’m unable to refine and I’m hoping I could get some help on figuring
>> out if there’s anything I can do.
>>
>>  
>>
>> I get a clear-cut solution with Phaser using the same protein as the
>> search model, with a TFZ of >16, an LLG of >200, and a packing that makes
>> sense, so I don’t doubt the solution. However, the maps look terrible,
>> more like something I would expect from a 3.65Å dataset rather than
>> the 2.65Å it supposedly is.
>>
>>  
>>
>> The dataset merges well in I4122 to 2.65Å with an overall Rmerge of 5%
>> and a CC1/2 of >0.5 in the outer shell (see the bottom for full
>> summary). There is some minor radiation damage but I could cut out
>> most of it due to the high symmetry.
>>
>>  
>>
>> Xtriage reports no indication of twinning, but does say that the data
>> is moderately anisotropic, so I ran the unmerged data through the
>> StarAniso server, which reported the ellipsoidal resolution limits to
>> be 2.304, 2.893, and 3.039 Å. Refining with the anisotropically
>> truncated data improves the maps somewhat, but I am still unable to
>> get the Rfree below 38%. I tried using both phenix.refine and BUSTER
>> with similar results.
>>
>>  
>>
>> I’ve considered the choice of space group and tried I41, F222, I212121,
>> and C2, but with the same results, and Zanuda tells me the same thing.
>>
>> Lastly, there are some minor ice rings, so my last try was to exclude
>> the ice ring resolutions, but this made little to no difference.
>>
>>  
>>
>> Normally I would just write this off as the data being bad but this
>> time all the statistics tell me this should be doable so I’m curious
>> what has gone wrong. 
>>
>>  
>>
>> Cheers
>>
>> Michael Jarva
>>
>>  
>>
>>  
>>
>> Summary data for  Project: XDSproject  Crystal: XDScrystal  Dataset: XDSdataset
>>  
>>                                            Overall  InnerShell  OuterShell
>> Low resolution limit                       34.87     34.87      2.78
>> High resolution limit                       2.65      8.79      2.65
>>  
>> Rmerge  (within I+/I-)                     0.052     0.026     1.595
>> Rmerge  (all I+ and I-)                    0.057     0.030     1.805
>> Rmeas (within I+/I-)                       0.062     0.031     1.924
>> Rmeas (all I+ & I-)                        0.063     0.033     1.993
>> Rpim (within I+/I-)                        0.032     0.017     1.042
>> Rpim (all I+ & I-)                         0.025     0.014     0.817
>> Rmerge in top intensity bin                0.030        -         -
>> Total number of observations               19931       566      2681
>> Total number unique                         3597       114       471
>> Mean((I)/sd(I))                             11.3      42.3       0.8
>> Mn(I) half-set correlation CC(1/2)         0.999     0.999     0.575
>> Completeness                                97.9      93.1      99.6
>> Multiplicity                                 5.5       5.0       5.7
>>  
>> Anomalous completeness                      92.4      92.1      96.8
>> Anomalous multiplicity                       3.0       3.0       3.0
>> DelAnom correlation between half-sets      0.176     0.258     0.051
>> Mid-Slope of Anom Normal Probability       1.078       -         -  
>>  
>>
>> Average unit cell:   82.39   82.39   69.73   90.00   90.00   90.00
>>
>> Space group: I 41 2 2
>> Average mosaicity:   0.10
>
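
Herman's suggestion earlier in the thread, checking that the space group coming out of Phaser matches the one used in processing, can be sketched roughly as follows. This assumes the molecular-replacement output is a PDB-format file with a standard CRYST1 record (space group in columns 56-66) and that the processing space group is available as a plain string; the function names and the example record are illustrative only, and the comparison is a simple string match that knows nothing about alternative settings.

```python
def cryst1_space_group(pdb_text):
    """Return the space-group field (columns 56-66) of the first CRYST1
    record in PDB-format text, or None if there is no such record."""
    for line in pdb_text.splitlines():
        if line.startswith("CRYST1"):
            return line[55:66].strip()
    return None


def normalize(space_group):
    """Drop whitespace and upper-case, so 'I 41 2 2' and 'I4122' compare equal."""
    return "".join(space_group.split()).upper()


def same_space_group(a, b):
    """String-level comparison of two space-group symbols."""
    return normalize(a) == normalize(b)


# Illustrative CRYST1 record built from the cell in the summary above.
# The Z value of 16 is a placeholder for this example.
example_pdb = "CRYST1%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f %-11s%4d\n" % (
    82.39, 82.39, 69.73, 90.0, 90.0, 90.0, "I 41 2 2", 16
)

processing_sg = "I 41 2 2"  # as reported by the processing summary
mr_sg = cryst1_space_group(example_pdb)

if same_space_group(mr_sg, processing_sg):
    print(f"Space groups agree: {mr_sg}")
else:
    print(f"Mismatch: MR output has {mr_sg}, processing used {processing_sg}")
```

A mismatch here would mean the reflection file handed to refinement still carries the processing space group, which is the situation Herman warns about.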
