***  For details on how to be removed from this list visit the  ***
***          CCP4 home page http://www.ccp4.ac.uk         ***



Dear Carlos,

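There is an inverse mathematical relationship between Rrim (or Rmeas)
and I/sd(I), namely Rrim = 0.8 * sd(I)/I, or equivalently
Rrim = 0.8 / (I/sd(I)).

The 0.8 comes from sqrt(2/pi), the ratio of the mean absolute deviation
to the standard deviation of a normal distribution.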

Thus, if you have high enough redundancy, an Rmerge of 40% for the
resolution shell where I/sd(I) = 2.0 is perfectly acceptable
(0.8/2.0 = 0.4), because it just describes the statistical errors. Any
other errors influencing the measurements may easily increase the
Rmerge. I would even go as far as saying that Rmerge values lower than
that would point to an incorrect estimation of the standard deviations.
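
A minimal numerical sketch of the relationship above, in Python. It is
only an illustration under stated assumptions: the errors are purely
random and Gaussian, I/sd(I) refers to the individual observations, and
every reflection in the shell has the same true intensity. The function
names and the default multiplicity are hypothetical and not taken from
any data-processing program.

```python
import math
import random

# Mean absolute deviation of a Gaussian in units of its sigma (~0.798).
GAUSS_MAD = math.sqrt(2.0 / math.pi)


def expected_rrim(i_over_sd):
    """Expected Rrim (as a fraction) if only random Gaussian errors are present."""
    if i_over_sd <= 0:
        raise ValueError("I/sd(I) must be positive")
    return GAUSS_MAD / i_over_sd


def simulated_rrim(i_over_sd, multiplicity=8, n_reflections=5000, seed=0):
    """Monte Carlo estimate of Rrim for one shell (illustrative assumptions only)."""
    rng = random.Random(seed)
    true_i = 1.0                      # assumed: same true intensity everywhere
    sd = true_i / i_over_sd           # sd of a single observation
    # Redundancy correction sqrt(n/(n-1)) that distinguishes Rrim from Rmerge.
    corr = math.sqrt(multiplicity / (multiplicity - 1.0))
    num = 0.0
    den = 0.0
    for _ in range(n_reflections):
        obs = [rng.gauss(true_i, sd) for _ in range(multiplicity)]
        mean = sum(obs) / multiplicity
        num += corr * sum(abs(o - mean) for o in obs)
        den += sum(obs)
    return num / den


if __name__ == "__main__":
    for ratio in (1.5, 2.0, 3.0):
        print(ratio, round(expected_rrim(ratio), 3), round(simulated_rrim(ratio), 3))
```

Under these assumptions expected_rrim(2.0) is about 0.40, and the
simulated value should land close to it. Real data carry additional
systematic errors, so an observed value can only be expected to be
equal to or higher than this.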

Maybe you would like to point this out when you discuss the
values with the reviewers.

Best wishes,

Manfred.


********************************************************************
*                                                                  *
*                    Dr. Manfred S. Weiss                          *
*                                                                  *
*                         Team Leader                              *
*                                                                  *
* EMBL Hamburg Outstation                    Fon: +49-40-89902-170 *
* c/o DESY, Notkestr. 85                     Fax: +49-40-89902-149 *
* D-22603 Hamburg                   Email: [EMAIL PROTECTED] *
* GERMANY                       Web: www.embl-hamburg.de/~msweiss/ *
*                                                                  *
********************************************************************


On Thu, 27 Apr 2006, Carlos Frazao wrote:

> When this thread began a few weeks ago I sent a reply to the bulletin
> basically saying that both Rmerge (or Rsym) and Rmeas (or Rrim) are
> USELESS for the purpose of establishing/discussing data-resolution
> cut-offs. Although they reflect "qualities" of the original individual
> measured intensities, we should be interested instead in the "quality"
> of the final averaged data. This depends largely on the redundancy, and
> therefore one should rely only on indicators that take redundancy into
> consideration. The only indicators I am aware of in this field are
> I/sigma(I) and Rpim.
>
> Now, I am also having difficulties with a referee who has objected to my
> resolution cut-offs (where Rmerge came to 60%) although I/sigma(I) = 3
> and Rpim = 40%. I believe that these last two indicators correlate
> reasonably, as I was expecting a precision of 33% (Rpim) when
> I/sigma(I) = 3. I also checked the Wilson plot diagram, where at the
> resolution limit it seems as "linear" as at lower resolution ranges and
> with a similar slope.
>
> I understand that in earlier days, when data were much, much harder to
> collect, crystallographers were very happy with a complete (although
> non-redundant) data set. Then, Rmerge (or Rsym) was perfectly adequate
> in practical terms, as data collection experiments were designed
> essentially to ensure that at the end of the week the data were
> complete; it was not practical to collect redundant data. Today, with
> CCDs and synchrotron radiation, we quite often try to obtain data as
> redundant as possible, in particular to detect anomalous signal, and
> Rmerge statistics are simply not applicable anymore.
>
> Could anyone comment on these ideas? (I recognise that I am certainly
> not a statistics expert.)
>
> Thanks
> Carlos
>
>
> Eleanor Dodson wrote:
>
> > To add a little to James's excellent summary.
> > As reviewers I think we should always question results where the
> > I/SigI is > 2-3 in the outer shell. Authors should at least be asked
> > to justify why they have not collected the best available experimental
> > data.
> > Ditto if Rfree is too low for the resolution (e.g. differing by < 5% at
> > 2.8A) the authors should be challenged - there are many ways of
> > underestimating your Rfree - all of which compromise the maximum
> > likelihood refinement, but they should be deprecated!
> >
> > To finish with a question that always puzzles me - why do structures
> > which generate very similar quality maps at similar resolutions have
> > such different Rfactor profiles? I have seen lovely final maps at 2A
> > with R < 18% etc, and also lovely final maps at 2A with Rfactors ~
> > 24%...  It might be a radiation damage phenomenon I guess.
> >   Eleanor
> >
> >
> > James Holton wrote:
> >
> >> Well, since I was mentioned by name, I suppose I should put my two
> >> cents in:
> >>
> >> Rmerge is NOT a good way to judge your last resolution shell!
> >>
> >> My advice if you are faced with a reviewer who complains your Rmerge
> >> is too high is to change the name to Rsym.  This is actually the
> >> appropriate name for the statistic you are quoting. Rmerge
> >> (traditionally) refers to the R factor of combining data from two
> >> crystals.  Rsym refers to the agreement between symmetry mates after
> >> scaling.
> >>
> >> Rsym (and Rmerge) used to be useful things to quote back when people
> >> applied a 3-sigma cutoff to their raw observation data.  Seems like a
> >> borderline criminal thing to do nowadays (and it is), but in the dark
> >> ages before maximum likelihood the only way to keep a least-squares
> >> refinement package from chasing noise was to make sure you didn't
> >> confuse it with a ton of weak (noisy) data. All "R" statistics are
> >> supposed to be measuring one type of error (R is for residual).
> >> Rmerge is supposed to measure non-isomorphism.  Rsym is supposed to
> >> measure deviation from true symmetry.  Rcryst and Rfree measure the
> >> "incorrectness" of your model.
> >> The absolute value of "R" statistics is only meaningful if you can
> >> normalize out the contribution of other sources of error.  Weak data
> >> have more random noise than strong data, and the more high-resolution
> >> data you include, the more weak data you will have.  Applying a
> >> 3-sigma cutoff eliminates any spots measured with more than ~33%
> >> error (if you believe your sigmas). The remaining strong spots have
> >> relatively little random error (from counting statistics), so the
> >> 3-sigma cutoff tends to "normalize" data collected from one crystal
> >> or another.  However, if you apply a 3-sigma cutoff, you will have
> >> fewer and fewer spots as you get out to high resolution.  This is why
> >> "completeness" became a criterion for the high-resolution limit.
> >>
> >> Anyway, in summary: I say don't worry about your Rmerge in the high
> >> resolution shell.  I/sd is much more meaningful.  Just be careful to
> >> optimize your error model (SDCORR in scala, error_scale_factor and
> >> estimated_error in scalepack) so that your scatter/sigma values in
> >> the scala log are close to one (or the final "Chi^2" in scalepack).
> >> As for the I/sd at which you should cut off your data: I use an I/sd
> >> of 1.5, mainly because it is a "compromise" between 1.0 (signal =
> >> noise) and 2.0 (signal = 2x noise).
> >>
> >> As a comment: I fear that the recent rash of structures with I/sd of
> >> 6 or 8 in the outer resolution shell is happening because Rfree is
> >> also subject to the unfortunate feature of "R" statistics mentioned
> >> above: you get a lower Rcryst and Rfree if you are willing to
> >> sacrifice a little "resolution".  I guess it is just too tempting to
> >> play with your resolution limit when you run out of model building
> >> ideas and your Rfree is still too high.  This is a BAD BAD thing to
> >> do.  BAD!!  Better to calculate an Rfree using only data with F/sd >
> >> 3 (note it as such!), and have the decency to deposit all your
> >> structure factors.
> >> -James Holton
> >> MAD Scientist
> >>
> >>
> >> Bart Hazes wrote:
> >>
> >>> Hi Ashima,
> >>>
> >>> With these statistics you shouldn't have to worry about reviewers,
> >>> it looks perfectly sensible. Actually I'm much more concerned about
> >>> the recent epidemic of overly pessimistic resolution cutoffs. In our
> >>> journal club at least half the papers have I/SigI in the highest
> >>> resolution bin in the 3-6 range which means they could have gotten
> >>> significantly higher resolution. There are situations where data
> >>> quality is more important than resolution, for instance (anomalous)
> >>> phasing, but I see the same with many native data sets.
> >>>
> >>> It is not clear to me whether people are placing the detector too far
> >>> from the crystal and thus not even measuring the highest resolution
> >>> data, or whether they just elect not to process those data. Why??? To
> >>> get nicer looking statistics???? That would be VERY bad practice!!!
> >>>
> >>> A kinder view is that the detector distance is set based on the
> >>> apparent resolution of the first image(s) which underestimates the
> >>> true resolution of a high redundancy data set. If you don't need a
> >>> long detector distance to resolve spots I prefer to select a
> >>> distance where my visible diffraction uses the central 80-90% of the
> >>> detector allowing mosflm to try to extract some sensible information
> >>> from beyond what the eye can see.
> >>>
> >>> This looks like something James Holton may have looked at. If so I'd
> >>> be interested to hear if he or the elves have come up with a magic
> >>> rule.
> >>>
> >>> Bart
> >>>
> >>> Ashima Bagaria wrote:
> >>>
> >>>> Hi all,
> >>>> In regard to my CCP4 question about the acceptable Rmerge values
> >>>> in the last resolution shell, various other parameters pertaining
> >>>> to the protein data at 3.5 A are
> >>>>
> >>>> I/sigmaI = 13.1 (2.3)
> >>>> %completeness = 95.7(96.8)
> >>>> multiplicity = 3.8
> >>>>
> >>>> All suggestions are welcome
> >>>>
> >>>> Regards
> >>>> ashima
>
>
> --
> **************************************
> Dr. Carlos Frazao
> Crystallography Department
> ITQB-UNL, Av Republica, Apartado 127
> 2781-901 Oeiras, Portugal
>
> Phone:  (351)-214469666
> FAX:    (351)-214433644
> e-mail: [EMAIL PROTECTED]
>         www.itqb.unl.pt
>
>
