Dear Loes,

Thanks for the message. To the best of my recollection (I actually come from
small-molecule crystallography), the problems of small-molecule
crystallographers when it comes to studying accurate electron densities
(e.g. bond densities and such) have mostly to do with separating the effect
of atomic thermal motion from the true residual bond densities, i.e. mostly
issues of modelling the thermal motion. TDS is a pain for small-molecule and
protein crystallographers alike. It's reminiscent of the British weather:
everybody complains about it but nobody does anything about it. Do
small-molecule crystallographers nowadays model TDS properly and correct the
data for it in studies of accurate electron density?

Modelling the thermal motion in proteins by B-factors is known to be a gross
over-simplification, for many reasons, some of which you mentioned. TDS is
another issue. Several groups have attempted in the past to deal with TDS in
protein crystals, but I'm not sure the community was convinced that it led
to an improvement of the data. Whether TDS is the main culprit for the
relatively high R-factors of protein structures (relative to small
molecules, that is) is not clear. Modelling TDS in protein data (both the
part that arises from protein dynamics and the part from crystal disorder),
in order to improve the data and the resulting atomic models, is a good
thing. Why that should logically lead to refinement against frames once TDS
has been (or rather, will have been) modelled properly and the data
corrected accordingly is not clear to me. I would think that working on one
(or a few) data sets that suffer from severe TDS, correcting the data, and
re-refining the models to see what difference it makes would be a good
starting point.

   Cheers,

               Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710





________________________________________
From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Loes 
Kroon-Batenburg [l.m.j.kroon-batenb...@uu.nl]
Sent: Tuesday, June 25, 2013 1:09 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Refinement against frames

Dear Boaz,

Indeed, small-molecule crystallographers routinely convert pixels into I's
and can refine structures to very low R-values, but only to a limited
resolution. The Bragg intensities are very strong, and the background
scattering goes almost unnoticed. Once they start studying accurate electron
densities, the flaws in the models (Icalc) become apparent.
However, protein crystals are different: they have large disordered solvent
regions, disorder in the protein conformations, and background scattering
from the mother liquor/air/crystal mount that may be even stronger than the
many weak intensities. The disorder of the protein will also lead to
incoherent scattering that produces significant background, which at
moderate B-factors may make up half of the total scattering. Converting
pixel intensities into I_bragg (after subtracting some background) and
refining against those (or F's) is clearly a simplification, and gives us
only the average structure, not the true structure. The disorder may also
lead to more structured non-Bragg scattering, which we call diffuse
scattering, indicating that our crystal is in fact not periodic.
Understanding what is really going on in our crystal, and trying to model
the observed raw diffraction patterns, is in fact very interesting: it may
solve the problems of trying to convert I's to F's, may give a better
estimate of the 'average' structure, and may tell us how the protein
molecules are really behaving (in the crystal).
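
As a quick back-of-the-envelope check of that "half of the total scattering"
estimate, here is a minimal Python sketch (numpy only). It assumes
uncorrelated Gaussian displacements, so the Bragg intensity is damped by the
Debye-Waller factor exp(-2*B*s^2) with s = sin(theta)/lambda, and the damped
part reappears as diffuse scattering; B = 30 A^2 is just an assumed, fairly
typical protein value, not taken from any particular data set.

   import numpy as np

   B = 30.0                            # A^2, assumed overall B-factor
   d = np.array([4.0, 3.0, 2.0, 1.5])  # resolution shells in A
   s = 1.0 / (2.0 * d)                 # sin(theta)/lambda = 1/(2d)
   bragg = np.exp(-2.0 * B * s**2)     # fraction kept in the Bragg peaks
   for di, fb in zip(d, bragg):
       print(f"d = {di:4.1f} A : Bragg {fb:6.1%}, diffuse {1 - fb:6.1%}")

Already by 3-4 A resolution the damped part, which reappears as diffuse
scattering, rivals or exceeds the Bragg part.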
Trying to model diffraction images comes with lots of additional problems,
because instrumental characteristics also have to be modelled. However, it
is a very interesting route to go down. There may be a moment in the future
when we think we can do this. It would be good if by then we had raw images
available of all those weirdly diffracting crystals from which we managed,
in some way or another, to extract I_bragg (or Ispot-Iback).

Greetings,
Loes.

On 06/24/13 14:21, Boaz Shaanan wrote:
> Hi Tim,
>
> I agree with you.  Another point to remember about this issue of pixel->F's
> (or I's) conversion is that small-molecule crystallographers take the same
> route and produce structures with 1-2% R-factors, so this conversion is
> hardly our problem. The main culprits in the issues that have been discussed
> so lucidly on the BB recently have mostly to do with the vast number of weak
> reflections in diffraction patterns of macromolecules (and how to decide on
> resolution in such situations). Digging into the peak/background pixels and
> the signal/noise ratio there is just going to open another Pandora's box.
>
> My 2p thoughts.
>
>           Cheers,
>
>                   Boaz
>
>
> ________________________________________
> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Tim Gruene 
> [t...@shelx.uni-ac.gwdg.de]
> Sent: Monday, June 24, 2013 2:59 PM
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Refinement against frames
>
> Dear John,
>
> actually I am not a friend of this idea. Processing software does an
> excellent job of removing the instrumental part from our data. If we
> start to refine against frames, the next structural title might be
> something like "Crystal structure of ABC at x Å resolution, measured at
> beamline xyz with a frame width of f degrees and a total rotation
> range of phi degrees...". The point I am trying to make is that once
> we refine against frames, we may have to take a lot of issues into
> account when interpreting the structure.
> And do you think that refining against frames will actually give
> greater chemical or biological insight into the sample, or will it
> only give a more accurate description of the crystal contents? These
> are two different things and the latter is - in my opinion - not what
> structures are about.
>
> Best, Tim
>
> P.S.: I changed the subject line because the thread-based sorting of
> my emails is soon going to exceed the width of my screen for the
> original one.
>
> On 06/24/2013 08:13 AM, Jrh wrote:
>> Dear Tom, I find this suggestion of using the full images an
>> excellent and visionary one. So, how to implement it? We are part
>> way along the path with James Holton's reverse Mosflm. The computer
>> memory challenge could be ameliorated by simple pixel averaging at
>> least initially. The diffuse scattering would be the ultimate gold
>> at the end of the rainbow. Peter Moore's new book, inter alia,
>> carries many splendid insights into the diffuse scattering in our
>> diffraction patterns. Full-profile analyses have become a firm trend in
>> other fields, admittedly with simpler computing overheads.
>> Greetings, John
>>
>> Prof John R Helliwell DSc FInstP
>>
>>
>>
>> On 21 Jun 2013, at 23:16, "Terwilliger, Thomas C"
>> <terwilli...@lanl.gov>  wrote:
>>
>>> I hope I am not duplicating too much of this fascinating
>>> discussion with these comments:  perhaps the main reason there is
>>> confusion about what to do is that neither F nor I is really the
>>> most suitable thing to use in refinement.  As pointed out several
>>> times in different ways, we don't measure F or I, we only measure
>>> counts on a detector.  As a convenience, we "process" our
>>> diffraction images to estimate I or F and their uncertainties and
>>> model these uncertainties as simple functions (e.g., a Gaussian).
>>> There is no need in principle to do that, and if we were to
>>> refine instead against the raw image data these issues about
>>> positivity would disappear and our structures might even be a
>>> little better.
>>>
>>> Our standard procedure is to estimate F or I from counts on the
>>> detector, then to use these estimates of F or I in refinement.
>>> This is not so easy to do right because F or I contain many terms
>>> coming from many pixels and it is hard to model their statistics
>>> in detail.  Further, attempts we make to estimate either F or I
>>> as physically plausible values (e.g., using the fact that they
>>> are not negative) will generally be biased (the values after
>>> correction will generally be systematically low or systematically
>>> high, as is true for the French and Wilson correction and as
>>> would be true for the truncation of I at zero or above).
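>>>
>>> (For concreteness, a minimal sketch of an acentric French-and-Wilson-style
>>> estimate, in Python with scipy; this illustrates the idea rather than the
>>> actual ctruncate implementation, and the Wilson scale Sigma below is an
>>> assumed parameter. With an exponential (acentric Wilson) prior, the
>>> posterior for the true intensity J, given a Gaussian measurement
>>> Iobs +/- sigma, is a Gaussian truncated at zero, so its mean is always
>>> positive:
>>>
>>>    import numpy as np
>>>    from scipy.stats import norm
>>>
>>>    def posterior_mean_acentric(i_obs, sigma, Sigma):
>>>        # Exponential prior p(J) ~ exp(-J/Sigma), J >= 0, times a Gaussian
>>>        # likelihood N(i_obs; J, sigma^2): completing the square gives a
>>>        # normal posterior, location mu, truncated at J = 0.
>>>        mu = i_obs - sigma**2 / Sigma
>>>        a = mu / sigma
>>>        return mu + sigma * norm.pdf(a) / norm.cdf(a)
>>>
>>>    print(posterior_mean_acentric(i_obs=-5.0, sigma=10.0, Sigma=50.0))
>>>
>>> Even a negative net measurement comes out as a positive estimate, and for
>>> the weakest reflections such corrected values are systematically high,
>>> which is exactly the bias described above.)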
>>>
>>> Randy's method for intensity refinement is an improvement because
>>> the statistics are treated more fully than just using an estimate
>>> of F or I and assuming its uncertainty has a simple distribution.
>>> So why not avoid all the problems with modeling the statistics of
>>> processed data and instead refine against the raw data?  From the
>>> structural model you calculate F, from F and a detailed model of
>>> the experiment (the same model that is currently used in data
>>> processing) you calculate the counts expected on each pixel. Then
>>> you calculate the likelihood of the data given your models of the
>>> structure and of the experiment.  This would have lots of
>>> benefits because it would allow improved descriptions of the
>>> experiment (decay, absorption, detector sensitivity, diffuse
>>> scattering and other "background" on the images,....on and on)
>>> that could lead to more accurate structures in the end.  Of
>>> course there are some minor issues about putting all this in
>>> computer memory for refinement....
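>>>
>>> (In skeleton form the likelihood described here is a per-pixel Poisson
>>> term summed over every pixel of every frame. A minimal Python sketch, in
>>> which 'expected' stands in for the counts predicted from |Fcalc|^2 plus
>>> the full experimental model and is purely a placeholder:
>>>
>>>    import numpy as np
>>>    from scipy.special import gammaln
>>>
>>>    def pixel_log_likelihood(observed, expected):
>>>        # log P(k; lam) = k*log(lam) - lam - log(k!), summed over pixels
>>>        expected = np.clip(expected, 1e-9, None)   # guard against log(0)
>>>        return np.sum(observed * np.log(expected)
>>>                      - expected
>>>                      - gammaln(observed + 1.0))
>>>
>>> Refinement against frames would then maximize this jointly over the
>>> structural and experimental parameters; the memory issue is that
>>> 'observed' is every pixel of every image.)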
>>>
>>> -Tom T
>>>
>>> ________________________________________
>>> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Phil [p...@mrc-lmb.cam.ac.uk]
>>> Sent: Friday, June 21, 2013 2:50 PM
>>> To: CCP4BB@JISCMAIL.AC.UK
>>> Subject: Re: [ccp4bb] ctruncate bug?
>>>
>>> However you decide to argue the point, you must consider _all_
>>> the observations of a reflection (replicates and symmetry
>>> related) together when you infer Itrue or F etc, otherwise you
>>> will bias the result even more. Thus you cannot (easily) do it
>>> during integration
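>>>
>>> (For reference, the usual way to pool them is an inverse-variance
>>> weighted mean over all observations of a unique reflection; a generic
>>> Python sketch, not any particular program's merging formula:
>>>
>>>    import numpy as np
>>>
>>>    def merge_observations(i_obs, sig_obs):
>>>        # replicates plus symmetry mates of one unique reflection
>>>        w = 1.0 / np.asarray(sig_obs, dtype=float) ** 2
>>>        i_merged = np.sum(w * np.asarray(i_obs, dtype=float)) / np.sum(w)
>>>        return i_merged, 1.0 / np.sqrt(np.sum(w))
>>>
>>>    print(merge_observations([12.0, -3.0, 7.5], [4.0, 6.0, 5.0]))
>>>
>>> The point above is then that any positivity handling belongs after this
>>> pooling: truncating the individual observations first would bias the
>>> weighted mean upward.)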
>>>
>>> Phil
>>>
>>> Sent from my iPad
>>>
>>> On 21 Jun 2013, at 20:30, Douglas Theobald
>>> <dtheob...@brandeis.edu>  wrote:
>>>
>>>> On Jun 21, 2013, at 2:48 PM, Ed Pozharski
>>>> <epozh...@umaryland.edu>  wrote:
>>>>
>>>>> Douglas,
>>>>>>> Observed intensities are the best estimates that we can
>>>>>>> come up with in an experiment.
>>>>>> I also agree with this, and this is the clincher.  You are
>>>>>> arguing that Ispot-Iback=Iobs is the best estimate we can
>>>>>> come up with.  I claim that is absurd.  How are you
>>>>>> quantifying "best"?  Usually we have some sort of
>>>>>> discrepancy measure between true and estimate, like RMSD,
>>>>>> mean absolute distance, log distance, or somesuch.  Here is
>>>>>> the important point --- by any measure of discrepancy you
>>>>>> care to use, the person who estimates Iobs as 0 when
>>>>>> Iback>Ispot will *always*, in *every case*, beat the person
>>>>>> who estimates Iobs with a negative value.   This is an
>>>>>> indisputable fact.
>>>>> First off, you may find it useful to avoid such words as
>>>>> absurd and indisputable fact.  I know political correctness
>>>>> may be sometimes overrated, but if you actually plan to have
>>>>> meaningful discussion, let's assume that everyone responding
>>>>> to your posts is just trying to help figure this out.
>>>> I apologize for offending and using the strong words --- my
>>>> intention was not to offend.  This is just how I talk when
>>>> brainstorming with my colleagues around a blackboard, but of
>>>> course then you can see that I smile when I say it.
>>>>
>>>>> To address your point, you are right that J=0 is closer to the
>>>>> "true intensity" than a negative value.  The problem is that
>>>>> we are not after a single intensity, but rather all of them,
>>>>> as they all contribute to electron density reconstruction.
>>>>> If you replace negative Iobs with E(J), you would
>>>>> systematically inflate the averages, which may turn
>>>>> problematic in some cases.
>>>> So, I get the point.  But even then, using any reasonable
>>>> criterion, the whole estimated dataset will be closer to the
>>>> true data if you set all "negative" intensity estimates to 0.
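>>>>
>>>> (Both halves of this exchange are easy to check numerically; a Python
>>>> sketch in which all distribution parameters are assumed purely for
>>>> illustration:
>>>>
>>>>    import numpy as np
>>>>
>>>>    rng = np.random.default_rng(0)
>>>>    j_true = rng.exponential(5.0, 100_000)           # true I, always >= 0
>>>>    i_obs = j_true + rng.normal(0.0, 10.0, 100_000)  # noisy Ispot - Iback
>>>>    i_trunc = np.maximum(i_obs, 0.0)                 # negatives set to 0
>>>>    rmsd = lambda e: np.sqrt(np.mean((e - j_true) ** 2))
>>>>    print("RMSD raw / truncated:", rmsd(i_obs), rmsd(i_trunc))
>>>>    print("mean true / raw / truncated:",
>>>>          j_true.mean(), i_obs.mean(), i_trunc.mean())
>>>>
>>>> Truncation always lowers the per-reflection error, as argued here, and
>>>> it also inflates the mean, exactly as Ed warns above.)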
>>>>
>>>>> It is probably better to stick with "raw intensities" and
>>>>> construct theoretical predictions properly to account for
>>>>> their properties.
>>>>>
>>>>> What I was trying to tell you is that observed intensities are
>>>>> what we get from the experiment.
>>>> But they are not what you get from the detector.  The detector
>>>> spits out a positive value for what's inside the spot.  It is
>>>> we, as human agents, who later manipulate and massage that data
>>>> value by subtracting the background estimate.  A value that has
>>>> been subjected to a crude background subtraction is not the raw
>>>> experimental value.  It has been modified, and there must be
>>>> some logic to why we massage the data in that particular
>>>> manner.  I agree, of course, that the background should be
>>>> accounted for somehow.  But why just subtract it away?  There
>>>> are other ways to massage the data --- see my other post to
>>>> Ian.  My argument is that however we massage the experimentally
>>>> observed value should be physically informed, and allowing
>>>> negative intensity estimates violates the basic physics.
>>>>
>>>> [snip]
>>>>
>>>>>>> These observed intensities can be negative because while
>>>>>>> their true underlying value is positive, random errors may
>>>>>>> result in Iback>Ispot.  There is absolutely nothing
>>>>>>> unphysical here.
>>>>>> Yes there is.  The only way you can get a negative estimate
>>>>>> is to make unphysical assumptions.  Namely, the estimate
>>>>>> Ispot-Iback=Iobs assumes that both the true value of I and
>>>>>> the background noise come from a Gaussian distribution that
>>>>>> is allowed to have negative values.  Both of those
>>>>>> assumptions are unphysical.
>>>>> See, I have a problem with this.  Both common sense and laws
>>>>> of physics dictate that the number of photons hitting a spot on a
>>>>> detector is a positive number.  There is no law of physics
>>>>> that dictates that under no circumstances could there be
>>>>> Ispot<Iback.
>>>> That's not what I'm saying.  Sure, Ispot can be less than Iback
>>>> randomly.  That does not mean we have to estimate the detected
>>>> intensity as negative, after accounting for background.
>>>>
>>>>> Yes, E(Ispot)>=E(Iback).  Yes, E(Ispot-Iback)>=0.  But
>>>>> P(Ispot-Iback<0)>0, and therefore experimental sampling of
>>>>> Ispot-Iback is bound to occasionally produce negative values.
>>>>> What law of physics is broken when, for a given reflection, the
>>>>> total number of photons in the spot pixels is less than the total
>>>>> number of photons in an equal number of pixels in the
>>>>> surrounding background mask?
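>>>>>
>>>>> (How often that happens for a genuinely weak reflection is a two-line
>>>>> check; the Poisson rates here are assumed purely for illustration:
>>>>>
>>>>>    import numpy as np
>>>>>
>>>>>    rng = np.random.default_rng(1)
>>>>>    i_spot = rng.poisson(105.0, 1_000_000)  # background 100 + signal 5
>>>>>    i_back = rng.poisson(100.0, 1_000_000)  # matched background pixels
>>>>>    print("P(Ispot - Iback < 0) ~", np.mean(i_spot < i_back))
>>>>>
>>>>> With a net signal of 5 counts on a background of 100, roughly a third
>>>>> of the samples come out negative, with no physics violated.)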
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Ed.
>>>>>
>>>>> -- Oh, suddenly throwing a giraffe into a volcano to make
>>>>> water is crazy? Julian, King of Lemurs
> --
> Dr Tim Gruene
> Institut fuer anorganische Chemie
> Tammannstr. 4
> D-37077 Goettingen
>
> GPG Key ID = A46BEE1A


--

__________________________________________

Dr. Loes Kroon-Batenburg
Dept. of Crystal and Structural Chemistry
Bijvoet Center for Biomolecular Research
Utrecht University
Padualaan 8, 3584 CH Utrecht
The Netherlands

E-mail : l.m.j.kroon-batenb...@uu.nl
phone  : +31-30-2532865
fax    : +31-30-2533940
__________________________________________
