Dear Ian,

You still have an arbitrary threshold: at high resolution you see two 
disordered atoms off-axis and at low resolution you see one ordered atom 
on-axis. However, somewhere in between you or the program has to decide whether 
you still see two atoms or if the data (resolution) does not warrant such a 
statement and you switch to the one-atom model. As George Sheldrick confirmed, 
there is a discontinuous transition between the two, which does not correspond 
to the physical reality. There is no "quantum transition" or the like when the 
atom gets closer than a certain limit to a crystallographic symmetry element. 
The atom does not care, its position is just determined by the local force 
fields and if those force fields have two local minima close together, the atom 
will be disordered.

Whether the atom is added to the Fourier transform calculation once with full 
occupancy or twice with half occupancy is an arbitrary decision, made by the 
programmer or the user of the program.
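
To illustrate the continuity argument, here is a minimal Python sketch. It 
assumes a twofold axis through x = 0, a unit scattering factor, and a single 
index h perpendicular to the axis; the function names are hypothetical and not 
taken from any refinement program. It compares the Fourier contribution of two 
half-occupancy copies at +d and -d with that of one full-occupancy atom on the 
axis.

```python
import cmath
import math

def two_half_atoms(h, d):
    """Contribution of two half-occupancy copies at x = +d and x = -d (fractional)."""
    return (0.5 * cmath.exp(2j * math.pi * h * d)
            + 0.5 * cmath.exp(-2j * math.pi * h * d))

def one_full_atom(h):
    """Contribution of a single full-occupancy atom exactly on the axis (x = 0)."""
    return cmath.exp(2j * math.pi * h * 0.0)

# The difference shrinks smoothly as the copies approach the axis.
for d in (0.05, 0.01, 0.001, 0.0):
    diff = abs(two_half_atoms(3, d) - one_full_atom(3))
    print(f"d = {d:5.3f}  |F_two - F_one| = {diff:.6f}")
```

The difference equals 1 - cos(2*pi*h*d), so it goes to zero smoothly; nothing 
in the calculation itself marks a point where the model has to switch.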

Cheers,
Herman  

-----Original Message-----
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Ian Tickle
Sent: Wednesday, December 15, 2010 6:57 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Fwd: [ccp4bb] Wyckoff positions and protein atoms

That's my whole point, it's not an arbitrary threshold, it's determined 
completely by what the data are capable of telling you about the structure, 
depending on the resolution.  Either you have sufficient resolution to be able 
to say that the atom is disordered off the s.p. or you don't and you have no 
choice but to constrain it to the s.p., whether it's actually disordered or not.

In any case there is no discontinuous change in occupancy at all, I never 
suggested that there should be.  Say at high resolution you see
2 disordered atoms off-axis each with 1/2 occupancy, so total occupancy = 1.  
At lower resolution you see 1 ordered atom on-axis with occupancy 1 - so no 
change in total occupancy.  It makes absolutely no difference if instead you 
store multiplicity*occupancy in the file, the total occupancy is still 1.
However multiplicity*occupancy is not conserved so will change discontinuously 
(off-axis total = 1, on-axis total = 1/2); occupancy is conserved (it 
represents real atoms after all!).
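
For concreteness, a small Python sketch of this bookkeeping. It assumes that 
"multiplicity" means the fractional factor 1/n for an atom on an n-fold site 
(n = 1 for a general position); the function names are hypothetical and not 
part of any program's interface.

```python
from fractions import Fraction

def to_mult_times_occ(occupancy, site_order):
    """Chemical (CIF-style) occupancy -> multiplicity*occupancy for an n-fold site."""
    return Fraction(occupancy) / site_order

def to_occupancy(mult_times_occ, site_order):
    """multiplicity*occupancy -> chemical (CIF-style) occupancy."""
    return Fraction(mult_times_occ) * site_order

# Two disordered copies off a twofold axis (general position, site_order = 1).
off_axis = [to_mult_times_occ(Fraction(1, 2), 1)] * 2
# One ordered atom exactly on the twofold axis (site_order = 2).
on_axis = [to_mult_times_occ(1, 2)]

# Totals of multiplicity*occupancy: 1 off-axis, 1/2 on-axis.
print(sum(off_axis), sum(on_axis))
```

By the same arithmetic, to_mult_times_occ(1, 3) gives 1/3 for a fully occupied 
atom on a threefold axis, and to_occupancy(Fraction(1, 3), 3) recovers 1.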

I have no issue with Shel-X if it's writing out the occupancy for deposition 
(or at least making best efforts to do so).  What it does for intermediate 
files is the user's own data conversion problem if s/he decides to switch 
between different programs.

Cheers

-- Ian

On Wed, Dec 15, 2010 at 5:20 PM, George M. Sheldrick 
<gshe...@shelx.uni-ac.gwdg.de> wrote:
>
> I agree with Herman. It is simply not acceptable to have a sudden 
> discontinuous change in "effective occupancy" at some arbitrary point 
> as a disordered atom approaches a special position. Anyway, whatever 
> the CIF people decide, I will not introduce an incompatibility between 
> different versions of SHELX. When SHELXL produces a small molecule CIF 
> for deposition, it of course attempts to generate the occupancy 
> according to the CIF definition. Not too surprisingly, there are a few 
> complicated cases of 'nearly special positions' where the program gets this 
> wrong.
> This is probably the most serious known 'bug' in SHELXL, but is 
> proving rather difficult to eliminate completely.
>
> George
>
> Prof. George M. Sheldrick FRS
> Dept. Structural Chemistry,
> University of Goettingen,
> Tammannstr. 4,
> D37077 Goettingen, Germany
> Tel. +49-551-39-3021 or -3068
> Fax. +49-551-39-22582
>
>
> On Wed, 15 Dec 2010, Ian Tickle wrote:
>
>> Hi Herman
>>
>> What makes an atom on a special position is that it is literally ON 
>> the s.p.: it can't be 'almost on' the s.p. because then if you tried 
>> to refine the co-ordinates perpendicular to the axis you would find 
>> that the matrix would be singular or at least so badly conditioned 
>> that the solution would be poorly defined.  The only solution to that 
>> problem is to constrain (i.e. fix) these co-ordinates to be exactly 
>> on the axis and not attempt to refine them.  The data are telling you 
>> that you have insufficient resolution so you are not justified in 
>> placing the atom very close to the axis; the best you can do is place 
>> the atom with unit occupancy exactly _on_ the axis.  It's only once 
>> the atom is a 'significant' distance (i.e. relative to the 
>> resolution) away from the axis that these co-ordinates can be 
>> independently refined.  Then the data are telling you that the atom is 
>> disordered.
>> If you collected higher resolution data you might well be able to 
>> detect & successfully refine disordered atoms closer to the axis than 
>> with low resolution data.  So it has nothing to do with the 
>> programmer setting an arbitrary threshold.  This would have to be 
>> some complicated function of atom type, occupancy, B factor, 
>> resolution, data quality etc to work properly anyway so I doubt that 
>> it would be feasible.  Instead it's determined completely by what the 
>> data are capable of telling you about the structure, as indeed it should be.
>>
>> My main concern was the conflict between some program implementations 
>> and the PDB and mmCIF format descriptions on this issue.  For example 
>> the PDB documentation says that the ATOM record contains the 
>> occupancy (where this is defined in the CIF/mmCIF documentation).  If 
>> it had intended that it should contain multiplicity*occupancy instead 
>> then presumably it would have said so.
>>
>> Cheers
>>
>> -- Ian
>>
>> On Wed, Dec 15, 2010 at 4:01 PM,  <herman.schreu...@sanofi-aventis.com> 
>> wrote:
>> > Dear Ian,
>> >
>> > In my view, the confusion arises from NOT including the multiplicity in 
>> > the occupancy. If we make the gedanken experiment and look at a range of 
>> > crystal structures with a rotationally disordered water molecule near a 
>> > symmetry axis (they do exist!) then as long as the water molecule is 
>> > sufficiently far from the axis, it is clear that the occupancy should be 
>> > 1/2 or 1/3 or whatever is the multiplicity. However, as the molecule 
>> > approaches the axis at a certain moment at a certain threshold set by the 
>> > programmer of the refinement program, the molecule suddenly becomes 
>> > special and the occupancy is set to 1.0. So depending on rounding errors, 
>> > different thresholds etc. different programs may make different decisions 
>> > on whether a water is special or not.
>> >
>> > For me, this is confusing.
>> >
>> > Best regards,
>> > Herman
>> >
>> > -----Original Message-----
>> > From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf 
>> > Of Ian Tickle
>> > Sent: Wednesday, December 15, 2010 3:47 PM
>> > To: CCP4BB@JISCMAIL.AC.UK
>> > Subject: Re: [ccp4bb] Fwd: [ccp4bb] Wyckoff positions and protein 
>> > atoms
>> >
>> > Dear George
>> >
>> > I notice that the Oxford CRYSTALS program, which is what I used when I did 
>> > small-molecule crystallography and which is still quite popular among the 
>> > small-molecule people (maybe not as much as Shel-X!), uses the CIF 
>> > convention:
>> >
>> > OCC= This parameter defines the site occupancy EXCLUDING special position 
>> > effects (i.e. is the 'chemical occupancy'). The default is 1.0.  Special 
>> > position effects are computed by CRYSTALS and multiplied onto this 
>> > parameter.
>> >
>> > (from http://www.xtl.ox.ac.uk/crystalsmanual-atomic.html )
>> >
>> > Also the mmCIF specification on this is the same CIF one (hardly 
>> > surprising I guess since it's derived from it):
>> >
>> > _atom_site.occupancy  The fraction of the atom type present at this site.
>> > The sum of the occupancies of all the atom types at this site may not 
>> > significantly exceed 1.0 unless it is a dummy site.
>> >
>> > (from http://mmcif.pdb.org/dictionaries/mmcif_std.dic/Items/_atom_site.occupancy.html )
>> >
>> > which doesn't say so specifically, but it's implied since if the 
>> > multiplicity is included then the maximum value of the sum is the 
>> > multiplicity, not 1.0.
>> >
>> > So there's a real possibility of user - and programmer - confusion here!  
>> > I must say that until I looked at the 4INS file I had assumed that the PDB 
>> > occupancy was what it claimed to be, i.e. the real 'chemical' occupancy 
>> > not the multiplicity-fudged one.
>> >
>> > Cheers
>> >
>> > -- Ian
>> >
>> > On Wed, Dec 15, 2010 at 1:53 PM, George M. Sheldrick 
>> > <gshe...@shelx.uni-ac.gwdg.de> wrote:
>> >>
>> >> Dear Ian,
>> >>
>> >> Yes. Once an atom has been identified as on a special position 
>> >> because it is within a specified tolerance, SHELXL applies the 
>> >> appropriate constraints to both the coordinates and the Uij so 
>> >> there is no danger of the atom wandering off the special position. 
>> >> Usually, when an atom is very close to a special position but not 
>> >> actually on it, it is part of a disordered solvent molecule and 
>> >> will be prevented from misbehaving by distance and Uij restraints 
>> >> imposed by the user; in such a case the user usually also switches 
>> >> off the special position check for that disordered molecule (SPEC 
>> >> -1) to avoid atoms being idealized onto the special position by 
>> >> the program. For solvent molecules disordered on special positions 
>> >> it is also necessary to ignore symmetry equivalent atoms when 
>> >> generating idealized hydrogen atoms etc. (PART -N in SHELXL). This 
>> >> is all routine practice in small molecule crystallography. I agree 
>> >> that the use of orthogonal rather than crystal coordinates can obscure the 
>> >> situation, e.g. for an atom on a threefold axis.
>> >>
>> >> Best wishes, George
>> >>
>> >> Prof. George M. Sheldrick FRS
>> >> Dept. Structural Chemistry,
>> >> University of Goettingen,
>> >> Tammannstr. 4,
>> >> D37077 Goettingen, Germany
>> >> Tel. +49-551-39-3021 or -3068
>> >> Fax. +49-551-39-22582
>> >>
>> >>
>> >> On Wed, 15 Dec 2010, Ian Tickle wrote:
>> >>
>> >>> Dear George
>> >>>
>> >>> I would say that an atom has fractional occupancy (but unit
>> >>> multiplicity) unless it's exactly on the special position (though 
>> >>> I can foresee problems with rounding of decimal places for an 
>> >>> atom say at x=1/3), so that effectively once the atom is fixed 
>> >>> exactly on the s.p. the symmetry copies coalesce into a single 
>> >>> atom with unit occupancy (but fractional multiplicity).  This is 
>> >>> at least one advantage of having co-ordinates stored as 
>> >>> fractional - it would probably be more tricky with orthogonalised 
>> >>> co-ordinates.  Presumably once an input atom has satisfied the 
>> >>> condition of being 'sufficiently close' to a s.p. to be 
>> >>> considered as 'on' the s.p. then the constraints fix the 
>> >>> co-ordinates exactly on the special position and henceforth it's 
>> >>> forcibly prevented from moving off it?  In any case if an atom is 
>> >>> very close to its symmetry copy you are going to have matrix 
>> >>> conditioning problems for the co-ordinates perpendicular to the 
>> >>> axis of symmetry (or mirror plane), so then you have no choice 
>> >>> but to disallow co-ordinate shifts of the atom which would take it off 
>> >>> the special position?
>> >>>
>> >>> Cheers
>> >>>
>> >>> -- Ian
>> >>>
>> >>> On Wed, Dec 15, 2010 at 11:42 AM, George M. Sheldrick 
>> >>> <gshe...@shelx.uni-ac.gwdg.de> wrote:
>> >>> >
>> >>> > Dear Ian,
>> >>> >
>> >>> > Of course I could convert the occupancy on reading the atom in 
>> >>> > and convert it back again on writing it out. This is not quite 
>> >>> > so trivial as it sounds because I need to set a threshold as to 
>> >>> > how close the atom has to be to a special position to be 
>> >>> > treated as special, and take care that rounding errors have the 
>> >>> > same effect on input and output and that the coordinates have 
>> >>> > not moved in or out of the special zone in the meantime.
>> >>> >
>> >>> > As it stands in SHELX, an atom that is near to a twofold will 
>> >>> > have an occupancy of 0.5 whether it is disordered close to a 
>> >>> > special position or whether it is really special, so this is never a 
>> >>> > problem.
>> >>> >
>> >>> > SHELXL is mainly used for small molecules that frequently have 
>> >>> > atoms on special positions, and disordered solvent molecules 
>> >>> > approximately on special positions are also very common (for 
>> >>> > example in centrosymmetric space groups toluene usually lies on 
>> >>> > the center of symmetry). Occupancies are often tied to free 
>> >>> > variables which would also complicate any changes to the code. 
>> >>> > And in any case, SHELX has been upwards compatible for the last 35 
>> >>> > years and I wish it to remain that way.
>> >>> >
>> >>> > Best wishes, George
>> >>> >
>> >>> > Prof. George M. Sheldrick FRS
>> >>> > Dept. Structural Chemistry,
>> >>> > University of Goettingen,
>> >>> > Tammannstr. 4,
>> >>> > D37077 Goettingen, Germany
>> >>> > Tel. +49-551-39-3021 or -3068
>> >>> > Fax. +49-551-39-22582
>> >>> >
>> >>> >
>> >>> > On Wed, 15 Dec 2010, Ian Tickle wrote:
>> >>> >
>> >>> >> Dear George
>> >>> >>
>> >>> >> Is applying the multiplicity factor to the occupancy 
>> >>> >> internally in the program such an issue anyway?  It need only 
>> >>> >> be done once per atom on input (i.e. you multiply each input 
>> >>> >> occupancy by the multiplicity to get the combined 
>> >>> >> multiplicity*occupancy value that you would have reading in 
>> >>> >> directly in the current version), and then once per atom again 
>> >>> >> on output, reversing the process.  There shouldn't be any need 
>> >>> >> to change anything in the inner atom/reflection loop where obviously 
>> >>> >> it would indeed have slowed things down.
>> >>> >>
>> >>> >> I can see though that the backwards-compatibility issue is 
>> >>> >> more serious.  However I suspect it will affect only a small 
>> >>> >> proportion of cases (though I accept that the fact that it may 
>> >>> >> affect any at all may be sufficient grounds for you to reject 
>> >>> >> it!).  If the input value exceeds the multiplicity we can say 
>> >>> >> that it's definitely an occupancy (otherwise clearly the 
>> >>> >> occupancy would be > 1).  If it's less there's an ambiguity for sure; 
>> >>> >> however then it's more likely to be the multiplicity*occupancy (so the 
>> >>> >> occupancy is nearer to 1), on the grounds that small 
>> >>> >> occupancies are less likely to be observed, because the effect 
>> >>> >> on diffraction will be less significant.  I accept that 
>> >>> >> second-guessing the user's intentions in this way is not 
>> >>> >> ideal!  I wonder how often fractional occupancies are observed at 
>> >>> >> special positions anyway?
>> >>> >>
>> >>> >> Regards
>> >>> >>
>> >>> >> -- Ian
>> >>> >>
>> >>> >> On Fri, Dec 10, 2010 at 11:28 PM, George M. Sheldrick 
>> >>> >> <gshe...@shelx.uni-ac.gwdg.de> wrote:
>> >>> >> > SHELXL also expects that the occupancy of a fully occupied 
>> >>> >> > atom on a threefold axis should be set at 1/3, and will 
>> >>> >> > generate this automatically if necessary. It will also 
>> >>> >> > generate automatically the necessary constraints for the x, 
>> >>> >> > y and z parameters (and for the Uij if the atom is 
>> >>> >> > anisotropic). It is essential that this is done correctly if 
>> >>> >> > a full-matrix refinement is being performed (e.g. to get esd 
>> >>> >> > estimates), otherwise the refinement can explode. The user 
>> >>> >> > may change or switch off the tolerance for detecting whether 
>> >>> >> > an atom is on a special position (with the SPEC 
>> >>> >> > instruction). Setting the occupancy to a fraction avoided a 
>> >>> >> > complicated IF construction inside a loop and 35 years ago 
>> >>> >> > computers were so slow! I can't change it now because I have 
>> >>> >> > to preserve upwards compatibility. Unfortunately the CIF 
>> >>> >> > committee decided to use the other definition (i.e. the Zn 
>> >>> >> > on the threefold axis has an occupancy of 1.0) and this has caused 
>> >>> >> > considerable confusion in the small molecule world ever since; 
>> >>> >> > atoms are frequently encountered on special positions in inorganic 
>> >>> >> > and mineral structures.
>> >>> >> >
>> >>> >> > George
>> >>> >> >
>> >>> >> > Prof. George M. Sheldrick FRS Dept. Structural Chemistry, 
>> >>> >> > University of Goettingen, Tammannstr. 4,
>> >>> >> > D37077 Goettingen, Germany
>> >>> >> > Tel. +49-551-39-3021 or -3068 Fax. +49-551-39-22582
>> >>> >> >
>> >>> >> >
>> >>> >> > On Fri, 10 Dec 2010, Ed Pozharski wrote:
>> >>> >> >
>> >>> >> >> On Fri, 2010-12-10 at 21:53 +0000, Ian Tickle wrote:
>> >>> >> >> > Hmmm - but shouldn't the occupancy of the Zn be 1.00 if 
>> >>> >> >> > it's on the special position
>> >>> >> >>
>> >>> >> >> Shouldn't 1/3 be better for programming purposes?  If you 
>> >>> >> >> set occupancy to 1.0, then you should specify that symmetry 
>> >>> >> >> operators do not apply for these atoms, making Fc calculation a 
>> >>> >> >> bit more cumbersome.
>> >>> >> >>
>> >>> >> >> If definition of the "asu content" is "you get full content 
>> >>> >> >> of the unit cell after applying symmetry operators", then 
>> >>> >> >> occupancy *must* be 1/3, right?
>> >>> >> >>
>> >>> >> >> The first zinc and the water are on special position, but 
>> >>> >> >> because they are not excluded from positional refinement 
>> >>> >> >> (perhaps they should be), they will drift a bit.  CNS has 
>> >>> >> >> distance cutoff for treating atoms as special positions, if 
>> >>> >> >> it jumps over the limit during, say, simulated annealing, 
>> >>> >> >> it  will cause problems.  Perhaps PROLSQ did something 
>> >>> >> >> similar.  It is a good question if it's better to fix these 
>> >>> >> >> in place or let them wobble a bit to account for some 
>> >>> >> >> potential disorder.  While I see the formal argument that 
>> >>> >> >> it should be nailed to three-fold axes, it is also true 
>> >>> >> >> that this is a mathematical compromise to simplify modeling that 
>> >>> >> >> does not reflect physical reality (i.e.
>> >>> >> >> you don't have three partially occupied zinc ions, it's 
>> >>> >> >> just one).  In any event, given that this is a 1.5A structure, 
>> >>> >> >> (-0.002 0.004) is statistically speaking the same as (0 0).
>> >>> >> >>
>> >>> >> >> Cheers,
>> >>> >> >>
>> >>> >> >> Ed.
>> >>> >> >>
>> >>> >> >> --
>> >>> >> >> "I'd jump in myself, if I weren't so good at whistling."
>> >>> >> >>                                Julian, King of Lemurs
>> >>> >> >>
>> >>> >> >>
>> >>> >> >
>> >>> >>
>> >>> >>
>> >>>
>> >>>
>> >
>>
>>
