
Re: [ccp4bb] what to do with disordered side chains

2011-04-05 Thread Dale Tronrud


On 4/4/2011 2:15 PM, Jacob Keller wrote:

I like your IMGATM proposal, but wouldn't it also potentially break
some of the programs?


   That depends on the program.  Programs I write that read PDB files
silently ignore keywords that they don't recognize.  A model with
IMGATM (or whatever keyword you standardize on) records would be
interpreted as though those dummy atoms don't exist.  If a program
died because of them, or if the PDB consumer wanted to "see" the
dummy atoms, the keywords could be replaced with ATOM using a text
editor and a global substitute, and the user would be aware that
there is something different about those atoms.
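
A minimal sketch of the behaviour described above, assuming fixed-column
PDB records. IMGATM is only the keyword proposed in this thread and is not
part of any released PDB format; the function names and the three-atom model
are made up for illustration.

```python
# IMGATM is the hypothetical record name proposed in this thread;
# it is not part of the official PDB format.
KNOWN_ATOM_RECORDS = ("ATOM  ", "HETATM")


def read_atoms(lines):
    """Return only ATOM/HETATM records; silently skip everything else."""
    return [ln for ln in lines if ln[:6] in KNOWN_ATOM_RECORDS]


def promote_dummies(lines):
    """Global substitute: rewrite IMGATM records as ATOM records."""
    return ["ATOM  " + ln[6:] if ln.startswith("IMGATM") else ln
            for ln in lines]


# Invented records for a lysine whose density stops at CB.
model = [
    "ATOM      1  CA  LYS A  10      11.000  12.000  13.000  1.00 20.00           C",
    "ATOM      2  CB  LYS A  10      12.000  12.500  13.500  1.00 25.00           C",
    "IMGATM    3  CG  LYS A  10      13.000  13.000  14.000  1.00 60.00           C",
]

real_only = read_atoms(model)                      # dummy atom is ignored
with_dummies = read_atoms(promote_dummies(model))  # dummy atom is now read
```

A reader written this way needs no change to cope with the new keyword,
and the substitution step is an explicit choice by the user.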

   I would hope programs would be modified to do sensible things
with the dummy atoms since they would have a clear indication that
the atoms are indeed dummy.  For a graphics program, maybe the bonds
involving dummy atoms could be drawn at half brightness.  They would
be visible but clearly more ghost-like than the majority
of atoms in the model.  A refinement program could strip them out,
perform the refinement, and rebuild them at the end, if needed,
using WASNIAHC.  I expect they would also be ignored completely in
MR and homology modeling/comparison programs.  In fact, pretty much
any use I would make of the PDB file would involve discarding all
the dummy atoms, but with this scheme I could at least know for
sure which atoms are fantasy and which were built based on density.


Also--and this is a problem with deleting only
sidechain atoms in general--it seems that many, myself included, might
totally miss that an apparent "alanine" is really a trunco-lysine.
What I like is that it does get around the problem of people
over-interpreting bogus sidechains, but it falls short, perhaps, in
misleading people about what residue is there. I, for one, would not
feel that I had to click on all the alanines in a model to verify that
they were not lysines, and would be surprised and puzzled for a while
about why this ala said lys when I clicked on it. Wouldn't you be
surprised? (Well, maybe not after this thread...)


   I am surprised any time I see all the atoms in a lysine on the surface.
"What could possibly be holding that thing in place?" is what jumps to my
mind.  When I see a side chain on the surface that ends at CB or CG I
just assume it is something long and waving in the breeze.  I guess it
all depends on what you are used to looking at.

   With dummy atoms that are clearly labeled as such, the graphics
programs can be programmed as I described above and we would both have
the visual cues that we desire.

   Another advantage of keeping the "dummy flag" separate from the occupancy
and B factor fields is that these are then free to be used in the way
they were intended.  Numerous times I have built side chains that are
visible to their end, but a second conformation ends at the CG.  I split
these side chains into A and B parts with a complete A and a partial B and
the group occupancies of A and B sum to 1.0.  Now if you tell me that
I have to build the entire B side chain and must flag the dummy atoms
with occ=0.0 we have a problem.  For the dummy atoms the occupancies don't
sum to 1.0 any more.  Logic tells me that the occupancy of the dummy atoms
should be the same as all the real B atoms.
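
The occupancy bookkeeping in the paragraph above can be sketched as follows.
The 0.6/0.4 split, the tuple layout and the function name are assumptions
made for illustration; only the argument itself comes from the text.

```python
from collections import defaultdict


def occupancy_sums(atoms):
    """Sum occupancies over alternate conformations for each atom name."""
    totals = defaultdict(float)
    for name, altloc, occ in atoms:
        totals[name] += occ
    return dict(totals)


# Conformation A is complete; conformation B is only visible out to CG.
lys = [("CB", "A", 0.6), ("CB", "B", 0.4),
       ("CG", "A", 0.6), ("CG", "B", 0.4),
       ("CD", "A", 0.6)]

# Convention 1: the dummy CD of conformation B is flagged with occ=0.0.
flagged_by_zero = lys + [("CD", "B", 0.0)]

# Convention 2: the dummy CD keeps the group occupancy of B and the
# "dummy" flag is stored somewhere else (for example in the record name).
flagged_separately = lys + [("CD", "B", 0.4)]

sums_zero = occupancy_sums(flagged_by_zero)
sums_sep = occupancy_sums(flagged_separately)
```

Under the first convention the CD site sums to 0.6 rather than 1.0; under
the second every site sums to 1.0.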

   This particular case is a good example of why I don't like the idea
of building complete side chains in the absence of density.  If you are
going to build out my B conformation you have to recognize that the reason
I don't see density beyond the CG is that there is a B and C conformation
for the next CD atom (remember I already have an A conformation for CD
elsewhere).  To make a logically complete side chain I need to build
two dummy conformations for this residue and split my "real" CG, CB, and
CA B conformation atoms with no way to decide the relative occupancies of
the B and C conformations.  That's a lot of complexity for a blurry bit of
density.  Hell, I have every reason to expect that there is a D conformation
in there too - do I have to build that as well?

   If you expect such a shrub to be built for every surface lysine, the
IMGATM keyword and the program WASNIAHC would allow it to be generated
and represented in an unambiguous and minimally confusing fashion.  I
wouldn't be happy having to add imaginary atoms to my models, but the
representation meets my criteria, and I think it meets yours too.

Dale Tronrud



JPK



On Mon, Apr 4, 2011 at 1:55 AM, Dale Tronrud  wrote:

   The definition of _atom_site.occupancy is

  The fraction of the atom type present at this site.
  The sum of the occupancies of all the atom types at this site
  may not significantly exceed 1.0 unless it is a dummy site.

When an atom has an occupancy equal to zero that means that the
atom is NEVER present at that site - and that is not what you
intend to say.  Setting the occupancy to zero does not mean that
a full atom is located somewhere in this area.  Quite the opposite.

   (The refe

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Robbie Joosten

Hi Jacob,

The PDB header has a record for missing atoms. Coot has an option to find them 
and any decent validation software will warn about incomplete residues. There 
are PDBREPORT entries for every PDB file with a list of incomplete residues. If 
a user makes a very small effort, he doesn't have to go around clicking every 
'alanine'.
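
As a rough sketch of what such a check involves (this is not how Coot or
PDBREPORT implement it), one can compare the atoms present in each residue
with the atoms its residue name implies. The small lookup table and the
residue tuples below are assumptions made for the example.

```python
# Expected side-chain heavy atoms for a few residue types (standard PDB names).
EXPECTED_SIDE_CHAIN = {
    "ALA": {"CB"},
    "LYS": {"CB", "CG", "CD", "CE", "NZ"},
    "SER": {"CB", "OG"},
}


def incomplete_residues(residues):
    """Return (residue id, name, missing atoms) for residues lacking atoms."""
    report = []
    for res_id, res_name, atom_names in residues:
        expected = EXPECTED_SIDE_CHAIN.get(res_name)
        if expected is None:
            continue  # residue type not covered by this small table
        missing = expected - set(atom_names)
        if missing:
            report.append((res_id, res_name, sorted(missing)))
    return report


residues = [
    ("A10", "LYS", ["N", "CA", "C", "O", "CB"]),  # truncated lysine
    ("A11", "ALA", ["N", "CA", "C", "O", "CB"]),  # genuine alanine
]
report = incomplete_residues(residues)
```

Only the truncated lysine is reported; the genuine alanine is not.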

Cheers,
Robbie

> Date: Mon, 4 Apr 2011 16:15:58 -0500
> From: j-kell...@fsm.northwestern.edu
> Subject: Re: [ccp4bb] what to do with disordered side chains
> To: CCP4BB@JISCMAIL.AC.UK
> 
> I like your IMGATM proposal, but wouldn't it also potentially break
> some of the programs? Also--and this is a problem with deleting only
> sidechain atoms in general--it seems that many, myself included, might
> totally miss that an apparent "alanine" is really a trunco-lysine.
> What I like is that it does get around the problem of people
> over-interpreting bogus sidechains, but it falls short, perhaps, in
> misleading people about what residue is there. I, for one, would not
> feel that I had to click on all the alanines in a model to verify that
> they were not lysines, and would be surprised and puzzled for a while
> about why this ala said lys when I clicked on it. Wouldn't you be
> surprised? (Well, maybe not after this thread...)
> 
> JPK
> 
> 
> 
> On Mon, Apr 4, 2011 at 1:55 AM, Dale Tronrud  
> wrote:
> >   The definition of _atom_site.occupancy is
> >
> >  The fraction of the atom type present at this site.
> >  The sum of the occupancies of all the atom types at this site
> >  may not significantly exceed 1.0 unless it is a dummy site.
> >
> > When an atom has an occupancy equal to zero that means that the
> > atom is NEVER present at that site - and that is not what you
> > intend to say.  Setting the occupancy to zero does not mean that
> > a full atom is located somewhere in this area.  Quite the opposite.
> >
> >   (The reference to a dummy site is interesting and implies to
> > me that mmCIF already has the mechanism you wish for.)
> >
> >   Having some experience with refining low occupancy atoms and
> > working with dummy marker atoms I'm quite confident that you can
> > never define a B factor cutoff that would work.  No matter what
> > value you choose you will find some atoms in density that refine
> > to values greater than the cutoff, or the limit you choose is so
> > high that you will find marker atoms that refine to less than the
> > limit.  A B factor cutoff cannot work - no matter the value you
> > choose you will always be plagued with false positives or false
> > negatives.
> >
> >   If you really want to stuff this bit into one of these fields
> > you have to go all out.  Set the occupancy of a marker atom to -99.99.
> > This will unambiguously mark the atom as an imaginary one.  This
> > will, of course, break every program that reads PDB format files,
> > but that is what should happen in any case.  If you change the
> > definition of the columns in the file you must mandate that all
> > programs be upgraded to recognize the new definitions.  I don't
> > know how you can do that other than ensuring that the change will
> > cause programs to cough.  To try to slide it by with a magic value
> > that will be silently accepted by existing programs is to beg for
> > bugs and subtle side-effects.
> >
> >   Good luck getting the maintainers of the mmCIF standard to accept
> > a magic value in either of these fields.
> >
> >   How about this: We already have the keywords ATOM and HETATM
> > (and don't ask me why we have two).  How about we create a new
> > record in the PDB format, say IMGATM, that would have all the
> > fields of an ATOM record but would be recognized as whatever the
> > marker is for "dummy" atoms in the current mmCIF?  Existing programs
> > would completely ignore these atoms, as they should until they are
> > modified to do something reasonable with them.  Those of us who
> > have no use for them can either use a switch in the program to
> > ignore them or just grep them out of the file.  Someone could write
> > a program that would take a model with only ATOM and HETATM records
> > and fill out all the desired IMGATM records (Let's call that program
> > WASNIAHC, everyone would remember that!).
> >
> >   This solution is unambiguous.  It can be represented in current
> > mmCIF, I think.  The PDB could run WASNIAHC themselves after deposition
> > but before acceptance by the depositor so people like me would not
> > have to deal with them during refinement but would be a

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Flip Hoedemaeker

It's nice to see that this discussion pops up every two years or so with 
exactly the same arguments :)


My vote (as always) is for leaving the atoms of disordered side chains 
in with high B values; the B values are part of the models. It's up to 
the popular Biologist's visualization software out there to properly 
display these models. I'm sure we can use all kinds of nice 4D blurry 
renderings of these disordered atoms nowadays.


Flip

On 4/4/2011 23:15, Jacob Keller wrote:

I like your IMGATM proposal, but wouldn't it also potentially break
some of the programs? Also--and this is a problem with deleting only
sidechain atoms in general--it seems that many, myself included, might
totally miss that an apparent "alanine" is really a trunco-lysine.
What I like is that it does get around the problem of people
over-interpreting bogus sidechains, but it falls short, perhaps, in
misleading people about what residue is there. I, for one, would not
feel that I had to click on all the alanines in a model to verify that
they were not lysines, and would be surprised and puzzled for a while
about why this ala said lys when I clicked on it. Wouldn't you be
surprised? (Well, maybe not after this thread...)

JPK



On Mon, Apr 4, 2011 at 1:55 AM, Dale Tronrud  wrote:

   The definition of _atom_site.occupancy is

  The fraction of the atom type present at this site.
  The sum of the occupancies of all the atom types at this site
  may not significantly exceed 1.0 unless it is a dummy site.

When an atom has an occupancy equal to zero that means that the
atom is NEVER present at that site - and that is not what you
intend to say.  Setting the occupancy to zero does not mean that
a full atom is located somewhere in this area.  Quite the opposite.

   (The reference to a dummy site is interesting and implies to
me that mmCIF already has the mechanism you wish for.)

   Having some experience with refining low occupancy atoms and
working with dummy marker atoms I'm quite confident that you can
never define a B factor cutoff that would work.  No matter what
value you choose you will find some atoms in density that refine
to values greater than the cutoff, or the limit you choose is so
high that you will find marker atoms that refine to less than the
limit.  A B factor cutoff cannot work - no matter the value you
choose you will always be plagued with false positives or false
negatives.

   If you really want to stuff this bit into one of these fields
you have to go all out.  Set the occupancy of a marker atom to -99.99.
This will unambiguously mark the atom as an imaginary one.  This
will, of course, break every program that reads PDB format files,
but that is what should happen in any case.  If you change the
definition of the columns in the file you must mandate that all
programs be upgraded to recognize the new definitions.  I don't
know how you can do that other than ensuring that the change will
cause programs to cough.  To try to slide it by with a magic value
that will be silently accepted by existing programs is to beg for
bugs and subtle side-effects.

   Good luck getting the maintainers of the mmCIF standard to accept
a magic value in either of these fields.

   How about this: We already have the keywords ATOM and HETATM
(and don't ask me why we have two).  How about we create a new
record in the PDB format, say IMGATM, that would have all the
fields of an ATOM record but would be recognized as whatever the
marker is for "dummy" atoms in the current mmCIF?  Existing programs
would completely ignore these atoms, as they should until they are
modified to do something reasonable with them.  Those of us who
have no use for them can either use a switch in the program to
ignore them or just grep them out of the file.  Someone could write
a program that would take a model with only ATOM and HETATM records
and fill out all the desired IMGATM records (Let's call that program
WASNIAHC, everyone would remember that!).

   This solution is unambiguous.  It can be represented in current
mmCIF, I think.  The PDB could run WASNIAHC themselves after deposition
but before acceptance by the depositor so people like me would not
have to deal with them during refinement but would be able to see
them before our precious works of art are unleashed on the world.

   Seems like a win-win solution to me.

Dale Tronrud


On 4/3/2011 9:17 PM, Jacob Keller wrote:


Well, what about getting the default settings on the major molecular
viewers to hide atoms with either occ=0 or b>cutoff ("novice mode?")?
While the b cutoff would still be tricky, I assume we could eventually
come to consensus on some reasonable cutoff (2 sigma from the mean?),
and then this approach would allow each free-spirited crystallographer
to keep his own preferred method of dealing with these troublesome
sidechains and nary a novice would be led astray

JPK

On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett wrote:


Most non-structural users are familiar with the

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Jacob Keller

I like your IMGATM proposal, but wouldn't it also potentially break
some of the programs? Also--and this is a problem with deleting only
sidechain atoms in general--it seems that many, myself included, might
totally miss that an apparent "alanine" is really a trunco-lysine.
What I like is that it does get around the problem of people
over-interpreting bogus sidechains, but it falls short, perhaps, in
misleading people about what residue is there. I, for one, would not
feel that I had to click on all the alanines in a model to verify that
they were not lysines, and would be surprised and puzzled for a while
about why this ala said lys when I clicked on it. Wouldn't you be
surprised? (Well, maybe not after this thread...)

JPK



On Mon, Apr 4, 2011 at 1:55 AM, Dale Tronrud  wrote:
>   The definition of _atom_site.occupancy is
>
>  The fraction of the atom type present at this site.
>  The sum of the occupancies of all the atom types at this site
>  may not significantly exceed 1.0 unless it is a dummy site.
>
> When an atom has an occupancy equal to zero that means that the
> atom is NEVER present at that site - and that is not what you
> intend to say.  Setting the occupancy to zero does not mean that
> a full atom is located somewhere in this area.  Quite the opposite.
>
>   (The reference to a dummy site is interesting and implies to
> me that mmCIF already has the mechanism you wish for.)
>
>   Having some experience with refining low occupancy atoms and
> working with dummy marker atoms I'm quite confident that you can
> never define a B factor cutoff that would work.  No matter what
> value you choose you will find some atoms in density that refine
> to values greater than the cutoff, or the limit you choose is so
> high that you will find marker atoms that refine to less than the
> limit.  A B factor cutoff cannot work - no matter the value you
> choose you will always be plagued with false positives or false
> negatives.
>
>   If you really want to stuff this bit into one of these fields
> you have to go all out.  Set the occupancy of a marker atom to -99.99.
> This will unambiguously mark the atom as an imaginary one.  This
> will, of course, break every program that reads PDB format files,
> but that is what should happen in any case.  If you change the
> definition of the columns in the file you must mandate that all
> programs be upgraded to recognize the new definitions.  I don't
> know how you can do that other than ensuring that the change will
> cause programs to cough.  To try to slide it by with a magic value
> that will be silently accepted by existing programs is to beg for
> bugs and subtle side-effects.
>
>   Good luck getting the maintainers of the mmCIF standard to accept
> a magic value in either of these fields.
>
>   How about this: We already have the keywords ATOM and HETATM
> (and don't ask me why we have two).  How about we create a new
> record in the PDB format, say IMGATM, that would have all the
> fields of an ATOM record but would be recognized as whatever the
> marker is for "dummy" atoms in the current mmCIF?  Existing programs
> would completely ignore these atoms, as they should until they are
> modified to do something reasonable with them.  Those of us who
> have no use for them can either use a switch in the program to
> ignore them or just grep them out of the file.  Someone could write
> a program that would take a model with only ATOM and HETATM records
> and fill out all the desired IMGATM records (Let's call that program
> WASNIAHC, everyone would remember that!).
>
>   This solution is unambiguous.  It can be represented in current
> mmCIF, I think.  The PDB could run WASNIAHC themselves after deposition
> but before acceptance by the depositor so people like me would not
> have to deal with them during refinement but would be able to see
> them before our precious works of art are unleashed on the world.
>
>   Seems like a win-win solution to me.
>
> Dale Tronrud
>
>
> On 4/3/2011 9:17 PM, Jacob Keller wrote:
>>
>> Well, what about getting the default settings on the major molecular
>> viewers to hide atoms with either occ=0 or b>cutoff ("novice mode?")?
>> While the b cutoff would still be tricky, I assume we could eventually
>> come to consensus on some reasonable cutoff (2 sigma from the mean?),
>> and then this approach would allow each free-spirited crystallographer
>> to keep his own preferred method of dealing with these troublesome
>> sidechains and nary a novice would be led astray
>>
>> JPK
>>
>> On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett  wrote:
>>>
>>> Most non-structural users are familiar with the sequence of the proteins
>>> they are studying, and most software does at least display residue identity
>>> if you select an atom in a residue, so usually it is not necessary to do any
>>> cross checking besides selecting an atom in the residue and seeing what its
>>> residue name is.  The chance of somebody misinterpreting a truncated Lys as
>>>

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Ed Pozharski

On Mon, 2011-04-04 at 09:38 -0500, Jacob Keller wrote:
> Could it be that they are not normal because of all of the outlier,
> huge-b-factor sidechains? 

That is part of it

> If every exposed sidechain without real
> density gets a b-factor of 150, wouldn't that make a sizeable and
> illegitimate non-normal population?

It will surely skew the standard deviation to higher values

>  I would actually be curious about
> normality of b-factors--is there such a study/figure out there
> somewhere, with, say, histograms of b-factors of many individual
> structures? 

Well, technically speaking they cannot possibly obey a normal distribution
since they are always positive.  Another reason is, of course, that
there are different atom types (e.g. side chains versus backbone) and
you have at best a multi-modal distribution.  In my experience, the
B-factor distributions always fail normality tests even when you break
atoms into groups (the most "normal" are, somewhat expectedly, waters, yet
they fail too).
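
A small sketch of the two points above, using invented B factors that are
NOT taken from any real structure. Sample skewness is used here only as a
simple indicator of asymmetry; it is not the normality test referred to in
the text.

```python
from statistics import fmean, pstdev


def skewness(values):
    """Population skewness: third central moment divided by sigma cubed."""
    mu = fmean(values)
    sigma = pstdev(values)
    return fmean([(v - mu) ** 3 for v in values]) / sigma ** 3


# Invented B factors (in A^2) for two atom groups; illustration only.
backbone = [15, 17, 18, 20, 21, 22, 24, 25]
side_chains = [30, 35, 45, 60, 90, 150]
b_factors = backbone + side_chains

skew = skewness(b_factors)
all_positive = min(b_factors) > 0
```

A normal distribution has zero skewness and extends to negative values;
this mixture is bounded below by zero and has a long right tail.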

Cheers,

Ed.

PS.  If you keep going at this - start a new thread

-- 
"I'd jump in myself, if I weren't so good at whistling."
   Julian, King of Lemurs

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Pavel Afonine

Hi Robbie,

> I updated my stripper program to remove all atoms with occ<0.00 instead of
> 0.00
>

- I used to do it in the past in phenix.model_vs_data but then I found it
too boring since it was silently swallowing the problem cases -:) so I
reverted it back to the "naive mode" when it takes what's in PDB as "God
given" and computes the stats using it. This in turn points out problem
cases, which is instructive, and which one can take care of on a second
walk-through.

- Also, you will be missing things like this:

http://www.rcsb.org/pdb/files/3otj.pdb
http://www.rcsb.org/pdb/files/3kcj.pdb
(.. and so on, I have a full list)
where, I guess, "D" with negative occupancy actually means H -:)

All the best!
Pavel.

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Robbie Joosten


Nice one Pavel. PDB_REDO actually runs on these files but it's not pretty. I 
updated my stripper program to remove all atoms with occ<0.00 instead of 0.00
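
A minimal sketch of such a filter, not the actual PDB_REDO code. It
assumes the intent is to drop atoms whose occupancy is zero or negative, and
it reads the occupancy from columns 55-60 of fixed-column ATOM/HETATM
records. The helper that builds test records and all atom values are made up.

```python
def make_atom_line(serial, name, occ, b):
    """Build a fixed-column ATOM record for a made-up glycine atom."""
    fmt = ("{:6s}{:5d} {:^4s}{:1s}{:3s} {:1s}{:4d}{:1s}   "
           "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}")
    return fmt.format("ATOM", serial, name, "", "GLY", "A", 1, "",
                      0.0, 0.0, 0.0, occ, b)


def occupancy(line):
    """Occupancy lives in columns 55-60 of an ATOM/HETATM record."""
    return float(line[54:60])


def strip_non_positive(lines):
    """Drop atom records whose occupancy is zero or negative."""
    kept = []
    for ln in lines:
        if ln[:6] in ("ATOM  ", "HETATM") and occupancy(ln) <= 0.0:
            continue
        kept.append(ln)
    return kept


model = [make_atom_line(1, "N", 1.00, 20.0),
         make_atom_line(2, "CA", 0.00, 20.0),
         make_atom_line(3, "C", -1.00, 20.0)]
kept = strip_non_positive(model)
```

Testing only for occupancy equal to zero would leave the third atom in
place.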
 
Cheers,
Robbie
 


Date: Mon, 4 Apr 2011 07:26:23 -0700
From: pafon...@gmail.com
Subject: Re: [ccp4bb] what to do with disordered side chains
To: CCP4BB@JISCMAIL.AC.UK

Hi Dale,



 Set the occupancy of a marker atom to -99.99.
This will unambiguously mark the atom as an imaginary one.  This
will, of course, break every program that reads PDB format files,


maybe not every -:)


phenix.model_vs_data works just fine with


http://www.rcsb.org/pdb/files/1BQU.pdb
http://www.rcsb.org/pdb/files/1azr.pdb


(Um... I guess I just created some work for PDB_REDO folks, sorry -:) )


All the best!
Pavel.

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Robbie Joosten

Dear James,

You make a very good point. So far we only discussed the option of removing 
all side chain atoms except for CB. What if only a few side chain atoms are 
outside the density? Should we just remove those? If we use the argument that 
we should remove the atoms we cannot see, then surely we should keep the ones 
we can see. 
A problem is that if someone else recalculates the maps and inspects them 
(which is the point of the EDS), some atoms may very well fall inside or 
outside the density differently than in the original crystallographic study: 
software changes, other reflections are included (remember the recent I/sigI 
discussion), and last (but not least) when atoms are removed the solvent mask 
changes. 

Anyway, I don't think that the side chain discussion will be solved in this 
thread. PDB users are not all the same and treat the options that are proposed 
differently. They are all used in the PDB and that complicates matters for the 
users. In PDB_REDO (plug, plug ;) we build all missing side chains (and rebuild 
the zero-occupancy ones) and let the B-factor sort it out. Not everyone will 
agree with this, but at least it is consistent. If anyone studies a single 
structure properly, he should use the density and there is no problem. For 
statistics studies the first thing people do is filter by resolution and 
B-factor (and sequence identity) so the really bad side chains are removed from 
the testset anyway.

Cheers,
Robbie

> Date: Sun, 3 Apr 2011 23:45:10 -0700
> From: jmhol...@lbl.gov
> Subject: Re: [ccp4bb] what to do with disordered side chains
> To: CCP4BB@JISCMAIL.AC.UK
> 
> At the risk of throwing a little gasoline on the flame war, what about 
> side chains that will ALWAYS poke outside of the electron density? For 
> example, pretty much any terminal aliphatic at 3.5 A resolution? I 
> first learned this about 15 years ago when I made this movie:
> 
> http://bl831.als.lbl.gov/~jamesh/movies/resolution.mpeg
> 
> For those of you whose browser no longer supports MPEG-1, this is a 
> movie of a calculated (aka noise-free) electron density map, contoured 
> at "1 sigma", but cut to the resolution shown after applying an overall 
> B factor sufficient to suppress series-termination. By that I mean the 
> maps don't look all that different with or without the cutoff. The 
> coordinates shown are the "correct" model used to calculate the map. At 
> about 3.5 A you start to see side chains poking out of the density, and 
> at 6 A, all the side chains are "gone". Does this mean they should be 
> modeled with zero occupancy? ;)
> 
> -James Holton
> MAD Scientist
> 
> 
> On 4/3/2011 9:57 PM, Maia Cherney wrote:
> > I guess, most hydrophilic side chains on the surface are flexible, 
> > they don't keep the same conformation. If you cut those side chains 
> > off, the surface will be looking pretty hydrophobic and misleading 
> > (and very horrible). I prefer to see them intact. I know, most of them 
> > are flexible and don't have one exact position, but it's OK. I know 
> > they are there not far from the main chain. Usually, their exact 
> > position is irrelevant.
> >
> > Maia
> >
> >
> >
> > Jacob Keller wrote:
> >> Well, what about getting the default settings on the major molecular
> >> viewers to hide atoms with either occ=0 or b>cutoff ("novice mode?")?
> >> While the b cutoff would still be tricky, I assume we could eventually
> >> come to consensus on some reasonable cutoff (2 sigma from the mean?),
> >> and then this approach would allow each free-spirited crystallographer
> >> to keep his own preferred method of dealing with these troublesome
> >> sidechains and nary a novice would be led astray
> >>
> >> JPK
> >>
> >> On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett  wrote:
> >>> Most non-structural users are familiar with the sequence of the 
> >>> proteins they are studying, and most software does at least display 
> >>> residue identity if you select an atom in a residue, so usually it 
> >>> is not necessary to do any cross checking besides selecting an atom 
> >>> in the residue and seeing what its residue name is. The chance of 
> >>> somebody misinterpreting a truncated Lys as Ala is, in my 
> >>> experience, much much lower than the chance they will trust the xyz 
> >>> coordinates of atoms with zero occupancy or high B factors.
> >>>
> >>> What worries me the most is somebody designing a whole biological 
> >>> experiment around an over-interpretation of details that are implied 
> >>> by xyz c

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Jacob Keller

> Not likely - the distribution of ADPs is not normal, so you can't easily
> convert Z-scores to probabilities.

Could it be that they are not normal because of all of the outlier,
huge-b-factor sidechains? If every exposed sidechain without real
density gets a b-factor of 150, wouldn't that make a sizeable and
illegitimate non-normal population? I would actually be curious about
normality of b-factors--is there such a study/figure out there
somewhere, with, say, histograms of b-factors of many individual
structures?




-- 
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Pavel Afonine

Hi Dale,

> Set the occupancy of a marker atom to -99.99.
> This will unambiguously mark the atom as an imaginary one.  This
> will, of course, break every program that reads PDB format files,


maybe not every -:)

phenix.model_vs_data works just fine with

http://www.rcsb.org/pdb/files/1BQU.pdb
http://www.rcsb.org/pdb/files/1azr.pdb

(Um... I guess I just created some work for PDB_REDO folks, sorry -:) )

All the best!
Pavel.

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Ed Pozharski

On Sun, 2011-04-03 at 23:17 -0500, Jacob Keller wrote:
> While the b cutoff would still be tricky, I assume we could eventually
> come to consensus on some reasonable cutoff (2 sigma from the mean?) 

Not likely - the distribution of ADPs is not normal, so you can't easily
convert Z-scores to probabilities.  
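
To illustrate why a "2 sigma" rule does not carry over, the sketch below
compares the fraction of a distribution lying above mean + 2 sigma for a
normal distribution and for a lognormal one. The lognormal is chosen purely
as an example of a positive, skewed distribution; nothing in this thread says
that ADPs follow it, and the parameters are arbitrary.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Normal case: fraction above mean + 2 sigma.
normal_tail = 1.0 - NormalDist().cdf(2.0)

# Lognormal case with arbitrary log-scale parameters (an assumption).
mu, s = 0.0, 1.0
mean = exp(mu + s ** 2 / 2)
sd = sqrt((exp(s ** 2) - 1.0) * exp(2 * mu + s ** 2))
threshold = mean + 2.0 * sd

# X > threshold is equivalent to log(X) > log(threshold), and log(X) is normal.
lognormal_tail = 1.0 - NormalDist(mu, s).cdf(log(threshold))
```

The two tail fractions differ, so the same Z-score cutoff flags a different
share of atoms depending on the shape of the distribution.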

-- 
"I'd jump in myself, if I weren't so good at whistling."
   Julian, King of Lemurs

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Dale Tronrud


   The definition of _atom_site.occupancy is

 The fraction of the atom type present at this site.
  The sum of the occupancies of all the atom types at this site
  may not significantly exceed 1.0 unless it is a dummy site.

When an atom has an occupancy equal to zero that means that the
atom is NEVER present at that site - and that is not what you
intend to say.  Setting the occupancy to zero does not mean that
a full atom is located somewhere in this area.  Quite the opposite.

   (The reference to a dummy site is interesting and implies to
me that mmCIF already has the mechanism you wish for.)

   Having some experience with refining low occupancy atoms and
working with dummy marker atoms I'm quite confident that you can
never define a B factor cutoff that would work.  No matter what
value you choose you will find some atoms in density that refine
to values greater than the cutoff, or the limit you choose is so
high that you will find marker atoms that refine to less than the
limit.  A B factor cutoff cannot work - no matter the value you
choose you will always be plagued with false positives or false
negatives.
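
The trade-off can be sketched with a handful of invented (B factor,
is_dummy) pairs whose ranges overlap. The numbers are made up to show the
mechanism and do not come from any refinement.

```python
def cutoff_errors(atoms, cutoff):
    """Count misclassifications when 'B > cutoff' is read as 'dummy atom'."""
    false_pos = sum(1 for b, is_dummy in atoms if b > cutoff and not is_dummy)
    false_neg = sum(1 for b, is_dummy in atoms if b <= cutoff and is_dummy)
    return false_pos, false_neg


# Invented data: real atoms in density and marker (dummy) atoms.
atoms = [(20, False), (45, False), (80, False), (110, False),
         (60, True), (95, True), (130, True)]

# (false positives, false negatives) for a range of candidate cutoffs.
results = {c: cutoff_errors(atoms, c) for c in (50, 75, 100, 120)}
```

Because the two populations overlap, raising the cutoff trades false
positives for false negatives, and no value in this example removes both.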

   If you really want to stuff this bit into one of these fields
you have to go all out.  Set the occupancy of a marker atom to -99.99.
This will unambiguously mark the atom as an imaginary one.  This
will, of course, break every program that reads PDB format files,
but that is what should happen in any case.  If you change the
definition of the columns in the file you must mandate that all
programs be upgraded to recognize the new definitions.  I don't
know how you can do that other than ensuring that the change will
cause programs to cough.  To try to slide it by with a magic value
that will be silently accepted by existing programs is to beg for
bugs and subtle side-effects.

   Good luck getting the maintainers of the mmCIF standard to accept
a magic value in either of these fields.

   How about this: We already have the keywords ATOM and HETATM
(and don't ask me why we have two).  How about we create a new
record in the PDB format, say IMGATM, that would have all the
fields of an ATOM record but would be recognized as whatever the
marker is for "dummy" atoms in the current mmCIF?  Existing programs
would completely ignore these atoms, as they should until they are
modified to do something reasonable with them.  Those of us who
have no use for them can either use a switch in the program to
ignore them or just grep them out of the file.  Someone could write
a program that would take a model with only ATOM and HETATM records
and fill out all the desired IMGATM records (Let's call that program
WASNIAHC, everyone would remember that!).

   This solution is unambiguous.  It can be represented in current
mmCIF, I think.  The PDB could run WASNIAHC themselves after deposition
but before acceptance by the depositor so people like me would not
have to deal with them during refinement but would be able to see
them before our precious works of art are unleashed on the world.

   Seems like a win-win solution to me.

Dale Tronrud


On 4/3/2011 9:17 PM, Jacob Keller wrote:

Well, what about getting the default settings on the major molecular
viewers to hide atoms with either occ=0 or b>cutoff ("novice mode?")?
While the b cutoff would still be tricky, I assume we could eventually
come to consensus on some reasonable cutoff (2 sigma from the mean?),
and then this approach would allow each free-spirited crystallographer
to keep his own preferred method of dealing with these troublesome
sidechains and nary a novice would be led astray

JPK

On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett  wrote:

Most non-structural users are familiar with the sequence of the proteins they 
are studying, and most software does at least display residue identity if you 
select an atom in a residue, so usually it is not necessary to do any cross 
checking besides selecting an atom in the residue and seeing what its residue 
name is.  The chance of somebody misinterpreting a truncated Lys as Ala is, in 
my experience, much much lower than the chance they will trust the xyz 
coordinates of atoms with zero occupancy or high B factors.

What worries me the most is somebody designing a whole biological experiment around an 
over-interpretation of details that are implied by xyz coordinates of atoms, even if 
those atoms were not resolved in the maps.  When this sort of error occurs it is a level 
of pain and wasted effort that makes the "pain" associated with having to build 
back in missing side chains look completely trivial.

As long as the PDB file format is the way users get structural data, there is really no 
good way to communicate "atom exists with no reliable coordinates" to the user, 
given the diversity of software packages out there for reading PDB files and the 
historical lack of any standard way of dealing with this issue.  Even if the file format 
is hacked there is no way to force all the existing software out

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread James Holton

At the risk of throwing a little gasoline on the flame war, what about 
side chains that will ALWAYS poke outside of the electron density?  For 
example, pretty much any terminal aliphatic at 3.5 A resolution?  I 
first learned this about 15 years ago when I made this movie:


http://bl831.als.lbl.gov/~jamesh/movies/resolution.mpeg

For those of you whose browser no longer supports MPEG-1, this is a 
movie of a calculated (aka noise-free) electron density map, contoured 
at "1 sigma", but cut to the resolution shown after applying an overall 
B factor sufficient to suppress series-termination.  By that I mean the 
maps don't look all that different with or without the cutoff.  The 
coordinates shown are the "correct" model used to calculate the map.  At 
about 3.5 A you start to see side chains poking out of the density, and 
at 6 A, all the side chains are "gone".  Does this mean they should be 
modeled with zero occupancy?  ;)


-James Holton
MAD Scientist


On 4/3/2011 9:57 PM, Maia Cherney wrote:
I guess, most hydrophilic side chains on the surface are flexible, 
they don't keep the same conformation. If you cut those side chains 
off, the surface will be looking pretty hydrophobic and misleading 
(and very horrible). I prefer to see them intact. I know, most of them 
are flexible and don't have one exact position, but it's OK. I know 
they are there not far from the main chain. Usually, their exact 
position is irrelevant.


Maia



Jacob Keller wrote:

Well, what about getting the default settings on the major molecular
viewers to hide atoms with either occ=0 or b>cutoff ("novice mode?")?
While the b cutoff would still be tricky, I assume we could eventually
come to consensus on some reasonable cutoff (2 sigma from the mean?),
and then this approach would allow each free-spirited crystallographer
to keep his own preferred method of dealing with these troublesome
sidechains and nary a novice would be led astray

JPK

On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett  wrote:
Most non-structural users are familiar with the sequence of the 
proteins they are studying, and most software does at least display 
residue identity if you select an atom in a residue, so usually it 
is not necessary to do any cross checking besides selecting an atom 
in the residue and seeing what its residue name is.  The chance of 
somebody misinterpreting a truncated Lys as Ala is, in my 
experience, much much lower than the chance they will trust the xyz 
coordinates of atoms with zero occupancy or high B factors.


What worries me the most is somebody designing a whole biological 
experiment around an over-interpretation of details that are implied 
by xyz coordinates of atoms, even if those atoms were not resolved 
in the maps.  When this sort of error occurs it is a level of pain 
and wasted effort that makes the "pain" associated with having to 
build back in missing side chains look completely trivial.


As long as the PDB file format is the way users get structural data, 
there is really no good way to communicate "atom exists with no 
reliable coordinates" to the user, given the diversity of software 
packages out there for reading PDB files and the historical lack of 
any standard way of dealing with this issue.  Even if the file 
format is hacked there is no way to force all the existing software 
out there to understand the hack.  A file format that isn't designed 
with this sort of feature from day one is not going to be fixable as 
a practical matter after so much legacy code has accumulated.


-Eric



On Apr 3, 2011, at 2:20 PM, Jacob Keller wrote:


To the delete-the-atom-nik's: do you propose deleting the whole
residue or just the side chain? I can understand deleting the whole
residue, but deleting only the side chain seems to me to be placing a
stumbling block also, and even possibly confusing for an experienced
crystallographer: the .pdb says "lys" but it looks like an ala? Which
is it? I could imagine a lot of frustration-hours arising from this
practice, with people cross-checking sequences, looking in the methods
sections for mutations...

JPK

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Maia Cherney

I guess, most hydrophilic side chains on the surface are flexible, they 
don't keep the same conformation. If you cut those side chains off, the 
surface will be looking pretty hydrophobic and misleading (and very 
horrible). I prefer to see them intact. I know, most of them are 
flexible and don't have one exact position, but it's OK. I know they are 
there not far from the main chain. Usually, their exact position is 
irrelevant.


Maia



Jacob Keller wrote:

Well, what about getting the default settings on the major molecular
viewers to hide atoms with either occ=0 or b>cutoff ("novice mode?")?
While the b cutoff would still be tricky, I assume we could eventually
come to consensus on some reasonable cutoff (2 sigma from the mean?),
and then this approach would allow each free-spirited crystallographer
to keep his own preferred method of dealing with these troublesome
sidechains and nary a novice would be led astray

JPK

On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett  wrote:
  

Most non-structural users are familiar with the sequence of the proteins they 
are studying, and most software does at least display residue identity if you 
select an atom in a residue, so usually it is not necessary to do any cross 
checking besides selecting an atom in the residue and seeing what its residue 
name is.  The chance of somebody misinterpreting a truncated Lys as Ala is, in 
my experience, much much lower than the chance they will trust the xyz 
coordinates of atoms with zero occupancy or high B factors.

What worries me the most is somebody designing a whole biological experiment around an 
over-interpretation of details that are implied by xyz coordinates of atoms, even if 
those atoms were not resolved in the maps.  When this sort of error occurs it is a level 
of pain and wasted effort that makes the "pain" associated with having to build 
back in missing side chains look completely trivial.

As long as the PDB file format is the way users get structural data, there is really no 
good way to communicate "atom exists with no reliable coordinates" to the user, 
given the diversity of software packages out there for reading PDB files and the 
historical lack of any standard way of dealing with this issue.  Even if the file format 
is hacked there is no way to force all the existing software out there to understand the 
hack.  A file format that isn't designed with this sort of feature from day one is not 
going to be fixable as a practical matter after so much legacy code has accumulated.

-Eric



On Apr 3, 2011, at 2:20 PM, Jacob Keller wrote:



To the delete-the-atom-nik's: do you propose deleting the whole
residue or just the side chain? I can understand deleting the whole
residue, but deleting only the side chain seems to me to be placing a
stumbling block also, and even possibly confusing for an experienced
crystallographer: the .pdb says "lys" but it looks like an ala? Which
is it? I could imagine a lot of frustration-hours arising from this
practice, with people cross-checking sequences, looking in the methods
sections for mutations...

JPK

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Jacob Keller

Well, what about getting the default settings on the major molecular
viewers to hide atoms with either occ=0 or b>cutoff ("novice mode?")?
While the b cutoff would still be tricky, I assume we could eventually
come to consensus on some reasonable cutoff (2 sigma from the mean?),
and then this approach would allow each free-spirited crystallographer
to keep his own preferred method of dealing with these troublesome
sidechains and nary a novice would be led astray
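
As a rough sketch of such a "novice mode" filter in Python: the function name, the tuple layout and the toy values are all hypothetical, and "2 sigma from the mean" is taken here to mean the mean B plus two population standard deviations over the atoms passed in.

```python
from statistics import mean, pstdev


def hidden_atoms(atoms, n_sigma=2.0):
    """Return names of atoms a viewer would hide by default.

    atoms is a list of (name, occupancy, b_factor) tuples.
    """
    b_values = [b for _, _, b in atoms]
    cutoff = mean(b_values) + n_sigma * pstdev(b_values)
    return [name for name, occ, b in atoms if occ == 0.0 or b > cutoff]


# Toy values for illustration only.
atoms = [
    ("N", 1.0, 20.0), ("CA", 1.0, 22.0), ("C", 1.0, 25.0), ("O", 1.0, 21.0),
    ("CB", 1.0, 23.0), ("CG", 1.0, 24.0), ("CD", 0.0, 26.0), ("NZ", 1.0, 95.0),
]

print(hidden_atoms(atoms))
```

Note that the cutoff depends on which atoms are included in the statistics, which is one reason a consensus value may be hard to pin down.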

JPK

On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett  wrote:
> Most non-structural users are familiar with the sequence of the proteins they 
> are studying, and most software does at least display residue identity if you 
> select an atom in a residue, so usually it is not necessary to do any cross 
> checking besides selecting an atom in the residue and seeing what its residue 
> name is.  The chance of somebody misinterpreting a truncated Lys as Ala is, 
> in my experience, much much lower than the chance they will trust the xyz 
> coordinates of atoms with zero occupancy or high B factors.
>
> What worries me the most is somebody designing a whole biological experiment 
> around an over-interpretation of details that are implied by xyz coordinates 
> of atoms, even if those atoms were not resolved in the maps.  When this sort 
> of error occurs it is a level of pain and wasted effort that makes the "pain" 
> associated with having to build back in missing side chains look completely 
> trivial.
>
> As long as the PDB file format is the way users get structural data, there is 
> really no good way to communicate "atom exists with no reliable coordinates" 
> to the user, given the diversity of software packages out there for reading 
> PDB files and the historical lack of any standard way of dealing with this 
> issue.  Even if the file format is hacked there is no way to force all the 
> existing software out there to understand the hack.  A file format that isn't 
> designed with this sort of feature from day one is not going to be fixable as 
> a practical matter after so much legacy code has accumulated.
>
> -Eric
>
>
>
> On Apr 3, 2011, at 2:20 PM, Jacob Keller wrote:
>
>> To the delete-the-atom-nik's: do you propose deleting the whole
>> residue or just the side chain? I can understand deleting the whole
>> residue, but deleting only the side chain seems to me to be placing a
>> stumbling block also, and even possibly confusing for an experienced
>> crystallographer: the .pdb says "lys" but it looks like an ala? Which
>> is it? I could imagine a lot of frustration-hours arising from this
>> practice, with people cross-checking sequences, looking in the methods
>> sections for mutations...
>>
>> JPK
>>
>



-- 
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Eric Bennett

Most non-structural users are familiar with the sequence of the proteins they 
are studying, and most software does at least display residue identity if you 
select an atom in a residue, so usually it is not necessary to do any cross 
checking besides selecting an atom in the residue and seeing what its residue 
name is.  The chance of somebody misinterpreting a truncated Lys as Ala is, in 
my experience, much much lower than the chance they will trust the xyz 
coordinates of atoms with zero occupancy or high B factors.

What worries me the most is somebody designing a whole biological experiment 
around an over-interpretation of details that are implied by xyz coordinates of 
atoms, even if those atoms were not resolved in the maps.  When this sort of 
error occurs it is a level of pain and wasted effort that makes the "pain" 
associated with having to build back in missing side chains look completely 
trivial.

As long as the PDB file format is the way users get structural data, there is 
really no good way to communicate "atom exists with no reliable coordinates" to 
the user, given the diversity of software packages out there for reading PDB 
files and the historical lack of any standard way of dealing with this issue.  
Even if the file format is hacked there is no way to force all the existing 
software out there to understand the hack.  A file format that isn't designed 
with this sort of feature from day one is not going to be fixable as a 
practical matter after so much legacy code has accumulated.

-Eric

On Apr 3, 2011, at 2:20 PM, Jacob Keller wrote:

> To the delete-the-atom-nik's: do you propose deleting the whole
> residue or just the side chain? I can understand deleting the whole
> residue, but deleting only the side chain seems to me to be placing a
> stumbling block also, and even possibly confusing for an experienced
> crystallographer: the .pdb says "lys" but it looks like an ala? Which
> is it? I could imagine a lot of frustration-hours arising from this
> practice, with people cross-checking sequences, looking in the methods
> sections for mutations...
> 
> JPK
>

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Bernhard Rupp (Hofkristallrat a.D.)

I vote for the electron density irrespective of side chains, main chains,
ligands, dark matter. The PDB is a collection of experimentally determined
structures per its own definition. If density supports it high B is fine -
B-factor simply is a parameter of a probability distribution. If you extend
that to no density - it becomes problematic. After all, coordinates imply
that an atom is actually at some specified place with a certain probability.
We may know that the atom necessarily has to be someplace unless it got chewed
off for some reason. The experiment just tells you that you do not know
where the atom is. 

Or in Rumsfeldic: Better a known unknown than an unknown known.

Cheers, BR

-Original Message-
From: Boaz Shaanan [mailto:bshaa...@exchange.bgu.ac.il] 
Sent: Sunday, April 03, 2011 11:02 AM
To: hofkristall...@gmail.com; CCP4BB@JISCMAIL.AC.UK
Subject: RE: [ccp4bb] what to do with disordered side chains

The original posting that started this thread referred to side-chains, as
the subject still suggests. Do you propose to omit only side-chain atoms, in
which case you end up with different residues, as pointed out by quite a few
people, or do you suggest also to omit the main-chain atoms of the
problematic residues?

Besides, as mentioned by Phoebe and others, many users
(non-crystallographers) of PDBs already know the meaning of the B-factor
and will know how to interpret a very high B. It is our task (the
crystallographers) to enlighten those who don't know what the B column in a
PDB entry stands for. I certainly do and I'm sure many of us do so too. I
voted for high B and would vote for it again, if asked.

Cheers,

   Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710




From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp
(Hofkristallrat a.D.) [hofkristall...@gmail.com]
Sent: Sunday, April 03, 2011 7:42 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

Thus my feeling is that if one does NOT see the coords in the electron
density, they should NOT be included, and let someone else try to model them
in, but they should be aware that they are modeling them.
Joel L. Sussman

Concur.  BMC p 680 'How to handle missing parts'

Best wishes, BR

On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:

Doing something sensible in the major software packages, both for graphics
and for other analysis of the structure, could solve the problem for most
users.

But nobody knows what other software is out there being used by individuals
or small groups.  And the more remote the authors of that software are from
protein structure solution the more likely it is that they have not/will not
properly handle atoms with zero occupancy or high B values, for example.

I am absolutely positive that there is software that does its voodoo on
ATOM/HETATM records and pays absolutely no attention to anything beyond the
x, y, z coordinates (i.e. beyond column 54).
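
A minimal Python sketch of what such software effectively does. The helper names are hypothetical; the slices follow the fixed-width ATOM layout (x, y, z in columns 31-54, occupancy in 55-60, B factor in 61-66). Two records that differ only beyond column 54 are indistinguishable to a coordinate-only reader.

```python
def atom_line(name, x, y, z, occ, b, record="ATOM", serial=1):
    # Build a fixed-width record so the column positions are exact.
    return (f"{record:<6}{serial:>5} {name:<4} LYS A  10    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}")


def xyz_only(line):
    """Coordinate-only reader: nothing past column 54 is ever examined."""
    return float(line[30:38]), float(line[38:46]), float(line[46:54])


def occupancy_and_b(line):
    """What a more careful reader would additionally look at."""
    return float(line[54:60]), float(line[60:66])


observed = atom_line(" NZ ", 11.104, 6.134, -6.504, 1.00, 20.00)
placeholder = atom_line(" NZ ", 11.104, 6.134, -6.504, 0.00, 500.00)

print(xyz_only(observed) == xyz_only(placeholder))  # identical coordinates
print(occupancy_and_b(observed), occupancy_and_b(placeholder))
```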

   Frances Bernstein

=
Bernstein + Sons
*   *   Information Systems Consultants
5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
 *Frances C. Bernstein
 *   ***
f...@bernstein-plus-sons.com
*** *
 *   *** 1-631-286-1339FAX: 1-631-286-1999
=

On Sat, 2 Apr 2011, Jacob Keller wrote:

I guess I missed it in the flurry of replies to this thread over the last
few days, but what exactly is so terrible about keeping the atoms (since you
have chemical evidence from protein sequence that they are there, and even
if there is X-ray damage they were originally there and are likely still
there in a subset of the molecules), but changing occupancy to zero as an
acknowledgment that your data does not provide evidence to support a
specific atomic position for these atoms?

Some users might pull up the structure, see those atoms, and think their
positions were based on data, which they were not, and then draw conclusions
based on them. I agree that occ=0 is tantamount to the suggestion you
queried, however.

A somewhat key question might be: across the various molecular visualization
programs, what is the default way to handle atoms with occ=0? Perhaps those
programs might be the best place to fix the problem...

JPK


***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Dale Tronrud


   Clearly there are strong feelings held by the advocates of the
several solutions to the problem of what to do about atoms that
cannot be reliably placed based on the electron density map.  I
certainly understand since I passionately support my own favorite
solution.

   Why is it that a community of generally reasonable people keeps
coming back to this same issue and yet fails to find a solution
that can reach some kind of consensus?

   My 2 cents on this, more fundamental, issue:

   A model created by someone who believes that all atoms (for a
residue with any atoms) must be built will contain two kinds of
atoms.  Those placed based on the appearance of the electron
density and those placed in some convenient location simply to
fill out the atom count.  I think most everyone agrees that a
full residue is a convenience for some users of our models.  What
those of us who favor partial models want is an absolutely clear
distinction between the two classes of atoms.

   All this needs is a bit.  Literally, one bit of data that flags
those atoms added to the model simply to complete the set.

   Why can't we come to a solution that satisfies?  Because we
continue to use a non-extensible file format that does not allow
us a place to put this bit.

   Some people want to put the bit in the occupancy column by
defining a special value (occ=0) that would be the flag.  Some
people want to put it in the B factor column by defining a special
value there (a couple possibilities here, B=1000.00, B=500.00,
B varying but larger than that of any atom built into density).

   The B factor and occupancy columns in the PDB file have been
precisely defined back when the mmCIF dictionary was created and
to change their definitions now would require opening that process
again.  I am pretty sure that the committee in charge will never allow
a definition for these items that includes the phrase "... except when
the value is equal to ...".  You can't run a database that way.

   Each piece of information has to have its own tag and definition.
Once it is defined we can embrace the task of educating software
developers and our collaborators who use our models in its meaning.

   There is just no place to put this bit in a PDB format file.
mmCIF - it's trivial.  PDB format - no way.  As long as we insist that
this format is the preferred means of distributing our models we
will continue to return to this argument again and again with no
possibility of coming to a solution.

Dale Tronrud

P.S. I've even thought about using the model of the "REMARK" statement,
where all sorts of information have been added by the hack of
"standardized remarks".  I thought that one could create a
"standardized footnote" that would mark the atoms as "imaginary".
I found that, unfortunately, footnotes were removed from the PDB
format many years ago.



On 4/3/2011 11:01 AM, Boaz Shaanan wrote:

The original posting that started this thread referred to side-chains, as the 
subject still suggests. Do you propose to omit only side-chain atoms, in which 
case you end up with different residues, as pointed out by quite a few 
people, or do you suggest also to omit the main-chain atoms of the problematic 
residues?

Besides, as mentioned by Phoebe and others, many users (non-crystallographers) 
of PDBs already know the meaning of the B-factor and will know how to 
interpret a very high B. It is our task (the crystallographers) to enlighten 
those who don't know what the B column in a PDB entry stands for. I certainly 
do and I'm sure many of us do so too. I voted for high B and would vote for it 
again, if asked.

 Cheers,

Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710




From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp 
(Hofkristallrat a.D.) [hofkristall...@gmail.com]
Sent: Sunday, April 03, 2011 7:42 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

Thus my feeling is that if one does NOT see the coords in the electron
density, they should NOT be included, and let someone else try to model
them in, but they should be aware that they are modeling them.
Joel L. Sussman

Concur.  BMC p 680 ‘How to handle missing parts’

Best wishes, BR

On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:

Doing something sensible in the major software packages, both
for graphics and for other analysis of the structure, could
solve the problem for most users.

But nobody knows what other software is out there being used by
individuals or small groups.  And the more remote the authors
of that software are from protein structure solution the more
likely it is that they have not/will not properly handle atoms
with ze

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Tommi Kajander


Hi,
it is quite possible to truncate, say, a Lys residue, isn't it? So why
not do this? It doesn't change the identity of the residue but draws
attention precisely to the fact that atoms are missing due to lack of
density.

And if you click on an atom in PyMOL, at least I don't see the B-factor
displayed anywhere. I would suspect it's the same with other molecular
graphics software to a fair extent, or it's in some small print
somewhere. Also, can you actually tell just by looking at the B-factor
whether there is any density or not? If the Wilson B is high, I suspect
you can see density and the B-factor will be high, whereas if the Wilson
B is low the same B-factor will probably mean you don't see density at
the same sigma cutoff/contour level. I may be wrong, but I suspect this
is the case, which is why I think it's probably better to truncate them
(not to Ala/Gly, but to truncate) if you don't see density for the side
chain at all. Or model the 5 most likely conformers then. Or 4? 6? 3?

Well, this can go on forever (or rather, hopefully NOT), but really I
don't think this is quite so simple when it comes to B-factors and later
analysis, in particular if that analysis is automated and deals with a
larger set of coordinate files. Is it really a good idea to leave an
active-site residue side chain with high B (= no density whatsoever) in,
in _one_ defined conformation? I am not so dead certain...

cheers,
Tommi


On Apr 3, 2011, at 9:01 PM, Boaz Shaanan wrote:

The original posting that started this thread referred to side-chains,
as the subject still suggests. Do you propose to omit only side-chain
atoms, in which case you end up with different residues, as pointed out
by quite a few people, or do you suggest also to omit the main-chain
atoms of the problematic residues?

Besides, as mentioned by Phoebe and others, many users
(non-crystallographers) of PDBs already know the meaning of the
B-factor and will know how to interpret a very high B. It is our task
(the crystallographers) to enlighten those who don't know what the B
column in a PDB entry stands for. I certainly do and I'm sure many of
us do so too. I voted for high B and would vote for it again, if asked.


   Cheers,

  Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710




From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of
Bernhard Rupp (Hofkristallrat a.D.) [hofkristall...@gmail.com]
Sent: Sunday, April 03, 2011 7:42 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

Thus my feeling is that if one does NOT see the coords in the electron
density, they should NOT be included, and let someone else try to model
them in, but they should be aware that they are modeling them.
Joel L. Sussman

Concur.  BMC p 680 ‘How to handle missing parts’

Best wishes, BR

On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:

Doing something sensible in the major software packages, both
for graphics and for other analysis of the structure, could
solve the problem for most users.

But nobody knows what other software is out there being used by
individuals or small groups.  And the more remote the authors
of that software are from protein structure solution the more
likely it is that they have not/will not properly handle atoms
with zero occupancy or high B values, for example.

I am absolutely positive that there is software that does its
voodoo on ATOM/HETATM records and pays absolutely no attention
to anything beyond the x, y, z coordinates (i.e. beyond column 54).

  Frances Bernstein

=
Bernstein + Sons
*   *   Information Systems Consultants
5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
 *Frances C. Bernstein
*   ***  f...@bernstein-plus-sons.com
*** *
*   *** 1-631-286-1339FAX: 1-631-286-1999
=

On Sat, 2 Apr 2011, Jacob Keller wrote:

I guess I missed it in the flurry of replies to this thread over the
last few days, but what exactly is so terrible about keeping the atoms
(since you have chemical evidence from protein sequence that they are
there, and even if there is X-ray damage they were originally there and
are likely still there in a subset of the molecules), but changing
occupancy to zero as an acknowledgment that your data does not provide
evidence to support a specific atomic position for these atoms?

Some users might pull up the structure, see those atoms, and think
their positions were based on data, which they were not, and then draw
conclusions based on them. I agree that o

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Ethan Merritt

On Sunday, April 03, 2011, Jacob Keller wrote:
> To the delete-the-atom-nik's: do you propose deleting the whole
> residue or just the side chain? 

Omit the atoms beyond CB for which there is no apparent density.
Always place CB if the backbone trace is reasonable, because its
location is fixed a priori by known stereochemistry.
As a practical matter, I use Coot's "stub" command.
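
For readers without Coot at hand, the effect on the coordinate file can be sketched in Python as below. This is not Coot's implementation, only an illustration under simple assumptions: the function name is hypothetical, it works on ATOM records only, and it drops every atom beyond CB rather than checking density atom by atom. The residue name is left untouched, so the file still says LYS.

```python
KEEP = {"N", "CA", "C", "O", "CB"}  # backbone plus CB


def stub_residue(lines, chain, res_seq):
    """Drop side-chain atoms beyond CB for one residue; keep everything else."""
    kept = []
    for ln in lines:
        if ln.startswith("ATOM"):
            in_residue = ln[21] == chain and int(ln[22:26]) == res_seq
            if in_residue and ln[12:16].strip() not in KEEP:
                continue
        kept.append(ln)
    return kept


# Two toy lysines (record name through residue number only, for brevity).
names = ["N", "CA", "C", "O", "CB", "CG", "CD", "CE", "NZ"]
model = [f"ATOM  {i:>5}  {n:<3} LYS A  10" for i, n in enumerate(names, 1)]
model += [f"ATOM  {i:>5}  {n:<3} LYS A  11" for i, n in enumerate(names, 10)]

stubbed = stub_residue(model, "A", 10)
print([ln[12:16].strip() for ln in stubbed if int(ln[22:26]) == 10])
```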

Ethan



> I can understand deleting the whole
> residue, but deleting only the side chain seems to me to be placing a
> stumbling block also, and even possibly confusing for an experienced
> crystallographer: the .pdb says "lys" but it looks like an ala? Which
> is it? I could imagine a lot of frustration-hours arising from this
> practice, with people cross-checking sequences, looking in the methods
> sections for mutations...
> 
> JPK
> 
> On Sun, Apr 3, 2011 at 11:42 AM, Bernhard Rupp (Hofkristallrat a.D.)
>  wrote:
> > Thus my feeling is that if one does NOT see the coords in the electron
> > density, they should NOT be included, and let someone else try to model
> > them in, but they should be aware that they are modeling them.
> > Joel L. Sussman
> >
> > Concur.  BMC p 680 ‘How to handle missing parts’
> >
> > Best wishes, BR
> >
> > On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:
> >
> > Doing something sensible in the major software packages, both
> > for graphics and for other analysis of the structure, could
> > solve the problem for most users.
> >
> > But nobody knows what other software is out there being used by
> > individuals or small groups.  And the more remote the authors
> > of that software are from protein structure solution the more
> > likely it is that they have not/will not properly handle atoms
> > with zero occupancy or high B values, for example.
> >
> > I am absolutely positive that there is software that does its
> > voodoo on ATOM/HETATM records and pays absolutely no attention
> > to anything beyond the x, y, z coordinates (i.e. beyond column 54).
> >
> >Frances Bernstein
> >
> > =
> > Bernstein + Sons
> > *   *   Information Systems Consultants
> > 5 Brewster Lane, Bellport, NY 11713-2803
> > *   * ***
> >  *Frances C. Bernstein
> >  *   ***  f...@bernstein-plus-sons.com
> > *** *
> >  *   *** 1-631-286-1339FAX: 1-631-286-1999
> > =
> >
> > On Sat, 2 Apr 2011, Jacob Keller wrote:
> >
> > I guess I missed it in the flurry of replies to this thread over the
> > last few days, but what exactly is so terrible about keeping the atoms
> > (since you have chemical evidence from protein sequence that they are
> > there, and even if there is X-ray damage they were originally there and
> > are likely still there in a subset of the molecules), but changing
> > occupancy to zero as an acknowledgment that your data does not provide
> > evidence to support a specific atomic position for these atoms?
> >
> > Some users might pull up the structure, see those atoms, and think
> > their positions were based on data, which they were not, and then draw
> > conclusions based on them. I agree that occ=0 is tantamount to the
> > suggestion you queried, however.
> >
> > A somewhat key question might be: across the various molecular
> > visualization programs, what is the default way to handle atoms with
> > occ=0? Perhaps those programs might be the best place to fix the
> > problem...
> >
> > JPK
> >
> > ***
> > Jacob Pearson Keller
> > Northwestern University
> > Medical Scientist Training Program
> > cel: 773.608.9185
> > email: j-kell...@northwestern.edu
> > ***
> >
> 
> 
> 
>

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Jacob Keller

To the delete-the-atom-nik's: do you propose deleting the whole
residue or just the side chain? I can understand deleting the whole
residue, but deleting only the side chain seems to me to be placing a
stumbling block also, and even possibly confusing for an experienced
crystallographer: the .pdb says "lys" but it looks like an ala? Which
is it? I could imagine a lot of frustration-hours arising from this
practice, with people cross-checking sequences, looking in the methods
sections for mutations...

JPK

On Sun, Apr 3, 2011 at 11:42 AM, Bernhard Rupp (Hofkristallrat a.D.) wrote:
> Thus my feeling is that if one does NOT see the coords in the electron
> density, they should NOT be included, and let someone else try to model
> them in, but they should be aware that they are modeling them.
>
> Joel L. Sussman
>
> Concur.  BMC p 680 ‘How to handle missing parts’
>
> Best wishes, BR
>
> On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:
>
> Doing something sensible in the major software packages, both
> for graphics and for other analysis of the structure, could
> solve the problem for most users.
>
> But nobody knows what other software is out there being used by
> individuals or small groups.  And the more remote the authors
> of that software are from protein structure solution the more
> likely it is that they have not/will not properly handle atoms
> with zero occupancy or high B values, for example.
>
> I am absolutely positive that there is software that does its
> voodoo on ATOM/HETATM records and pays absolutely no attention
> to anything beyond the x, y, z coordinates (i.e. beyond column 54).
>
>    Frances Bernstein
>
> =
>     Bernstein + Sons
> *   *   Information Systems Consultants
>     5 Brewster Lane, Bellport, NY 11713-2803
> *   * ***
>  *    Frances C. Bernstein
>  *   ***  f...@bernstein-plus-sons.com
> *** *
>  *   *** 1-631-286-1339    FAX: 1-631-286-1999
> =
>
> On Sat, 2 Apr 2011, Jacob Keller wrote:
>
> I guess I missed it in the flurry of replies to this thread over the
> last few days, but what exactly is so terrible about keeping the atoms
> (since you have chemical evidence from protein sequence that they are
> there, and even if there is X-ray damage they were originally there and
> are likely still there in a subset of the molecules), but changing
> occupancy to zero as an acknowledgment that your data does not provide
> evidence to support a specific atomic position for these atoms?
>
> Some users might pull up the structure, see those atoms, and think
> their positions were based on data, which they were not, and then draw
> conclusions based on them. I agree that occ=0 is tantamount to the
> suggestion you queried, however.
>
> A somewhat key question might be: across the various molecular
> visualization programs, what is the default way to handle atoms with
> occ=0? Perhaps those programs might be the best place to fix the
> problem...
>
> JPK
>
> ***
> Jacob Pearson Keller
> Northwestern University
> Medical Scientist Training Program
> cel: 773.608.9185
> email: j-kell...@northwestern.edu
> ***
>



-- 
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Boaz Shaanan

The original posting that started this thread referred to side chains, as the
subject still suggests. Do you propose to omit only the side-chain atoms, in
which case you end up with apparently different residues, as quite a few
people have pointed out, or do you suggest omitting the main-chain atoms of
the problematic residues as well?

Besides, as mentioned by Phoebe and others, many users (non-crystallographers)
of PDB entries already know the meaning of the B-factor and will know how to
interpret a very high B. It is our task (the crystallographers) to enlighten
those who don't know what the B column in a PDB entry stands for. I certainly
do, and I'm sure many of us do so too. I voted for high B and would vote for it
again, if asked.

Cheers,

   Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710




From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp 
(Hofkristallrat a.D.) [hofkristall...@gmail.com]
Sent: Sunday, April 03, 2011 7:42 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

Thus my feeling is that if one does NOT see the coords in the electron
density, they should NOT be included, and let someone else try to model
them in, but they should be aware that they are modeling them.
Joel L. Sussman

Concur.  BMC p 680 ‘How to handle missing parts’

Best wishes, BR

On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:

Doing something sensible in the major software packages, both
for graphics and for other analysis of the structure, could
solve the problem for most users.

But nobody knows what other software is out there being used by
individuals or small groups.  And the more remote the authors
of that software are from protein structure solution the more
likely it is that they have not/will not properly handle atoms
with zero occupancy or high B values, for example.

I am absolutely positive that there is software that does its
voodoo on ATOM/HETATM records and pays absolutely no attention
to anything beyond the x, y, z coordinates (i.e. beyond column 54).

   Frances Bernstein

=
Bernstein + Sons
*   *   Information Systems Consultants
5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
 *Frances C. Bernstein
 *   ***  f...@bernstein-plus-sons.com
*** *
 *   *** 1-631-286-1339FAX: 1-631-286-1999
=

On Sat, 2 Apr 2011, Jacob Keller wrote:

I guess I missed it in the flurry of replies to this thread over the
last few days, but what exactly is so terrible about keeping the atoms
(since you have chemical evidence from protein sequence that they are
there, and even if there is X-ray damage they were originally there and
are likely still there in a subset of the molecules), but changing
occupancy to zero as an acknowledgment that your data does not provide
evidence to support a specific atomic position for these atoms?

Some users might pull up the structure, see those atoms, and think
their positions were based on data, which they were not, and then draw
conclusions based on them. I agree that occ=0 is tantamount to the
suggestion you queried, however.

A somewhat key question might be: across the various molecular
visualization programs, what is the default way to handle atoms with
occ=0? Perhaps those programs might be the best place to fix the
problem...

JPK


***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Bernhard Rupp (Hofkristallrat a.D.)

Thus my feeling is that if one does NOT see the coords in the electron
density, they should NOT be included, and let someone else try to model
them in, but they should be aware that they are modeling them.
Joel L. Sussman

Concur.  BMC p 680 'How to handle missing parts'

Best wishes, BR

On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:

Doing something sensible in the major software packages, both
for graphics and for other analysis of the structure, could
solve the problem for most users.

But nobody knows what other software is out there being used by
individuals or small groups.  And the more remote the authors
of that software are from protein structure solution the more
likely it is that they have not/will not properly handle atoms
with zero occupancy or high B values, for example.

I am absolutely positive that there is software that does its
voodoo on ATOM/HETATM records and pays absolutely no attention
to anything beyond the x, y, z coordinates (i.e. beyond column 54).

   Frances Bernstein

=
Bernstein + Sons
*   *   Information Systems Consultants
5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
 *Frances C. Bernstein
 *   ***  f...@bernstein-plus-sons.com
*** *
 *   *** 1-631-286-1339FAX: 1-631-286-1999
=

On Sat, 2 Apr 2011, Jacob Keller wrote:

I guess I missed it in the flurry of replies to this thread over the
last few days, but what exactly is so terrible about keeping the atoms
(since you have chemical evidence from protein sequence that they are
there, and even if there is X-ray damage they were originally there and
are likely still there in a subset of the molecules), but changing
occupancy to zero as an acknowledgment that your data does not provide
evidence to support a specific atomic position for these atoms?

Some users might pull up the structure, see those atoms, and think
their positions were based on data, which they were not, and then draw
conclusions based on them. I agree that occ=0 is tantamount to the
suggestion you queried, however.

A somewhat key question might be: across the various molecular
visualization programs, what is the default way to handle atoms with
occ=0? Perhaps those programs might be the best place to fix the
problem...

JPK

***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Quyen Hoang

If these users exist (I don't doubt that they do), then they might also
think that lysine residues sometimes look identical to alanine - if the atoms
beyond the beta carbon of the lysine are deleted from the PDB file for lack of
density.

So, I guess, if one's objective in solving structures is to provide these
users with coordinates that they can use, then it would make more sense to me
to find out what one's "customers" want, rather than speculating about it. Or
at least to train one's "customers" in how to use one's products - I believe
that's what people in business do.

However, I think that many people solve structures for their own consumption -
they are their own customers - therefore, it's really up to them to cook it
any way they find most tasteful. Others can agree or disagree, but we know that
not everybody has the same taste.

Cheers,
Quyen


On Apr 3, 2011, at 12:54 AM, Prof. Joel L. Sussman wrote:

> I think Frances is right, i.e. most non crystallographers ignore 
> "anything beyond the x, y, z coordinates (i.e. beyond column 54)"
> [as Frances wrote]. 
> Thus if a crystallographer put in coords that he/she does NOT see,
> even with OCC=0, or an enormously large Bfactor, these coords are usually 
> treated in just the same way that experimentally observed coords are treated. 
> Thus my feeling is that if one does NOT see the coords in the electron
> density, they should NOT be included, and let someone else try to model 
> them in, but they should be aware that they are modeling them.
> Joel L. Sussman
> 
> 
> 
> On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:
> 
>> Doing something sensible in the major software packages, both
>> for graphics and for other analysis of the structure, could
>> solve the problem for most users.
>> 
>> But nobody knows what other software is out there being used by
>> individuals or small groups.  And the more remote the authors
>> of that software are from protein structure solution the more
>> likely it is that they have not/will not properly handle atoms
>> with zero occupancy or high B values, for example.
>> 
>> I am absolutely positive that there is software that does its
>> voodoo on ATOM/HETATM records and pays absolutely no attention
>> to anything beyond the x, y, z coordinates (i.e. beyond column 54).
>> 
>>Frances Bernstein
>> 
>> =
>> Bernstein + Sons
>> *   *   Information Systems Consultants
>> 5 Brewster Lane, Bellport, NY 11713-2803
>> *   * ***
>>  *Frances C. Bernstein
>>  *   ***  f...@bernstein-plus-sons.com
>> *** *
>>  *   *** 1-631-286-1339FAX: 1-631-286-1999
>> =
>> 
>> On Sat, 2 Apr 2011, Jacob Keller wrote:
>> 
>>>> I guess I missed it in the flurry of replies to this thread over the
>>>> last few days, but what exactly is so terrible about keeping the atoms
>>>> (since you have chemical evidence from protein sequence that they are
>>>> there, and even if there is X-ray damage they were originally there and
>>>> are likely still there in a subset of the molecules), but changing
>>>> occupancy to zero as an acknowledgment that your data does not provide
>>>> evidence to support a specific atomic position for these atoms?
>>> 
>>> Some users might pull up the structure, see those atoms, and think
>>> their positions were based on data, which they were not, and then draw
>>> conclusions based on them. I agree that occ=0 is tantamount to the
>>> suggestion you queried, however.
>>> 
>>> A somewhat key question might be: across the various molecular
>>> visualization programs, what is the default way to handle atoms with
>>> occ=0? Perhaps those programs might be the best place to fix the
>>> problem...
>>> 
>>> JPK
>>> 
>>> 
>>> ***
>>> Jacob Pearson Keller
>>> Northwestern University
>>> Medical Scientist Training Program
>>> cel: 773.608.9185
>>> email: j-kell...@northwestern.edu
>>> ***
>>> 
>

Re: [ccp4bb] Jrh input Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread John R Helliwell

Dear Ed,
Many thanks for this careful explanation, which I appreciate.

I realise that in my own practice on such matters I have two situations:

(i) where electron density evidence, albeit limited, coupled with
chemical evidence, leads to an attempted model atomic fit but the B
factors there sky rocket.

(ii) where there is no electron density evidence and, even though the
chemical evidence is sound, one can in fact add nothing. Here I simply
concede that there is nothing one can do. The issue here is not
whether to keep these atoms but simply that one really cannot make a
start at finding them.

Greetings,
John


On Fri, Apr 1, 2011 at 4:03 PM, Ed Pozharski  wrote:
> Dear John,
>
> there may be reasons to disagree with both options.  This has been a
> recurring discussion for many years, and in my mind the most convincing
> arguments for both sides are as follows:
>
> "Keepers":
>
> I know the side chain is there and the high ADP is a good approximation
> of reality.  Removing atoms causes such a mess for the end user.
>
> "Deleters":
>
> We don't model missing loops, termini, ligands and waters when there is
> no density, and side chains should not be treated differently.  Most end
> users think ADP is a nucleotide and will over-interpret the model.
>
> I am a "keeper" when it comes to end user treatment, but a recently
> converted "deleter" when it comes to modeling (a rather stressful
> position).  So I am not taking sides really, but rather looking for a
> middle way. (Have to admit that my secret goal was to knock down the
> zero occupancy fallacy :)
>
> Perhaps these ideas are worth exploring:
>
> 1.  Provide dual representation - a crystallographic model and an
> end-user model, both downloadable from the PDB.
> 2.  Model missing side chains "NMR-way"
> 3.  A new data file format is needed (mmCIF?) that combines atomic model
> with electron density, and visualization/analysis software shall be
> modified to always utilize the experimental data
> 4.  Implement reduced ADP restraints for disordered side chains to
> further reduce model bias
>
> But ultimately, as long as experimental data is deposited, I believe
> that people are free to interpret their data the way they see fit.
> Others are then free to look at the electron density and become outraged
> at the interpretation.
>
> Cheers,
>
> Ed.
>
> On Thu, 2011-03-31 at 23:25 +0100, Jrh wrote:
>> Dear Ed,
>> Thank you for this and apologies for the late reply.
>> If one has chemical evidence for the presence of residues but these
>> residues are disordered I find the delete atoms option disagreeable.
>> Such a static disorder situation should be described by a high atomic
>> displacement parameter, in my view. (nb the use of ADP is better than
>> B factor terminology).
>> Yours sincerely,
>> John
>> Prof John R Helliwell DSc
>>
>
> --
> "I'd jump in myself, if I weren't so good at whistling."
>                               Julian, King of Lemurs
>
>



-- 
Professor John R Helliwell DSc

Re: [ccp4bb] what to do with disordered side chains

2011-04-02 Thread Prof. Joel L. Sussman

I think Frances is right, i.e. most non crystallographers ignore 
"anything beyond the x, y, z coordinates (i.e. beyond column 54)"
[as Frances wrote]. 
Thus if a crystallographer put in coords that he/she does NOT see,
even with OCC=0, or an enormously large Bfactor, these coords are usually 
treated in just the same way that experimentally observed coords are treated. 
Thus my feeling is that if one does NOT see the coords in the electron
density, they should NOT be included, and let someone else try to model 
them in, but they should be aware that they are modeling them.
Joel L. Sussman



On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:

> Doing something sensible in the major software packages, both
> for graphics and for other analysis of the structure, could
> solve the problem for most users.
> 
> But nobody knows what other software is out there being used by
> individuals or small groups.  And the more remote the authors
> of that software are from protein structure solution the more
> likely it is that they have not/will not properly handle atoms
> with zero occupancy or high B values, for example.
> 
> I am absolutely positive that there is software that does its
> voodoo on ATOM/HETATM records and pays absolutely no attention
> to anything beyond the x, y, z coordinates (i.e. beyond column 54).
> 
>Frances Bernstein
> 
> =
> Bernstein + Sons
> *   *   Information Systems Consultants
> 5 Brewster Lane, Bellport, NY 11713-2803
> *   * ***
>  *Frances C. Bernstein
>  *   ***  f...@bernstein-plus-sons.com
> *** *
>  *   *** 1-631-286-1339FAX: 1-631-286-1999
> =
> 
> On Sat, 2 Apr 2011, Jacob Keller wrote:
> 
>>> I guess I missed it in the flurry of replies to this thread over the
>>> last few days, but what exactly is so terrible about keeping the atoms
>>> (since you have chemical evidence from protein sequence that they are
>>> there, and even if there is X-ray damage they were originally there and
>>> are likely still there in a subset of the molecules), but changing
>>> occupancy to zero as an acknowledgment that your data does not provide
>>> evidence to support a specific atomic position for these atoms?
>> 
>> Some users might pull up the structure, see those atoms, and think
>> their positions were based on data, which they were not, and then draw
>> conclusions based on them. I agree that occ=0 is tantamount to the
>> suggestion you queried, however.
>> 
>> A somewhat key question might be: across the various molecular
>> visualization programs, what is the default way to handle atoms with
>> occ=0? Perhaps those programs might be the best place to fix the
>> problem...
>> 
>> JPK
>> 
>> 
>> ***
>> Jacob Pearson Keller
>> Northwestern University
>> Medical Scientist Training Program
>> cel: 773.608.9185
>> email: j-kell...@northwestern.edu
>> ***
>>

Re: [ccp4bb] what to do with disordered side chains

2011-04-02 Thread Frances C. Bernstein


Doing something sensible in the major software packages, both
for graphics and for other analysis of the structure, could
solve the problem for most users.

But nobody knows what other software is out there being used by
individuals or small groups.  And the more remote the authors
of that software are from protein structure solution the more
likely it is that they have not/will not properly handle atoms
with zero occupancy or high B values, for example.

I am absolutely positive that there is software that does its
voodoo on ATOM/HETATM records and pays absolutely no attention
to anything beyond the x, y, z coordinates (i.e. beyond column 54).
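
To make the column-54 point concrete, here is a minimal sketch of reading the fixed-width fields of an ATOM/HETATM record. It assumes the usual PDB column layout (x, y, z in columns 31-54, occupancy in 55-60, B in 61-66). The function names and the two sample atoms are made up for illustration and are not taken from any real program or entry.

```python
def format_atom_record(serial, name, resname, chain, resseq, x, y, z, occ, b):
    """Build a fixed-width ATOM line (hypothetical helper, columns 1-66 only)."""
    return ("ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            % (serial, name, resname, chain, resseq, x, y, z, occ, b))


def parse_atom_record(line):
    """Read the fields discussed in this thread from one ATOM/HETATM line."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        "name": line[12:16].strip(),
        "resname": line[17:20].strip(),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),   # columns 55-60
        "bfactor": float(line[60:66]),     # columns 61-66
    }


def coordinates_only(lines):
    """What a reader that stops at column 54 sees: every atom looks the same."""
    return [(float(l[30:38]), float(l[38:46]), float(l[46:54]))
            for l in lines if l.startswith(("ATOM", "HETATM"))]


def observed_atoms(lines):
    """Keep only atoms with non-zero occupancy (one possible convention)."""
    atoms = [parse_atom_record(l) for l in lines]
    return [a for a in atoms if a is not None and a["occupancy"] > 0.0]


if __name__ == "__main__":
    records = [
        format_atom_record(1, "CA", "LYS", "A", 10, 1.0, 2.0, 3.0, 1.00, 25.0),
        format_atom_record(2, "NZ", "LYS", "A", 10, 4.0, 5.0, 6.0, 0.00, 25.0),
    ]
    print(len(coordinates_only(records)), "atoms seen by a column-54 reader")
    print([a["name"] for a in observed_atoms(records)], "kept after occupancy filter")
```

A reader like `coordinates_only` treats the zero-occupancy NZ exactly like the well-ordered CA, which is the failure mode described above; only a program that reads past column 54 can tell them apart.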

Frances Bernstein

=
Bernstein + Sons
*   *   Information Systems Consultants
5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
 *Frances C. Bernstein
  *   ***  f...@bernstein-plus-sons.com
 *** *
  *   *** 1-631-286-1339FAX: 1-631-286-1999
=

On Sat, 2 Apr 2011, Jacob Keller wrote:


I guess I missed it in the flurry of replies to this thread over the
last few days, but what exactly is so terrible about keeping the atoms
(since you have chemical evidence from protein sequence that they are
there, and even if there is X-ray damage they were originally there and
are likely still there in a subset of the molecules), but changing
occupancy to zero as an acknowledgment that your data does not provide
evidence to support a specific atomic position for these atoms?


Some users might pull up the structure, see those atoms, and think
their positions were based on data, which they were not, and then draw
conclusions based on them. I agree that occ=0 is tantamount to the
suggestion you queried, however.

A somewhat key question might be: across the various molecular
visualization programs, what is the default way to handle atoms with
occ=0? Perhaps those programs might be the best place to fix the
problem...

JPK


***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-04-02 Thread Jacob Keller

> I guess I missed it in the flurry of replies to this thread over the
> last few days, but what exactly is so terrible about keeping the atoms
> (since you have chemical evidence from protein sequence that they are
> there, and even if there is X-ray damage they were originally there and
> are likely still there in a subset of the molecules), but changing
> occupancy to zero as an acknowledgment that your data does not provide
> evidence to support a specific atomic position for these atoms?

Some users might pull up the structure, see those atoms, and think
their positions were based on data, which they were not, and then draw
conclusions based on them. I agree that occ=0 is tantamount to the
suggestion you queried, however.

A somewhat key question might be: across the various molecular
visualization programs, what is the default way to handle atoms with
occ=0? Perhaps those programs might be the best place to fix the
problem...

JPK


***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-04-02 Thread Radisky, Evette S., Ph.D.

>
> Just create a new tag, say _atom_site.imaginary_site, which is either
> true or false for every atom.  Then everyone would be able to either
> filter out the fake atoms or leave them in, without ambiguity or
> confusion.
>


Aside from being a binary rather than continuous parameter, how exactly
does this suggestion differ from the occupancy column of the pdb, that
we already have?

I guess I missed it in the flurry of replies to this thread over the
last few days, but what exactly is so terrible about keeping the atoms
(since you have chemical evidence from protein sequence that they are
there, and even if there is X-ray damage they were originally there and
are likely still there in a subset of the molecules), but changing
occupancy to zero as an acknowledgment that your data does not provide
evidence to support a specific atomic position for these atoms?

Evette S. Radisky, Ph.D.
Assistant Professor
Mayo Clinic Cancer Center
Griffin Cancer Research Building, Rm 310
4500 San Pablo Road
Jacksonville, FL 32224
(904) 953-6372

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread Eric Bennett

Personally I think it is a _good_ thing that those missing atoms are a pain, 
because it helps ensure you are aware of the problem.  As somebody who is in 
the business of supplying non-structural people with models, and seeing how 
those models are sometimes (mis)interpreted, I think it's better to inflict 
that pain than it is to present a model that non-structural people are likely 
to over-interpret.  

The PDB provides various manipulated versions of crystal structures, such as 
biological assemblies.  I don't think it would necessarily be a bad idea to 
build missing atoms back into those sorts of processed files but for the main 
deposited entry the best way to make sure the model is not abused is to leave 
out atoms that can't be modeled accurately.

Just as an example since you mention surfaces, some of the people I work with 
calculate solvent accessible surface areas of individual residues for purposes 
such as engineering cysteines for chemical conjugation, and if residues are 
modeled into bogus positions just to say all the atoms are there, software that 
calculates per-residue SASA has to have a reliable way of knowing to ignore 
those atoms when calculating the area of neighboring residues.  Ad hoc 
solutions like putting very large values in the B column are not clear cut for 
such a software program to interpret.  Leaving the atom out completely is 
pretty unambiguous.

-Eric

On Mar 31, 2011, at 7:34 PM, Scott Pegan wrote:

> I agree with Zbyszek with the modeling of side chains and stress the 
> following points:
> 
> 1) It drives me nuts when I find that a PDB entry is missing atoms from side
> chains.  This requires me to rebuild them to get any use out of the entry,
> such as relevant surface renderings or electropotential plots.  I am an
> experienced structural biologist, so I can immediately identify that they
> have been removed and can rebuild them.  I feel sorry for my fellow
> scientists from other biological fields who can't perform this task readily;
> removing these atoms from a model thus limits its usefulness to a wider
> scientific audience.
> 
> 2) Not sure if anyone has documented the percentage of actual side chains
> missing from radiation damage versus heterogeneity in conformation (i.e.
> dissolved a crystal after collection and sent it to Mass Spec).  Although
> the former likely happens occasionally, my gut tells me that the latter is
> significantly more predominant.  As a result, absence of atoms from a side
> chain in the PDB where the main chain is clearly visible in the electron
> density might make for the best statistics for an experimental model, but
> does not reflect reality.
> 
> Scott
>

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread Randy J. Read

In this case, I'm more on ZO's side. Let's say that the refinement program 
can't get an atom to the right position (for instance, to pick a reasonably 
realistic example, because you've put a leucine side chain in backwards). 
In that case, the B-factor for the atom nearest to where there should be 
one in the structure will get larger to smear out its density and put some 
in the right place. To a good approximation, the optimal increase in the 
B-factor will be the one you'd expect for a Gaussian probability 
distribution, i.e. 8Pi^2/3 times the positional error squared. So a refined 
B-factor does include a measure of the uncertainty or error in the atom's 
position.
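
As a rough numerical illustration of the 8Pi^2/3 relation, here is a small sketch. It assumes "positional error" means the r.m.s. three-dimensional displacement of an isotropic Gaussian; the function names are invented for this example.

```python
import math

# 8*pi^2/3, the factor quoted above, in A^2 of B per A^2 of squared error
FACTOR = 8.0 * math.pi ** 2 / 3.0


def b_from_position_error(rms_error):
    """B-factor increase (A^2) matching an r.m.s. positional error (A)."""
    return FACTOR * rms_error ** 2


def position_error_from_b(b):
    """Inverse of the above: r.m.s. positional error (A) implied by B (A^2)."""
    return math.sqrt(b / FACTOR)


if __name__ == "__main__":
    for err in (0.25, 0.5, 1.0):
        print("error %.2f A  ->  B = %.1f A^2" % (err, b_from_position_error(err)))
```

Under this assumption an error of 1 A corresponds to a B of about 26 A^2, so a misplaced atom can account for a sizeable share of a refined B-factor.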


Best wishes,

Randy Read

On Apr 1 2011, James Holton wrote:


I'm not sure I entirely agree with ZO's assessment that a B factor is
a measure of uncertainty.  Pedantically, all it really is is an
instruction to the refinement program to "build" some electron density
with a certain width and height at a certain location.  The result is
then compared to the data, parameters are adjusted, etc.  I don't
think the B factor is somehow converted into an "error bar" on the
calculated electron density, is it?

For example, a B-factor of 500 on a carbon atom just means that the
"peak" to build is ~0.02 electron/A^3 tall, and ~3 A wide (full width
at half maximum).  By comparison, a carbon with B=20 is 1.6
electrons/A^3 tall and ~0.7 A wide (FWHM).  One of the "bugs" that
Dale referred to is the fact that most refinement programs do not
"plot" electron density more than 3 A away from each atomic center, so
a substantial fraction of the 6 electrons represented by a carbon with
B=500 will be sharply "cut off", and missing from the FC calculation.
Then again, all 6 electrons will be missing if the atoms are simply
not modeled, or if the occupancy is zero.

The point I am trying to make here is that there is no B factor that
will make an atom "go away", because the way B factors are implemented
is to always conserve the total number of electrons in the atom, but
just spread them out over more space.
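
A back-of-the-envelope check of this, under a strong simplifying assumption: treat the carbon as a point atom of 6 electrons smeared only by its B factor, so that the density is a single Gaussian, rho(r) = Z*(4*pi/B)^(3/2)*exp(-4*pi^2*r^2/B). Real scattering factors give the atom some width of its own, so low-B numbers from this sketch will come out sharper than those from a refinement program. The function names are made up for illustration.

```python
import math


def peak_height(n_electrons, b):
    """Density at the atomic centre (electrons/A^3) for a B-smeared point atom."""
    return n_electrons * (4.0 * math.pi / b) ** 1.5


def density(n_electrons, b, r):
    """Density (electrons/A^3) at distance r (A) from the atomic centre."""
    return peak_height(n_electrons, b) * math.exp(-4.0 * math.pi ** 2 * r * r / b)


def electrons_within(n_electrons, b, radius, steps=20000):
    """Integrate 4*pi*r^2*rho(r) from 0 to radius with the midpoint rule."""
    dr = radius / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += 4.0 * math.pi * r * r * density(n_electrons, b, r) * dr
    return total


if __name__ == "__main__":
    for b in (20.0, 500.0):
        print("B=%5.0f  peak=%.3f e/A^3  within 3 A: %.2f e  within 50 A: %.2f e"
              % (b, peak_height(6, b), electrons_within(6, b, 3.0),
                 electrons_within(6, b, 50.0)))
```

In this approximation the B=500 peak comes out at about 0.024 electrons/A^3, in line with the ~0.02 quoted above, and the integral over all space stays at 6 electrons whatever the B factor. Only a minority of those electrons fall within 3 A of the centre at B=500, which is the cut-off problem mentioned above. For B=20 the point-atom peak is about 3.0, higher than the 1.6 quoted, presumably because a real carbon atom's own electron cloud is not a point.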

Now, a peak height of 0.02 electrons/A^3 may sound like it might as
well be zero, especially when sitting next to a B=20 atom, but what if
all the atoms have high B factors?  For example, if the average
(Wilson) B factor is 80 (like it typically is for a ~4A structure),
then the average peak height of a carbon atom is 0.3 electrons/A^3,
and then 0.02 electrons/A^3 starts to become more significant.  If we
consider a ~11 A structure, then the average atomic B factor will be
around 500.  This "B vs resolution" relationship is something I
derived empirically from the PDB (Holton JSR 2009).  Specifically, the
average B factor for PDB files at a given resolution "d" is: B =
4*d^2+12.  Admittedly, this is "on average", but the trend does make
physical sense: atoms with high B factors don't contribute very much
to high-angle spots.
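
For anyone who wants to play with the B = 4*d^2+12 rule of thumb, a small sketch follows. The formula is the empirical one quoted above; the function names are invented here, and the inverse is only meaningful for B of at least 12.

```python
import math


def average_b_for_resolution(d):
    """Average B factor (A^2) expected at resolution d (A): B = 4*d^2 + 12."""
    return 4.0 * d ** 2 + 12.0


def resolution_for_b(b):
    """Resolution (A) at which b would be the average B; inverse of the above."""
    if b < 12.0:
        raise ValueError("the empirical relation never gives B below 12")
    return math.sqrt((b - 12.0) / 4.0)


if __name__ == "__main__":
    for d in (1.5, 2.0, 4.0, 11.0):
        print("d = %4.1f A  ->  average B = %5.1f A^2" % (d, average_b_for_resolution(d)))
    print("B = 500 A^2  ->  d = %.1f A" % resolution_for_b(500.0))
```

This gives 76 at 4 A and 496 at 11 A, consistent with the round figures of 80 and 500 used above.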

More formally, the problem with using a high B-factor as a "flag" is
that it is not resolution-general.  Dale has already pointed this out.

Personally, I prefer to think of B factors as an atom-by-atom
"resolution" rather than an "error bar", and this is how I tell
students to interpret them (using the B = 4*d^2+12 formula).  The
problem I have with the "error bar" interpretation is that
heterogeneity and uncertainty are not the same thing.  That is, just
because the atom is "jumping around" does not mean you don't know
where the centroid of the distribution is.  The "u_x" in
B=8*pi^2*<u_x^2> does reflect the standard error of atomic position in
a GIVEN unit cell, but since we are averaging over trillions of cells,
the "error bar" on the AVERAGE atomic position is actually a great
deal smaller than "u".  I think this distinction is important because
what we are building is a model of the AVERAGE electron density, not a
single molecule.

Just my 0.02 electrons

-James Holton
MAD Scientist



On Fri, Apr 1, 2011 at 10:57 AM, Zbyszek Otwinowski
 wrote:
The meaning of B-factor is the (scaled) sum of all positional 
uncertainties, and not just its one contributor, the Atomic Displacement 
Parameter that describes the relative displacement of an atom in the 
crystal lattice by a Gaussian function. That meaning (the sum of all 
contributions) comes from the procedure that calculates the B-factor in 
all PDB X-ray deposits, and not from an arbitrary decision by a 
committee. All programs that refine B-factors calculate an estimate of 
positional uncertainty, where contributors can be both Gaussian and 
non-Gaussian. For a non-Gaussian contributor, e.g. multiple occupancy, 
the exact numerical contribution is rather a complex function, but 
conceptually it is still an uncertainty estimate. Given the resolution 
of the typical data, we do not have a procedure to decouple Gaussian and 
non-Gaussian contributors, so we have to live with the B-factor being 
defined by the refinement procedure. However, we should still improve 
the estimates of the B-fac

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread James Holton

I'm not sure I entirely agree with ZO's assessment that a B factor is
a measure of uncertainty.  Pedantically, all it really is is an
instruction to the refinement program to "build" some electron density
with a certain width and height at a certain location.  The result is
then compared to the data, parameters are adjusted, etc.  I don't
think the B factor is somehow converted into an "error bar" on the
calculated electron density, is it?

For example, a B-factor of 500 on a carbon atom just means that the
"peak" to build is ~0.02 electron/A^3 tall, and ~3 A wide (full width
at half maximum).  By comparison, a carbon with B=20 is 1.6
electrons/A^3 tall and ~0.7 A wide (FWHM).  One of the "bugs" that
Dale referred to is the fact that most refinement programs do not
"plot" electron density more than 3 A away from each atomic center, so
a substantial fraction of the 6 electrons represented by a carbon with
B=500 will be sharply "cut off", and missing from the FC calculation.
Then again, all 6 electrons will be missing if the atoms are simply
not modeled, or if the occupancy is zero.

The point I am trying to make here is that there is no B factor that
will make an atom "go away", because the way B factors are implemented
is to always conserve the total number of electrons in the atom, but
just spread them out over more space.
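
As a rough check on these numbers, the sketch below treats the atom as a point scatterer smeared by an isotropic Gaussian. That is an assumption made for illustration: it ignores the atomic form factor, so it approximates the B=500 peak height (~0.02 electrons/A^3) but will overestimate the height of a B=20 atom. The function names are hypothetical. It also estimates what fraction of the electrons falls inside a 3 A cutoff radius.

```python
import math

def peak_height(n_electrons, b_factor):
    """Peak density (e/A^3) of a point atom smeared by an isotropic B factor."""
    # A 3D Gaussian with per-axis variance B/(8*pi^2) peaks at Z*(4*pi/B)^(3/2).
    return n_electrons * (4.0 * math.pi / b_factor) ** 1.5

def fraction_within(b_factor, radius):
    """Fraction of the atom's electrons inside a sphere of `radius` (A)."""
    sigma = math.sqrt(b_factor / (8.0 * math.pi ** 2))  # per-axis rms displacement
    x = radius / sigma
    # Cumulative distribution of the radial distance for a 3D isotropic Gaussian.
    return (math.erf(x / math.sqrt(2.0))
            - math.sqrt(2.0 / math.pi) * x * math.exp(-0.5 * x * x))

for b in (20.0, 500.0):
    print("B=%5.0f  peak=%.4f e/A^3  inside 3 A: %.3f"
          % (b, peak_height(6, b), fraction_within(b, 3.0)))
```

Under these assumptions only about 30% of a B=500 carbon lies within 3 A of its center, while a B=20 carbon is essentially fully contained. The total stays at 6 electrons in both cases.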

Now, a peak height of 0.02 electrons/A^3 may sound like it might as
well be zero, especially when sitting next to a B=20 atom, but what if
all the atoms have high B factors?  For example, if the average
(Wilson) B factor is 80 (like it typically is for a ~4A structure),
then the average peak height of a carbon atom is 0.3 electrons/A^3,
and then 0.02 electrons/A^3 starts to become more significant.  If we
consider a ~11 A structure, then the average atomic B factor will be
around 500.  This "B vs resolution" relationship is something I
derived empirically from the PDB (Holton JSR 2009).  Specifically, the
average B factor for PDB files at a given resolution "d" is: B =
4*d^2+12.  Admittedly, this is "on average", but the trend does make
physical sense: atoms with high B factors don't contribute very much
to high-angle spots.
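
The empirical relationship quoted above is easy to tabulate. This is only a sketch of the stated formula B = 4*d^2+12; the function name is hypothetical and the result is an average trend, not a prediction for any particular structure.

```python
def average_b(resolution):
    """Average PDB B factor (A^2) at resolution d (A), using B = 4*d^2 + 12."""
    return 4.0 * resolution ** 2 + 12.0

for d in (2.0, 4.0, 11.0):
    print("d = %4.1f A  ->  average B ~ %.0f" % (d, average_b(d)))
```

For d = 4 A this gives 76 and for d = 11 A it gives 496, in line with the "80" and "around 500" figures mentioned in the text.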

More formally, the problem with using a high B-factor as a "flag" is
that it is not resolution-general.  Dale has already pointed this out.

Personally, I prefer to think of B factors as an atom-by-atom
"resolution" rather than an "error bar", and this is how I tell
students to interpret them (using the B = 4*d^2+12 formula).  The
problem I have with the "error bar" interpretation is that
heterogeneity and uncertainty are not the same thing.  That is, just
because the atom is "jumping around" does not mean you don't know
where the centroid of the distribution is.  The "u_x" in
B=8*pi^2*<u_x^2> does reflect the standard error of atomic position in
a GIVEN unit cell, but since we are averaging over trillions of cells,
the "error bar" on the AVERAGE atomic position is actually a great
deal smaller than "u".  I think this distinction is important because
what we are building is a model of the AVERAGE electron density, not a
single molecule.
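
To make the distinction concrete, here is a small sketch. It inverts B = 8*pi^2*<u_x^2> to get the rms displacement u, inverts B = 4*d^2+12 to get a per-atom "resolution", and estimates the standard error of the average position over N unit cells as u/sqrt(N). The last step assumes the displacements in different cells are independent, which is an idealization, and it says nothing about measurement error. All function names are hypothetical.

```python
import math

def rms_displacement(b_factor):
    """Per-axis rms displacement u (A) from B = 8*pi^2*<u_x^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

def equivalent_resolution(b_factor):
    """Per-atom 'resolution' d (A) obtained by inverting B = 4*d^2 + 12."""
    if b_factor < 12.0:
        raise ValueError("the empirical formula has no solution for B < 12")
    return math.sqrt((b_factor - 12.0) / 4.0)

def error_on_mean(b_factor, n_cells):
    """Standard error (A) of the average position over n independent cells."""
    return rms_displacement(b_factor) / math.sqrt(n_cells)

b = 20.0
print("u = %.3f A" % rms_displacement(b))
print("d = %.2f A" % equivalent_resolution(b))
print("error on the mean over 1e12 cells = %.1e A" % error_on_mean(b, 1e12))
```

For B=20 the spread u is about 0.5 A, yet the sampling error on the centroid over a trillion cells is below a millionth of an Angstrom, which is the sense in which "u" is not an error bar on the average position.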

Just my 0.02 electrons

-James Holton
MAD Scientist

On Fri, Apr 1, 2011 at 10:57 AM, Zbyszek Otwinowski
 wrote:
> The meaning of B-factor is the (scaled) sum of all positional
> uncertainties, and not just its one contributor, the Atomic Displacement
> Parameter that describes the relative displacement of an atom in the
> crystal lattice by a Gaussian function.
> That meaning (the sum of all contributions) comes from the procedure that
> calculates the B-factor in all PDB X-ray deposits, and not from an
> arbitrary decision by a committee. All programs that refine B-factors
> calculate an estimate of positional uncertainty, where contributors can be
> both Gaussian and non-Gaussian. For a non-Gaussian contributor, e.g.
> multiple occupancy, the exact numerical contribution is rather a complex
> function, but conceptually it is still an uncertainty estimate. Given the
> resolution of the typical data, we do not have a procedure to decouple
> Gaussian and non-Gaussian contributors, so we have to live with the
> B-factor being defined by the refinement procedure. However, we should
> still improve the estimates of the B-factor, e.g. by changing the
> restraints. In my experience, the Refmac's default restraints on B-factors
> in side chains are too tight and I adjust them. Still, my preference would
> be to have harmonic restraints on U (square root of B) rather than on Bs
> themselves.
> It is not we who cram too many meanings on the B-factor, it is the quite
> fundamental limitation of crystallographic refinement.
>
> Zbyszek Otwinowski
>
>> The fundamental problem remains:  we're cramming too many meanings into
> one number [B factor].  This the PDB could indeed solve, by giving us
> another column.  (He said airily, blithely launching a totally new flame
> war.)
>> phx.
>>
>

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread Bernhard Rupp (Hofkristallrat a.D.)

> In my experience, the Refmac's default restraints on B-factors in side chains 
> are too tight and I adjust them. 

Concur. See BMC p 640.

BR

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread Zbyszek Otwinowski

The meaning of B-factor is the (scaled) sum of all positional
uncertainties, and not just its one contributor, the Atomic Displacement
Parameter that describes the relative displacement of an atom in the
crystal lattice by a Gaussian function.
That meaning (the sum of all contributions) comes from the procedure that
calculates the B-factor in all PDB X-ray deposits, and not from an
arbitrary decision by a committee. All programs that refine B-factors
calculate an estimate of positional uncertainty, where contributors can be
both Gaussian and non-Gaussian. For a non-Gaussian contributor, e.g.
multiple occupancy, the exact numerical contribution is rather a complex
function, but conceptually it is still an uncertainty estimate. Given the
resolution of the typical data, we do not have a procedure to decouple
Gaussian and non-Gaussian contributors, so we have to live with the
B-factor being defined by the refinement procedure. However, we should
still improve the estimates of the B-factor, e.g. by changing the
restraints. In my experience, the Refmac's default restraints on B-factors
in side chains are too tight and I adjust them. Still, my preference would
be to have harmonic restraints on U (square root of B) rather than on Bs
themselves.
It is not we who cram too many meanings on the B-factor, it is the quite
fundamental limitation of crystallographic refinement.

Zbyszek Otwinowski

> The fundamental problem remains:  we're cramming too many meanings into
one number [B factor].  This the PDB could indeed solve, by giving us
another column.  (He said airily, blithely launching a totally new flame
war.)
> phx.
>

Re: [ccp4bb] Jrh input Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread Ed Pozharski

Dear John,

there may be reasons to disagree with both options.  This has been a
recurring discussion for many years, and in my mind the most convincing
arguments for both sides are as follows:

"Keepers":

I know the side chain is there and the high ADP is a good approximation
of reality.  Removing atoms causes such a mess for the end user.

"Deleters":

We don't model missing loops, termini, ligands and waters when there is
no density, and side chains should not be treated differently.  Most end
users think ADP is a nucleotide and will over-interpret the model.

I am a "keeper" when it comes to end user treatment, but a recently
converted "deleter" when it comes to modeling (a rather stressful
position).  So I am not taking sides really, but rather looking for a
middle way. (Have to admit that my secret goal was to knock down the
zero occupancy fallacy :)

Perhaps these ideas are worth exploring:

1.  Provide dual representation - a crystallographic model and an
end-user model, both downloadable from the PDB.
2.  Model missing side chains "NMR-way"
3.  A new data file format is needed (mmCIF?) that combines atomic model
with electron density, and visualization/analysis software shall be
modified to always utilize the experimental data
4.  Implement reduced ADP restraints for disordered side chains to
further reduce model bias

But ultimately, as long as experimental data is deposited, I believe
that people are free to interpret their data the way they see fit.
Others are then free to look at the electron density and become outraged
at the interpretation.

Cheers,

Ed.

On Thu, 2011-03-31 at 23:25 +0100, Jrh wrote:
> Dear Ed,
> Thank you for this and apologies for late reply.
> If one has chemical evidence for the presence of residues but these
> residues are disordered I find the delete atoms option disagreeable.
> Such a static disorder situation should be described by a high atomic
> displacement parameter, in my view. (nb the use of ADP is better than
> B factor terminology). 
> Yours sincerely,
> John
> Prof John R Helliwell DSc
> 

-- 
"I'd jump in myself, if I weren't so good at whistling."
   Julian, King of Lemurs

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread Frank von Delft


Hi Robbie

>> If it's probability you're after, if there's no density to guide you
>> (very common!) you'd have to place all "likely" rotamers that don't
>> clash with anything, and set their occupancies to their probability (as
>> encoded in the rotamer library).
> Which library? The one for all side chains of a specific type, or the
> one for a specific type with a given backbone conformation? These are
> quite different and change with the content of the PDB.
> 'Hacking' the occupancies is risky business in general: errors are
> made quite easily. I frequently encounter side chains with partial
> occupancies but no alternatives, how can I relate this to the
> experimental data? Even worse, I also see cases where the occupancies
> of alternates sum up to values > 1.00. What does that mean? Is that a
> local increase of DarkMatter accidentally encoded in the occupancy?
Actually, I wasn't advocating it - I was taking ZO's suggestion to its
logical conclusion to point out the problem, namely deciding what is
"most likely".  This you underline with your (very valid) question.




>> Until the PDB is expanded, the conventions need to be clear, and I
>> thought they were:
>> High B-factor ==> means atom is there but density is weak
>> Atom missing ==> no density to support it.
> Unfortunately, it is not trivial to decide when there is 'no density'.
> We must have a good metric to do this, but I don't think it exists
> yet. Removing atoms is thus very subjective. This explains why I
> frequently find positive difference density peaks near missing side
> chains. Leaving side chains in sometimes gives negative difference
> density but refining them with proper B-factor restraints reduces the
> problem a lot. There is still the problem of radiation damage, but
> that is relatively small. At least refining the B-factor is more
> reproducible and less subjective than making the binary choice to keep
> or remove an atom.

(Radiation damage is NOT a "relatively small" problem.)

The fundamental problem remains:  we're cramming too many meanings into 
one number.  This the PDB could indeed solve, by giving us another 
column.  (He said airily, blithely launching a totally new flame war.)


phx.

Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread Quyen Hoang

Dear Gerard,

I agree with you based on debates at some conferences.

But, based on what I have seen here so far, it seems to me that everybody knows 
exactly what to do with disordered side chains.
People that want to build structures to best fit the data tend to prefer 
omitting disordered side chains. On the other hand, people that want to build 
structures to best represent reality tend to prefer building them. I don't see 
any disagreement here nor do I see any problems with either approach. Different 
people collect the same data to study different things and I feel that they are 
entitled to view and interpret the data the way that they find most meaningful.

Equations are attempts to describe reality, I don't see why we should constrain 
reality to fit equations. 

Cheers,
Quyen


On Mar 31, 2011, at 12:21 PM, Gerard Bricogne wrote:

> Dear Quyen,
> 
> On Thu, Mar 31, 2011 at 11:27:58AM -0400, Quyen Hoang wrote:
>> Thank you for your post, Herman.
>> Since there is no holy bible to provide guidance, perhaps we should hold 
>> off the idea of electing a "powerful dictator" to enforce this?
>> - at least until we all can come to a consensus on how the "dictator" 
>> should dictate...
>> 
> 
> ... but that might well be even harder than to decide what to do with
> disordered side chains ... .
> 
> 
> With best wishes,
> 
>  Gerard.
> 
> --
> 
> ===
> * *
> * Gerard Bricogne g...@globalphasing.com  *
> * *
> * Global Phasing Ltd. *
> * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
> * Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
> * *
> ===
>> 
>> 
>> On Mar 31, 2011, at 10:22 AM, herman.schreu...@sanofi-aventis.com wrote:
>> 
>>> Dear Quyen,
>>> I am afraid you won't get any better answers than you got so far. There is 
>>> no holy bible telling you what to do with disordered side chains. I fully 
>>> agree with James that you should try to get the best possible model, which 
>>> best explains your data and that will be your decision. Here are my 2 
>>> cents:
>>> 
>>> -If you see alternative positions, you have to build them.
>>> -If you do not see alternative positions, I would not replace one fantasy 
>>> (some call it most likely) orientation with 2 or 3 fantasy orientations.
>>> -I personally belong to the "let the B-factors take care of it" camp, but 
>>> that is my personal opinion. Leaving side chains out could lead to 
>>> misinterpretations by slightly less savvy users of our data, especially
>>> when charge distributions are being studied. Besides, we know (almost) for
>>> sure that the side chain is there, it is only disordered and as we just
>>> learned, even slightly less savvy users know what flaming red side chains
>>> mean. Even if they may not be mathematically entirely correct, huge
>>> B-factors clearly indicate that there is disorder involved.
>>> -I would not let occupancies take up the slack since even very savvy users
>>> have never heard of them and again, the side chain is fully occupied, only
>>> disordered. Of course if you build alternate positions, you have to divide
>>> the occupancies amongst them.
>>> 
>>> Best,
>>> Herman
>>> 
>>> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of 
>>> Quyen Hoang
>>> Sent: Thursday, March 31, 2011 3:55 PM
>>> To: CCP4BB@JISCMAIL.AC.UK
>>> Subject: Re: [ccp4bb] what to do with disordered side chains
>>> 
>>> We are getting off topic a little bit.
>>> 
>>> Original topic: is it better to not build disordered sidechains or build 
>>> them and let B-factors take care of it?
>>> Ed's poll got almost a 50:50 split.
>>> Question still unanswered.
>>> 
>>> Second topic introduced by Pavel: "Your B-factors are valid within a 
>>> harmonic (small) approximation of atomic vibrations. Larger scale motions 
>>> you are talking about go beyond the harmonic approximation, and using the 
>>> B-factor to model them is abusing the corresponding mathematical model."
>>> And that these large scale motions (disorders) a

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-04-01 Thread Robbie Joosten


Hi Frank,

> > I described in the previous e-mail the probabilistic interpretation of
> > B-factors. In the case of very high uncertainty = poorly ordered side
> > chains, I prefer to deposit the conformer representing maximum a
> > posteriori, even if it does not represent all possible conformations.
> > Maximum a posteriori will have significant contribution from the most
> > probable conformation of side chain (prior knowledge) and should not
> > conflict with likelihood (electron density map).
> > Thus, in practice I model the most probable conformation as long as it
> > is in even very weak electron density, does not overlap significantly
> > with negative difference electron density and does not clash with other
> > residues.
> If it's probability you're after, if there's no density to guide you 
> (very common!) you'd have to place all "likely" rotamers that don't 
> clash with anything, and set their occupancies to their probability (as 
> encoded in the rotamer library).
Which library? The one for all side chains of a specific type, or the one for a 
specific type with a given backbone conformation? These are quite different and 
change with the content of the PDB.
'Hacking' the occupancies is risky business in general: errors are made quite
easily. I frequently encounter side chains with partial occupancies but no
alternatives, how can I relate this to the experimental data? Even worse, I
also see cases where the occupancies of alternates sum up to values > 1.00.
What does that mean? Is that a local increase of DarkMatter accidentally
encoded in the occupancy?
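
The two occupancy problems described here can be screened for mechanically. The following is a minimal sketch, assuming standard fixed-column PDB ATOM/HETATM records and a single-model file; check_occupancies and atom_line are hypothetical helper names, and atom_line exists only to build correctly aligned demo records.

```python
from collections import defaultdict

def check_occupancies(pdb_lines, tol=0.01):
    """Flag atoms whose alternates sum to > 1, or that have partial
    occupancy without any alternate location indicator."""
    groups = defaultdict(list)
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Key: chain ID, residue number + insertion code, atom name.
        key = (line[21], line[22:27], line[12:16])
        groups[key].append((line[16], float(line[54:60])))
    problems = []
    for key, entries in groups.items():
        total = sum(occ for _, occ in entries)
        has_altloc = any(alt != " " for alt, _ in entries)
        if total > 1.0 + tol:
            problems.append((key, "sum > 1", total))
        elif not has_altloc and total < 1.0 - tol:
            problems.append((key, "partial occupancy, no alternate", total))
    return problems

def atom_line(serial, name, alt, resname, chain, resseq, occ, b):
    """Build a demo ATOM record at the origin (illustration only)."""
    return ("ATOM  %5d %-4s%1s%3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            % (serial, name, alt, resname, chain, resseq, 0.0, 0.0, 0.0, occ, b))

lines = [
    atom_line(1, " CG ", "A", "LYS", "A", 10, 0.70, 40.0),
    atom_line(2, " CG ", "B", "LYS", "A", 10, 0.50, 45.0),  # alternates sum to 1.20
    atom_line(3, " CD ", " ", "LYS", "A", 11, 0.50, 60.0),  # partial, no alternate
    atom_line(4, " CA ", " ", "LYS", "A", 12, 1.00, 20.0),  # fine
]
problems = check_occupancies(lines)
for key, reason, total in problems:
    print(key, reason, "%.2f" % total)
```

Note that this sketch also flags zero-occupancy atoms as "partial occupancy, no alternate", and it does not attempt to decide whether a partial occupancy is chemically justified (for example a partially occupied ligand).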

> This is now veering into data-free protein modeling territory... wasn't 
> the idea to present to the downstream user an atomic representation of 
> what the electron density shows us?
Yes, but what we see can be deceiving.

> Worse, what we're also doing is encoding multiple different things in 
> one place - what database people call "poorly normalised", i.e. to 
> understand a data field requires further parsing and if statements. In 
> this case: to know whether there was no density, as end-user I'd have 
> to have to second-guess what exactly those 
> high-B-factor-variable-occupancy atoms mean.
> 
> Until the PDB is expanded, the conventions need to be clear, and I 
> thought they were:
> High B-factor ==> means atom is there but density is weak
> Atom missing ==> no density to support it.
Unfortunately, it is not trivial to decide when there is 'no density'. We must 
have a good metric to do this, but I don't think it exists yet. Removing atoms 
is thus very subjective. This explains why I frequently find positive
difference density peaks near missing side chains. Leaving side chains in
sometimes gives negative difference density but refining them with proper
B-factor restraints reduces the problem a lot. There is still the problem of
radiation damage, but that is relatively small. At least refining the B-factor 
is more reproducible and less subjective than making the binary choice to keep 
or remove an atom.
 
Cheers,
Robbie

> 
> Oh well...
> phx.

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Bernhard Rupp (Hofkristallrat a.D.)

Dear All,

just in time for the heated discussion of missing density, high B-factors, and 
split conformations I have a paper in the most recent Phys Rev Letters that 
provides a rational explanation for the observation of split conformations when 
the equilibrium density becomes weak.

The first page is available from my web site,

http://www.ruppweb.org/select/Rupp_2011_Phys_Rev_Letters_160_13_B-factor.pdf

but for copyright reasons I ask that you please email me for the entire paper.

Comments are welcome,

BR
-
Bernhard Rupp
001 (925) 209-7429
+43 (676) 571-0536
b...@ruppweb.org
hofkristall...@gmail.com
http://www.ruppweb.org/
-
People can be divided in three classes:
The few who make things happen
The many who watch things happen
And the overwhelming majority
who have no idea what is happening.
-

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Frank von Delft


On 31/03/2011 23:43, Zbyszek Otwinowski wrote:

Regarding the closing statement about the best solution to poorly
ordered side chains:

I described in the previous e-mail the probabilistic interpretation of
B-factors. In the case of very high uncertainty = poorly ordered side
chains, I prefer to deposit the conformer representing maximum a
posteriori, even if it does not represent all possible conformations.
Maximum a posteriori will have significant contribution from the most
probable conformation of side chain (prior knowledge) and should not
conflict with likelihood (electron density map).
Thus, in practice I model the most probable conformation as long as it
is in even very weak electron density, does not overlap significantly
with negative difference electron density and does not clash with other
residues.
If it's probability you're after, if there's no density to guide you 
(very common!) you'd have to place all "likely" rotamers that don't 
clash with anything, and set their occupancies to their probability (as 
encoded in the rotamer library).


This is now veering into data-free protein modeling territory... wasn't 
the idea to present to the downstream user an atomic representation of 
what the electron density shows us?


Worse, what we're also doing is encoding multiple different things in 
one place - what database people call "poorly normalised", i.e. to 
understand a data field requires further parsing and if statements.  In 
this case:  to know whether there was no density, as end-user I'd have 
to have to second-guess what exactly those 
high-B-factor-variable-occupancy atoms mean.


Until the PDB is expanded, the conventions need to be clear, and I 
thought they were:

High B-factor ==> means atom is there but density is weak
Atom missing ==> no density to support it.

Oh well...
phx.

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Scott Pegan

I agree with Zbyszek with the modeling of side chains and stress the
following points:

1) It drives me nuts when I find that a PDB file is missing atoms from side
chains.  This requires me to rebuild them to get any use out of the PDB
such as relevant surface renderings or electrostatic potential plots.  I am an
experienced structural biologist so I can immediately identify that
they have been removed and can rebuild them.  I feel sorry for my fellow
scientists from other biological fields that can't perform this task
readily, thus removing these atoms from a model limits its usefulness
to a wider scientific audience.

2)  Not sure if anyone has documented the percentage of actual side chains
missing from radiation damage versus heterogeneity in conformation (i.e.
dissolved a crystal after collection and sent it to Mass Spec).   Although
the former likely happens occasionally, my gut tells me that the latter is
significantly more predominant.  As a result, absence of atoms from a side
chain in the PDB where the main chain is clearly visible in the electron
density might make for the best statistics for an experimental model, but
does not reflect a reality.

Scott




On Thu, Mar 31, 2011 at 4:43 PM, Zbyszek Otwinowski
wrote:

> Regarding the closing statement about the best solution to poorly ordered
> side chains:
>
> I described in the previous e-mail the probabilistic interpretation of
> B-factors. In the case of very high uncertainty = poorly ordered side
> chains, I prefer to deposit the conformer representing maximum a posteriori,
> even if it does not represent all possible conformations.
> Maximum a posteriori will have significant contribution from the most
> probable conformation of side chain (prior knowledge) and should not
> conflict with likelihood (electron density map).
> Thus, in practice I model the most probable conformation as long as it is
> in even very weak electron density, does not overlap significantly with
> negative difference electron density and does not clash with other residues.
>
> As a user of PDB files I much prefer the simplest and the most informative
> representation of the result. Removing parts of side chains that carry
> charges, as already mentioned, is not particularly helpful for the
> downstream uses. NMR-like deposits are not among my favorites, either.
> Having multiple conformations with low occupancies increases potential for a
> confusion, while benefits are not clear to me.
>
> Zbyszek
>
> Frank von Delft wrote:
>
>> This is a lovely summary, and we should make our students read it. - But
>> I'm afraid I do not see how it supports the closing statement in the last
>> paragraph... phx.
>>
>>
>> On 31/03/2011 17:06, Zbyszek Otwinowski wrote:
>>
>>> The B-factor in crystallography represents the convolution (sum) of two
>>> types of uncertainties about the atom (electron cloud) position:
>>>
>>> 1) dispersion of atom positions in crystal lattice
>>> 2) uncertainty of the experimenter's knowledge  about the atom position.
>>>
>>> In general, uncertainty needs not to be described by Gaussian function.
>>> However, communicating uncertainty using the second moment of its
>>> distribution is a widely accepted practice, with frequently implied
>>> meaning that it corresponds to a Gaussian probability function. B-factor
>>> is simply a scaled (by 8 times pi squared) second moment of uncertainty
>>> distribution.
>>>
>>> In the previous, long thread, confusion was generated by the additional
>>> assumption that B-factor also corresponds to a Gaussian probability
>>> distribution and not just to a second moment of any probability
>>> distribution. Crystallographic literature often implies the Gaussian
>>> shape, so there is some justification for such an interpretation, where
>>> the more complex probability distribution is represented by the sum of
>>> displaced Gaussians, where the area under each Gaussian component
>>> corresponds to the occupancy of an alternative conformation.
>>>
>>> For data with a typical resolution for macromolecular crystallography,
>>> such multi-Gaussian description of the atom position's uncertainty is not
>>> practical, as it would lead to instability in the refinement and/or
>>> overfitting. Due to this, a simplified description of the atom's position
>>> uncertainty by just the second moment of probability distribution is the
>>> right approach. For this reason, the PDB format is highly suitable for the
>>> description of positional uncertainties, the only difference with other
>>> fields being the unusual form of squaring and then scaling up the standard
>>> uncertainty. As this calculation can be easily inverted, there is no loss
>>> of information. However, in teaching one should probably stress more this
>>> unusual form of presenting the standard deviation.
>>>
>>> A separate issue is the use of restraints on B-factor values, a subject
>>> that probably needs a longer discussion.
>>>
>>> With respect to the previous thre

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Zbyszek Otwinowski

Regarding the closing statement about the best solution to poorly 
ordered side chains:


I described in the previous e-mail the probabilistic interpretation of 
B-factors. In the case of very high uncertainty = poorly ordered side 
chains, I prefer to deposit the conformer representing maximum a 
posteriori, even if it does not represent all possible conformations.
Maximum a posteriori will have significant contribution from the most 
probable conformation of side chain (prior knowledge) and should not 
conflict with likelihood (electron density map).
Thus, in practice I model the most probable conformation as long as it
is in even very weak electron density, does not overlap significantly
with negative difference electron density and does not clash with other
residues.


As a user of PDB files I much prefer the simplest and the most 
informative representation of the result. Removing parts of side chains 
that carry charges, as already mentioned, is not particularly helpful 
for the downstream uses. NMR-like deposits are not among my favorites, 
either. Having multiple conformations with low occupancies increases 
potential for a confusion, while benefits are not clear to me.


Zbyszek

Frank von Delft wrote:
This is a lovely summary, and we should make our students read it. - But 
I'm afraid I do not see how it supports the closing statement in the 
last paragraph... phx.



On 31/03/2011 17:06, Zbyszek Otwinowski wrote:

The B-factor in crystallography represents the convolution (sum) of two
types of uncertainties about the atom (electron cloud) position:

1) dispersion of atom positions in crystal lattice
2) uncertainty of the experimenter's knowledge  about the atom position.

In general, uncertainty needs not to be described by Gaussian function.
However, communicating uncertainty using the second moment of its
distribution is a widely accepted practice, with frequently implied
meaning that it corresponds to a Gaussian probability function. B-factor
is simply a scaled (by 8 times pi squared) second moment of uncertainty
distribution.

In the previous, long thread, confusion was generated by the additional
assumption that B-factor also corresponds to a Gaussian probability
distribution and not just to a second moment of any probability
distribution. Crystallographic literature often implies the Gaussian
shape, so there is some justification for such an interpretation, where
the more complex probability distribution is represented by the sum of
displaced Gaussians, where the area under each Gaussian component
corresponds to the occupancy of an alternative conformation.
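
The "second moment" view can be illustrated with a one-dimensional sketch: given a few displaced Gaussians weighted by occupancy, compute the variance of the whole mixture and express it as a single equivalent B factor. This is an illustration under simplifying assumptions (one axis only, isotropic components), not how any refinement program computes B; mixture_b is a hypothetical name.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2

def mixture_b(components):
    """Equivalent B factor (A^2) of a 1D mixture of displaced Gaussians.

    components: list of (occupancy, center in A, B factor in A^2).
    """
    total = sum(w for w, _, _ in components)
    mean = sum(w * c for w, c, _ in components) / total
    # Mixture variance = weighted mean of (component variance + squared offset).
    variance = sum(w * (b / EIGHT_PI_SQ + (c - mean) ** 2)
                   for w, c, b in components) / total
    return EIGHT_PI_SQ * variance

# Two half-occupied conformers, 2 A apart, each with B = 20.
two_conformers = [(0.5, -1.0, 20.0), (0.5, 1.0, 20.0)]
print("equivalent B = %.1f" % mixture_b(two_conformers))
```

In this example the two B=20 conformers collapse to a single equivalent B of about 99, because the 1 A offsets from the centroid add 8*pi^2 * 1 A^2 to the second moment. The single number preserves the spread but not the bimodal shape.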

For data with a typical resolution for macromolecular crystallography,
such multi-Gaussian description of the atom position's uncertainty is not
practical, as it would lead to instability in the refinement and/or
overfitting. Due to this, a simplified description of the atom's position
uncertainty by just the second moment of probability distribution is the
right approach. For this reason, the PDB format is highly suitable for the
description of positional uncertainties, the only difference with other
fields being the unusual form of squaring and then scaling up the standard
uncertainty. As this calculation can be easily inverted, there is no loss
of information. However, in teaching one should probably stress more this
unusual form of presenting the standard deviation.

A separate issue is the use of restraints on B-factor values, a subject
that probably needs a longer discussion.

With respect to the previous thread, representing poorly-ordered (so
called 'disordered') side chains by the most likely conformer with
appropriately high B-factors is fully justifiable, and currently is
probably the best solution to a difficult problem.

Zbyszek Otwinowski




- they all know what B is and how to look for regions of high B
(with, say, pymol) and they know not to make firm conclusions about
H-bonds to flaming red side chains.

But this "knowledge" may be quite wrong.  If the flaming red really
indicates large vibrational motion then yes, one would not bet on stable
H-bonds.

But if the flaming red indicates that a well-ordered sidechain was
incorrectly modeled at full occupancy when in fact it is only present at
half-occupancy then no, the H-bond could be strong but only present in
that half-occupancy conformation.  One presumes that the other
half-occupancy location (perhaps missing from the model) would have its
own H-bonding network.

I beg to differ.  If a side chain has 2 or more positions, one should be a
bit careful about making firm conclusions based on only one of those, even
if it isn't clear exactly why one should use caution.  Also, isn't the
isotropic B we fit at "medium" resolution more of a "spherical cow"
approximation to physical reality anyway?

   Phoebe





Zbyszek Otwinowski
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.
Dallas, TX 75390-8816
Tel. 214-645-6385
Fax. 214-645-6353





--
Zbysz

[ccp4bb] Jrh input Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Jrh

Dear Ed,
Thank you for this and apologies for the late reply.
If one has chemical evidence for the presence of residues but these residues
are disordered, I find the delete-atoms option disagreeable. Such a static
disorder situation should be described by a high atomic displacement parameter,
in my view. (N.B. the term ADP is better than B factor terminology.)
Yours sincerely,
John
Prof John R Helliwell DSc


On 29 Mar 2011, at 22:43, Ed Pozharski  wrote:

> The results of the online survey on what to do with disordered side
> chains (from total of 240 responses):
> 
> Delete the atoms                                       43%
> Let refinement take care of it by inflating B-factors  41%
> Set occupancy to zero                                  12%
> Other                                                   4%
> 
> "Other" suggestions were:
> 
> - Place atoms in most likely spot based on rotamer and contacts and
> indicate high positional sigmas on ATMSIG records
> - To invent refinement that will spread these residues over many rotamers
> as this is what actually happened
> - Delete the atoms but retain the original amino acid name
> - choose the most common rotamer (B-factors don't "inflate", they just
> rise slightly)
> - Depends. If the disordered region is uninteresting, delete atoms.
> Otherwise, try to model it in one or more disordered model (and then
> state it clearly in the pdb file)
> - In case that no density is in the map, model several conformations of
> the missing segment and insert it into the PDB file with zero
> occupancies. It is equivalent to what the NMR people do.
> - Model it in and compare the MD simulations with SAXS
> - I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
> - Let the refinement inflate B-factors, then set occupancy to zero in
> the last round.
> 
> Thanks to all for participation,
> 
> Ed.
> 
> -- 
> "I'd jump in myself, if I weren't so good at whistling."
>   Julian, King of Lemurs

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Sanishvili, Ruslan

Hi All,
Notwithstanding the stimulating discussion about the B-factor, I'd like
to chime in with my $0.02 on the original question of to build or not to
build and what are the rules and standards... and sorry for the lengthy
e-mail - I was trying to respond to several comments at once.

I thought there was a very well-defined rule: models based on
experimental data should represent the experimental data correctly. If a
model has parts that are not substantiated by experimental data and are
based only on assumptions, it's no longer an experimental model. Based
on this, one should leave out the atoms for which there is no observable
electron density. And one need not say that "we were unable to build a
model of the missing side chains" (or any other segments of the structure).
There is also no need to guess or fake "most probable" conformations of
the unobserved parts. Instead, it should be reported that the segment in
question was so flexible that it could not be described by just one or
two (and maybe three) conformers. As such, this observation stands on
its own feet just like any other observation of "visible" segments and
there is no need to fake a model. If the work was done properly, a model
with missing parts is not intrinsically inferior to another, more complete
model. The fact that a side chain displays flexibility may be
biologically much more relevant than some well-defined Ile in the core
of the molecule. Omitting unobserved side chains from the model would
also help avoid the assumption that we know for sure that the side chain
is there. A given side chain may actually not be there for some reason or
another. Sequence errors and radiation-induced damage come to mind, for
example. The latter is also often the reason that the side chain may not
be fully occupied in the structure derived from a specific data set
(i.e. the sum of occupancies of all its existing conformations may not
be 1, contrary to earlier suggestions in the thread). Back in the day I
personally spent large amounts of time and effort constructing and
refining multi-conformational models of some side chains because I was
sure they were there somewhere. Later on, as we learned more, I realized
that some of them had been sheared off by radiation damage and actually
were not there. As knowledge advances, many of our assumptions may
crumble and that's why we ought to keep "experimentally visible" models
apart from those with assumed parts.

As for the downstream consumers of our models, we may not need to
confuse them with strange B factors or occupancies. We just need to give
them correct information. Namely, that the given part(s) of the molecule
could not be "seen" experimentally due to its flexibility (or, in some
cases, to radiation damage).  There was an interesting suggestion of two
models - one accurately describing the experimental observations and the
other for the downstream users. It would be a good way to separate Sci
from Fi but there may be a problem. When theories are derived further
downstream, it'll be impossible to keep track of what came from Sci and
what came from Fi versions.

Best regards,
N.

Ruslan Sanishvili (Nukri), Ph.D.

GM/CA-CAT
Biosciences Division, ANL
9700 S. Cass Ave.
Argonne, IL 60439

Tel: (630)252-0665
Fax: (630)252-0667
rsanishv...@anl.gov

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
Dale Tronrud
Sent: Thursday, March 31, 2011 4:51 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

On 3/31/2011 12:52 PM, Jacob Keller wrote:
>> The only advantage of a large, positive, number is that it would create
>> bugs that are more subtle.
>
> Although most of the users on this BB probably know more about the
> software coding, I am surprised that bugs--even subtle ones--would be
> introduced by residues flagged with 0 occupancy and b-factor = 500.
> Can you elaborate/enumerate?

The principal problems with defining a particular value of the B factor
as magical have already been listed in this thread.  B factors are usually
restrained to the values of the atoms their atom is bonded to and
sometimes to other atoms they pack against.  You may set the B factor
equal to 500.00 but it will not stick.  At worst its presence will pull
up the B factors of nearby atoms that do have density.

In addition, the only refinement program I know of that takes occupancy
into account when restraining bond lengths and angles is CNX.  The presence
of atoms with occ=0 will affect the location of atoms they share geometry
restraints with.

Of course you could modify the refinement programs, and every other
program that reads crystallographic models, to deal with your redefinition
of _atom_site.B_iso_or_equiv.  In fact you would have to, just as you would
have to when you change the definition of any of the parameters in our
models.  If we ha

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Zbyszek Otwinowski


Dale Tronrud wrote:

   While what you say here is quite true and is useful for us to
remember, your list is quite short.  I can add another

3) The systematic error introduced by assuming full occupancy for all sites.


You are right that structural heterogeneity is an additional factor.
Se-Met expression is one of the examples where the Se-Met residue is
often not fully incorporated, and therefore its side chains have a mixed
Se-Met/Met composition.

Obviously, solvent molecules may have partial occupancies.
Also, in heavily exposed crystals chemical reactions result in loss of 
the functional groups (e.g. by decarboxylation).
However, in most cases even if side chains have multiple conformations 
their total occupancy is 1.0.




There are, of course, many other factors that we don't account for
that our refinement programs tend to dump into the B factors.

   The definition of that number in the PDB file, as listed in the mmCIF
dictionary, only includes your first factor --

http://mmcif.rcsb.org/dictionaries/mmcif_std.dic/Items/_atom_site.B_iso_or_equiv.html 



and these numbers are routinely interpreted as though that definition is
the law.  Certainly the whole basis of TLS refinement is that the B factors
are really Atomic Displacement Parameters.   In addition the stereochemical
restraints on B factors are derived from the assumption that these parameters
are ADPs.  Convoluting all these other factors with the ADPs causes serious
problems for those who analyze B factors as measures of motion.

   The fact that current refinement programs mix all these factors with the
ADP for an atom to produce a vaguely defined "B factor" should be considered
a flaw to be corrected and not an opportunity to pile even more factors into
this field in the PDB file.



B-factors describe overall uncertainty of the current model. Refinement
programs that do not introduce or remove parts of the model (e.g. are
not able to add additional conformations) intrinsically pile up all
uncertainties into B-factors. The solutions you would like to see
implemented require a model-building-like approach. The test of the
success of such an approach would be a substantial decrease of R-free
values. If anybody can show it, it would be great.


Zbyszek


Dale Tronrud






On 3/31/2011 9:06 AM, Zbyszek Otwinowski wrote:

The B-factor in crystallography represents the convolution (sum) of two
types of uncertainties about the atom (electron cloud) position:

1) dispersion of atom positions in crystal lattice
2) uncertainty of the experimenter's knowledge  about the atom position.

In general, uncertainty need not be described by a Gaussian function.
However, communicating uncertainty using the second moment of its
distribution is a widely accepted practice, with frequently implied
meaning that it corresponds to a Gaussian probability function. B-factor
is simply a scaled (by 8 times pi squared) second moment of uncertainty
distribution.

In the previous, long thread, confusion was generated by the additional
assumption that B-factor also corresponds to a Gaussian probability
distribution and not just to a second moment of any probability
distribution. Crystallographic literature often implies the Gaussian
shape, so there is some justification for such an interpretation, where
the more complex probability distribution is represented by the sum of
displaced Gaussians, where the area under each Gaussian component
corresponds to the occupancy of an alternative conformation.

For data with a typical resolution for macromolecular crystallography,
such multi-Gaussian description of the atom position's uncertainty is not
practical, as it would lead to instability in the refinement and/or
overfitting. Due to this, a simplified description of the atom's position
uncertainty by just the second moment of probability distribution is the
right approach. For this reason, the PDB format is highly suitable for the
description of positional uncertainties, the only difference with other
fields being the unusual form of squaring and then scaling up the standard
uncertainty. As this calculation can be easily inverted, there is no loss
of information. However, in teaching one should probably stress more this
unusual form of presenting the standard deviation.

A separate issue is the use of restraints on B-factor values, a subject
that probably needs a longer discussion.

With respect to the previous thread, representing poorly-ordered (so
called 'disordered') side chains by the most likely conformer with
appropriately high B-factors is fully justifiable, and currently is
probably the best solution to a difficult problem.

Zbyszek Otwinowski




- they all know what B is and how to look for regions of high B
(with, say, pymol) and they know not to make firm conclusions about
H-bonds to flaming red side chains.


But this "knowledge" may be quite wrong.  If the flaming red really
indicates large vibrational motion then yes, one would not

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Dale Tronrud


On 3/31/2011 12:52 PM, Jacob Keller wrote:

The only advantage of a large, positive, number is that it would create
bugs that are more subtle.


Although most of the users on this BB probably know more about the
software coding, I am surprised that bugs--even subtle ones--would be
introduced by residues flagged with 0 occupancy and b-factor = 500.
Can you elaborate/enumerate?


   The principal problems with defining a particular value of the B factor
as magical have already been listed in this thread.  B factors are usually
restrained to the values of the atoms their atom is bonded to and
sometimes to other atoms they pack against.  You may set the B factor
equal to 500.00 but it will not stick.  At worst its presence will pull
up the B factors of nearby atoms that do have density.

   In addition, the only refinement program I know of that takes occupancy
into account when restraining bond lengths and angles is CNX.  The presence
of atoms with occ=0 will affect the location of atoms they share geometry
restraints with.

   Of course you could modify the refinement programs, and every other
program that reads crystallographic models, to deal with your redefinition
of _atom_site.B_iso_or_equiv.  In fact you would have to, just as you would
have to when you change the definition of any of the parameters in our
models.  If we have to modify code, why not create a solution that is
explicit, clear, and as consistent with previous practices as possible?


I think that the worst that could happen is that the unexperienced yet
b-factor-savvy user would be astonished by the huge b-factors, even if
he did not realize they were flags. At best, being surprised at the
precise number 500, he would look into the pdb file and see occupancy
= zero, google it, and learn something new about crystallography.


   How about positive difference map peaks on neighboring atoms?  How
about values for B factors that don't relate to the mean square motion
of the atom, despite that being the direct definition of the B factor?

   The concept of an "unexperienced yet b-factor-savvy user" is amusing.
I'm not b-factor-savvy.  Atomic displacement values are easy, but I'm
learning new subtleties about B factors all the time.




   The fundamental problem with your solution is that you are trying to
cram two pieces of information into a single number.  Such density always
causes problems.  Each concept needs its own value.


What two pieces of information into what single number? Occupancy = 0
tells you that the atom cannot be modelled, and B=500 is merely a flag
for same, and always goes with occ=0. What is so dense? On the
contrary, I think the info is redundant if anything...


   To be honest I had forgotten that you were proposing that the occupancy
be set to zero at the same time.  Besides putting two pieces of information
in the B factor column (the B factor's value and a flag for "imaginaryness"),
you do the same for occupancy (the occupancy's value and a flag for
"imaginaryness").  This violates another rule of data structures - that
each concept be stored in one, and only one, place.  How do you interpret
an atom with an occupancy of zero but a B factor of 250?  How about an
atom with a B factor of 500.00 and an occupancy of 1.00?  Now we have the
confusing situation that the B factor can only be interpreted in the
context of the value of the occupancy and vice versa.  Database-savvy
people (and I'm not one of them either) are not going to like this.

   If you want to calculate the average B factor for a model, certainly
those atoms with their B factor = 500 should not be included.  However,
I gather we do need to include those equal to 500 if their occupancy is
not equal to 0.0.  This is a mess.  In a database application we can't
simply SELECT the column with the B factors and average them.  We have to
SELECT both the B factor and occupancy columns and perform some really
weird "if" statements element by element - just to calculate an average!
What should be a simple task becomes very complex.  Will a graduate
student code the calculation correctly?  Probably not.  They will likely
not recall all the complicated interpretations of special values your
convention would require.
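
A minimal sketch of the averaging problem described above, assuming the B=500/occ=0 flag pair proposed in this thread (it is not an established convention). The `Atom` class and the function names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    name: str
    b_factor: float
    occupancy: float


MAGIC_B = 500.0  # flag value proposed in the thread, not a real standard


def mean_b_naive(atoms):
    """Average every B factor, unaware that some values are flags."""
    return sum(a.b_factor for a in atoms) / len(atoms)


def is_flagged(atom):
    """An atom is a placeholder only if BOTH special values are present."""
    return atom.b_factor == MAGIC_B and atom.occupancy == 0.0


def mean_b_with_magic_values(atoms):
    """Average B over atoms not flagged by the B=500/occ=0 pair."""
    real = [a for a in atoms if not is_flagged(a)]
    return sum(a.b_factor for a in real) / len(real)


atoms = [
    Atom("CA", 20.0, 1.0),
    Atom("CB", 30.0, 1.0),
    Atom("CG", 500.0, 0.0),  # placeholder under the proposed convention
    Atom("CD", 500.0, 1.0),  # genuine, if extreme, B of 500 at full occupancy
]

if __name__ == "__main__":
    print("naive mean B:      ", mean_b_naive(atoms))
    print("flag-aware mean B: ", mean_b_with_magic_values(atoms))
```

The two functions disagree on the same model, and the correct one has to test two columns with exact floating-point comparisons, which is the kind of special-case logic that is easy to forget.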

   Now consider this.  Refinement is running along and the occupancy for
an atom happens to overshoot and, in the middle of refinement, assumes
a value of 0.00.  There is positive difference density the next cycle.
(I did say that it overshot.)  Should the refinement program interpret
that Occ=0.00 to mean that the atom is imaginary and should not be
considered as part of the crystallographic model?  Wouldn't it be bad
if the atom suddenly disappeared because of a fluctuation?  Or should
the refinement program use one definition of "occupancy" during
refinement, but write a PDB file occupancy that has a different definition?
(It might be relevant to this line of thought to recall that the TNT
refinement package writes each intermediate coordinate file t

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Jacob Keller

> The only advantage of a large, positive, number is that it would create
> bugs that are more subtle.

Although most of the users on this BB probably know more about the
software coding, I am surprised that bugs--even subtle ones--would be
introduced by residues flagged with 0 occupancy and b-factor = 500.
Can you elaborate/enumerate?

I think that the worst that could happen is that the unexperienced yet
b-factor-savvy user would be astonished by the huge b-factors, even if
he did not realize they were flags. At best, being surprised at the
precise number 500, he would look into the pdb file and see occupancy
= zero, google it, and learn something new about crystallography.


>   The fundamental problem with your solution is that you are trying to
> cram two pieces of information into a single number.  Such density always
> causes problems.  Each concept needs its own value.

What two pieces of information into what single number? Occupancy = 0
tells you that the atom cannot be modelled, and B=500 is merely a flag
for same, and always goes with occ=0. What is so dense? On the
contrary, I think the info is redundant if anything...


> either.  You can't out-think someone who's not paying attention.  At
> some point you have to assume that people being paid to perform research
> will learn the basics of the data they are using, even if you know that
> assumption is not 100% true.

Well, the problem is not *should* but *do*. Should we print bilingual
danger signs in the US? Shouldn't we assume that people know English?
But there is danger, and we care about sparing lives. Here too, if we
care about the truth being abused or missed, it seems we should go out
of our way.

JPK

-- 
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Dale Tronrud

On 3/31/2011 10:14 AM, Jacob Keller wrote:
>> What do we gain? As Dale pointed out, we are already abusing either
>> occupancy, B-factor or delete the side chain to compensate for our
>> inability to tell the user that the side chain is disordered. With your
>> proposal, we would fudge both occupancy and B-factor, which in my eyes is
>> even worse as fudging just one of the two.

>
>
> We gain clarity to the non-crystallographer user: a b-factor of 278.9
> sounds like possibly something real. A b-factor of exactly 1000 does
> not. Both probably have the same believability, viz., ~zero. Also,
> setting occupancy = zero is not fudging but rather respectfully
> declining to comment based on lack of data. I think it is exactly the
> same as omitting residues one cannot see in the density.
>

   These things are never clear unless there is a solid definition of
the terms you are using.  I don't think you can come up with an "out of
band" value for the B factor that doesn't have a legitimate meaning as
an atomic displacement parameter for someone.  How large a B factor you
can meaningfully define depends on your lower resolution limit.  People
working with electron microscopy or small angle X-ray scattering could
easily build models with ADPs far larger than anything we normally
encounter.

   In addition, you can't define "1000" as a magic value since the PDB
format will only allow values up to 999.99, and I presume maintaining
the PDB format is one of your goals.  Of course, you could choose -99.99
as the magic value but that would break all of our existing software
and I presume you don't want that either.  Actually defining any value
for the B factor as the magic value would break all of our software.
The only advantage of a large, positive, number is that it would create
bugs that are more subtle.
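
A small sketch of the field-width limit, assuming the B factor is written into a six-character field with two decimals, which is what the 999.99 ceiling implies. The helper name `format_b_field` is hypothetical.

```python
def format_b_field(b_factor, width=6):
    """Format a B factor as a fixed-width field with two decimals.

    Raises ValueError when the value does not fit, instead of silently
    overflowing into the neighbouring columns.
    """
    text = f"{b_factor:{width}.2f}"
    if len(text) > width:
        raise ValueError(
            f"{b_factor} needs {len(text)} characters, field has {width}"
        )
    return text


if __name__ == "__main__":
    for value in (20.0, 999.99, -99.99, 1000.0):
        try:
            print(repr(format_b_field(value)))
        except ValueError as err:
            print("does not fit:", err)
```

A writer that skips the length check would shift every later column on that line, which is one way a "magic" value of 1000 could turn into a subtle bug.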

   The fundamental problem with your solution is that you are trying to
cram two pieces of information into a single number.  Such density always
causes problems.  Each concept needs its own value.

   You could implement your solution easily in mmCIF.  Just create a new
tag, say _atom_site.imaginary_site, which is either true or false
for every atom.  Then everyone would be able to either filter out the fake
atoms or leave them in, without ambiguity or confusion.
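
A sketch of how a consumer might use such a flag. The tag `_atom_site.imaginary_site` is only the suggestion made above and is not part of the mmCIF dictionary; the loop is pared down to four columns, and the toy parser handles neither quoting nor multi-line values.

```python
CIF_LOOP = """\
loop_
_atom_site.label_atom_id
_atom_site.occupancy
_atom_site.B_iso_or_equiv
_atom_site.imaginary_site
CA 1.00 20.00 false
CB 1.00 30.00 false
CG 1.00 95.00 true
CD 1.00 99.00 true
"""


def parse_simple_loop(text):
    """Parse one whitespace-delimited loop_ block into a list of dicts."""
    tags, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_":
            continue
        if line.startswith("_"):
            tags.append(line)  # column name
        else:
            rows.append(dict(zip(tags, line.split())))  # one atom per row
    return rows


def observed_atoms(rows):
    """Keep only atoms whose (hypothetical) imaginary_site flag is false."""
    return [r for r in rows if r["_atom_site.imaginary_site"] == "false"]


rows = parse_simple_loop(CIF_LOOP)
real = observed_atoms(rows)
mean_b = sum(float(r["_atom_site.B_iso_or_equiv"]) for r in real) / len(real)

if __name__ == "__main__":
    print([r["_atom_site.label_atom_id"] for r in real], mean_b)
```

With a dedicated column, the B factor and occupancy keep their ordinary meanings and the filter is a single comparison.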

   If you object that the naive user of structural models wouldn't know
to check this tag - they aren't going to know about your magic B factor
either.  You can't out-think someone who's not paying attention.  At
some point you have to assume that people being paid to perform research
will learn the basics of the data they are using, even if you know that
assumption is not 100% true.

Dale Tronrud

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Dale Tronrud


   While what you say here is quite true and is useful for us to
remember, your list is quite short.  I can add another

3) The systematic error introduced by assuming full occupancy for all sites.

There are, of course, many other factors that we don't account for
that our refinement programs tend to dump into the B factors.

   The definition of that number in the PDB file, as listed in the mmCIF
dictionary, only includes your first factor --

http://mmcif.rcsb.org/dictionaries/mmcif_std.dic/Items/_atom_site.B_iso_or_equiv.html

and these numbers are routinely interpreted as though that definition is
the law.  Certainly the whole basis of TLS refinement is that the B factors
are really Atomic Displacement Parameters.   In addition the stereochemical
restraints on B factors are derived from the assumption that these parameters
are ADPs.  Convoluting all these other factors with the ADPs causes serious
problems for those who analyze B factors as measures of motion.

   The fact that current refinement programs mix all these factors with the
ADP for an atom to produce a vaguely defined "B factor" should be considered
a flaw to be corrected and not an opportunity to pile even more factors into
this field in the PDB file.

Dale Tronrud


On 3/31/2011 9:06 AM, Zbyszek Otwinowski wrote:

The B-factor in crystallography represents the convolution (sum) of two
types of uncertainties about the atom (electron cloud) position:

1) dispersion of atom positions in crystal lattice
2) uncertainty of the experimenter's knowledge  about the atom position.

In general, uncertainty need not be described by a Gaussian function.
However, communicating uncertainty using the second moment of its
distribution is a widely accepted practice, with frequently implied
meaning that it corresponds to a Gaussian probability function. B-factor
is simply a scaled (by 8 times pi squared) second moment of uncertainty
distribution.

In the previous, long thread, confusion was generated by the additional
assumption that B-factor also corresponds to a Gaussian probability
distribution and not just to a second moment of any probability
distribution. Crystallographic literature often implies the Gaussian
shape, so there is some justification for such an interpretation, where
the more complex probability distribution is represented by the sum of
displaced Gaussians, where the area under each Gaussian component
corresponds to the occupancy of an alternative conformation.

For data with a typical resolution for macromolecular crystallography,
such multi-Gaussian description of the atom position's uncertainty is not
practical, as it would lead to instability in the refinement and/or
overfitting. Due to this, a simplified description of the atom's position
uncertainty by just the second moment of probability distribution is the
right approach. For this reason, the PDB format is highly suitable for the
description of positional uncertainties,  the only difference with other
fields being the unusual form of squaring and then scaling up the standard
uncertainty. As this calculation can be easily inverted, there is no loss
of information. However, in teaching one should probably stress more this
unusual form of presenting the standard deviation.

A separate issue is the use of restraints on B-factor values, a subject
that probably needs a longer discussion.

With respect to the previous thread, representing poorly-ordered (so
called 'disordered') side chains by the most likely conformer with
appropriately high B-factors is fully justifiable, and currently is
probably the best solution to a difficult problem.

Zbyszek Otwinowski




- they all know what B is and how to look for regions of high B
(with, say, pymol) and they know not to make firm conclusions about
H-bonds to flaming red side chains.


But this "knowledge" may be quite wrong.  If the flaming red really
indicates large vibrational motion then yes, one would not bet on stable
H-bonds.  But if the flaming red indicates that a well-ordered sidechain
was incorrectly modeled at full occupancy when in fact it is only present
at half-occupancy then no, the H-bond could be strong but only present in
that half-occupancy conformation.  One presumes that the other
half-occupancy location (perhaps missing from the model) would have its
own H-bonding network.



I beg to differ.  If a side chain has 2 or more positions, one should be a
bit careful about making firm conclusions based on only one of those, even
if it isn't clear exactly why one should use caution.  Also, isn't the
isotropic B we fit at "medium" resolution more of a "spherical cow"
approximation to physical reality anyway?

   Phoebe






Zbyszek Otwinowski
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.
Dallas, TX 75390-8816
Tel. 214-645-6385
Fax. 214-645-6353

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Ethan Merritt

On Thursday, March 31, 2011 10:05:22 am Hailiang Zhang wrote:
> Dear Zbyszek:
> 
> Thanks a lot for your good summary. It is very interesting, but do you
> know of any references with a more detailed description, especially
> from a mathematical point of view, relating the B-factor to a Gaussian
> probability distribution (the B-factor unit of A^2 is my first doubt
> about the probability-distribution description)? Thanks again for your
> efforts!
> 
> Best Regards, Hailiang

I already cited the IUCr standard once, but here it is again:
 Trueblood, et al, 1996; Acta Cryst. A52, 770-781 
 http://dx.doi.org/10.1107/S0108767396005697



-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Nat Echols

On Thu, Mar 31, 2011 at 10:14 AM, Jacob Keller <
j-kell...@fsm.northwestern.edu> wrote:

>  Also,
> setting occupancy = zero is not fudging but rather respectfully
> declining to comment based on lack of data. I think it is exactly the
> same as omitting residues one cannot see in the density.

No, it's not the same.  If you have placed any atoms, even with zero
occupancy, you have said something about where you expect the atoms to be,
or at least where the refinement program thinks they should be.  "Declining
to comment" would be deleting them, not guessing.

> I think a reasonable number could be derived and agreed upon, and
> would not be surprised if there is such a derivation or analysis in
> the literature answering the question:
>
> "At what b-factor does modelling an atom become insignificant with
> respect to explaining/predicting/fitting the data?"
>
> That point would be the b-factor/occupancy cutoff.
>

Although atoms with very high B-factors may have almost no impact on
F(calc), if the occupancy is non-zero they will still be driven by gradients
with respect to X-ray data, and their positions (or changes thereof) will in
turn affect other atoms, through geometry restraints if not F(calc).  So
there is no point at which these atoms cease to be relevant to the task of
fitting.
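
To put rough numbers on "almost no impact on F(calc)", here is a sketch of the isotropic Debye-Waller attenuation, assuming the standard form exp(-B sin^2(theta)/lambda^2), which equals exp(-B/(4 d^2)) for a reflection at resolution d. The function name is illustrative.

```python
import math


def debye_waller(b_factor, d_spacing):
    """Amplitude attenuation exp(-B / (4 d^2)) for an isotropic B factor.

    b_factor is in A^2 and d_spacing (resolution of the reflection) in A.
    """
    return math.exp(-b_factor / (4.0 * d_spacing ** 2))


if __name__ == "__main__":
    for d in (2.0, 5.0, 20.0):
        for b in (20.0, 100.0, 500.0):
            factor = debye_waller(b, d)
            print(f"d = {d:4.1f} A, B = {b:5.1f} A^2: factor = {factor:.3g}")
```

The factor becomes tiny at high resolution but is never exactly zero, and at very low resolution even B = 500 leaves a sizeable contribution. It also says nothing about geometry restraints, which act regardless of how small the X-ray term is.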

-Nat

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Jacob Keller

> What do we gain? As Dale pointed out, we are already abusing either 
> occupancy, B-factor or delete the side chain to compensate for our inability 
> to tell the user that the side chain is disordered. With your proposal, we 
> would fudge both occupancy and B-factor, which in my eyes is even worse as 
> fudging just one of the two.


We gain clarity to the non-crystallographer user: a b-factor of 278.9
sounds like possibly something real. A b-factor of exactly 1000 does
not. Both probably have the same believability, viz., ~zero. Also,
setting occupancy = zero is not fudging but rather respectfully
declining to comment based on lack of data. I think it is exactly the
same as omitting residues one cannot see in the density.


> Also, who should decide on the magic number: the all-knowing gurus at the 
> protein data bank? Maybe we should really start using cif files, which allow 
> to specify coordinate uncertainties.


I think a reasonable number could be derived and agreed upon, and
would not be surprised if there is such a derivation or analysis in
the literature answering the question:

"At what b-factor does modelling an atom become insignificant with
respect to explaining/predicting/fitting the data?"

That point would be the b-factor/occupancy cutoff.

JPK

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Hailiang Zhang

Dear Zbyszek:

Thanks a lot for your good summary. It is very interesting, but do you
know of any references with a more detailed description, especially
from a mathematical point of view, relating the B-factor to a Gaussian
probability distribution (the B-factor unit of A^2 is my first doubt
about the probability-distribution description)? Thanks again for your
efforts!

Best Regards, Hailiang


> The B-factor in crystallography represents the convolution (sum) of two
> types of uncertainties about the atom (electron cloud) position:
>
> 1) dispersion of atom positions in crystal lattice
> 2) uncertainty of the experimenter's knowledge  about the atom position.
>
> In general, uncertainty need not be described by a Gaussian function.
> However, communicating uncertainty using the second moment of its
> distribution is a widely accepted practice, with the frequently implied
> meaning that it corresponds to a Gaussian probability function. The B-factor
> is simply a scaled (by 8 times pi squared) second moment of the uncertainty
> distribution.
>
> In the previous, long thread, confusion was generated by the additional
> assumption that B-factor also corresponds to a Gaussian probability
> distribution and not just to a second moment of any probability
> distribution. Crystallographic literature often implies the Gaussian
> shape, so there is some justification for such an interpretation, where
> the more complex probability distribution is represented by the sum of
> displaced Gaussians, where the area under each Gaussian component
> corresponds to the occupancy of an alternative conformation.
>
> For data with a typical resolution for macromolecular crystallography,
> such multi-Gaussian description of the atom position's uncertainty is not
> practical, as it would lead to instability in the refinement and/or
> overfitting. Due to this, a simplified description of the atom's position
> uncertainty by just the second moment of probability distribution is the
> right approach. For this reason, the PDB format is highly suitable for the
> description of positional uncertainties,  the only difference with other
> fields being the unusual form of squaring and then scaling up the standard
> uncertainty. As this calculation can be easily inverted, there is no loss
> of information. However, in teaching one should probably stress more this
> unusual form of presenting the standard deviation.
>
> A separate issue is the use of restraints on B-factor values, a subject
> that probably needs a longer discussion.
>
> With respect to the previous thread, representing poorly-ordered (so
> called 'disordered') side chains by the most likely conformer with
> appropriately high B-factors is fully justifiable, and currently is
> probably the best solution to a difficult problem.
>
> Zbyszek Otwinowski
>
>
>
>>>> - they all know what B is and how to look for regions of high B
>>>> (with, say, pymol) and they know not to make firm conclusions about
>>>> H-bonds to flaming red side chains.
>>>
>>>But this "knowledge" may be quite wrong.  If the flaming red really indicates
>>>large vibrational motion then yes, one would not bet on stable H-bonds.
>>>But if the flaming red indicates that a well-ordered sidechain was incorrectly
>>>modeled at full occupancy when in fact it is only present at half-occupancy
>>>then no, the H-bond could be strong but only present in that half-occupancy
>>>conformation.  One presumes that the other half-occupancy location (perhaps
>>>missing from the model) would have its own H-bonding network.
>>>
>>
>> I beg to differ.  If a side chain has 2 or more positions, one should be a
>> bit careful about making firm conclusions based on only one of those, even
>> if it isn't clear exactly why one should use caution.  Also, isn't the
>> isotropic B we fit at "medium" resolution more of a "spherical cow"
>> approximation to physical reality anyway?
>>
>>   Phoebe
>>
>>
>>
>
>
> Zbyszek Otwinowski
> UT Southwestern Medical Center at Dallas
> 5323 Harry Hines Blvd.
> Dallas, TX 75390-8816
> Tel. 214-645-6385
> Fax. 214-645-6353
>
>

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Frank von Delft

This is a lovely summary, and we should make our students read it. - But 
I'm afraid I do not see how it supports the closing statement in the 
last paragraph... phx.



On 31/03/2011 17:06, Zbyszek Otwinowski wrote:

The B-factor in crystallography represents the convolution (sum) of two
types of uncertainties about the atom (electron cloud) position:

1) dispersion of atom positions in crystal lattice
2) uncertainty of the experimenter's knowledge  about the atom position.

In general, uncertainty need not be described by a Gaussian function.
However, communicating uncertainty using the second moment of its
distribution is a widely accepted practice, with the frequently implied
meaning that it corresponds to a Gaussian probability function. The B-factor
is simply a scaled (by 8 times pi squared) second moment of the uncertainty
distribution.

In the previous, long thread, confusion was generated by the additional
assumption that B-factor also corresponds to a Gaussian probability
distribution and not just to a second moment of any probability
distribution. Crystallographic literature often implies the Gaussian
shape, so there is some justification for such an interpretation, where
the more complex probability distribution is represented by the sum of
displaced Gaussians, where the area under each Gaussian component
corresponds to the occupancy of an alternative conformation.

For data with a typical resolution for macromolecular crystallography,
such multi-Gaussian description of the atom position's uncertainty is not
practical, as it would lead to instability in the refinement and/or
overfitting. Due to this, a simplified description of the atom's position
uncertainty by just the second moment of probability distribution is the
right approach. For this reason, the PDB format is highly suitable for the
description of positional uncertainties,  the only difference with other
fields being the unusual form of squaring and then scaling up the standard
uncertainty. As this calculation can be easily inverted, there is no loss
of information. However, in teaching one should probably stress more this
unusual form of presenting the standard deviation.

A separate issue is the use of restraints on B-factor values, a subject
that probably needs a longer discussion.

With respect to the previous thread, representing poorly-ordered (so
called 'disordered') side chains by the most likely conformer with
appropriately high B-factors is fully justifiable, and currently is
probably the best solution to a difficult problem.

Zbyszek Otwinowski




- they all know what B is and how to look for regions of high B
(with, say, pymol) and they know not to make firm conclusions about
H-bonds
to flaming red side chains.

But this "knowledge" may be quite wrong.  If the flaming red really indicates
large vibrational motion then yes, one would not bet on stable H-bonds.
But if the flaming red indicates that a well-ordered sidechain was incorrectly
modeled at full occupancy when in fact it is only present at half-occupancy
then no, the H-bond could be strong but only present in that half-occupancy
conformation.  One presumes that the other half-occupancy location (perhaps
missing from the model) would have its own H-bonding network.


I beg to differ.  If a side chain has 2 or more positions, one should be a
bit careful about making firm conclusions based on only one of those, even
if it isn't clear exactly why one should use caution.  Also, isn't the
isotropic B we fit at "medium" resolution more of a "spherical cow"
approximation to physical reality anyway?

   Phoebe





Zbyszek Otwinowski
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.
Dallas, TX 75390-8816
Tel. 214-645-6385
Fax. 214-645-6353

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Gerard Bricogne

Dear Quyen,

On Thu, Mar 31, 2011 at 11:27:58AM -0400, Quyen Hoang wrote:
> Thank you for your post, Herman.
> Since there is no holy bible to provide guidance, perhaps we should hold 
> off the idea of electing a "powerful dictator" to enforce this?
> - at least until we all can come to a consensus on how the "dictator" 
> should dictate...
>

 ... but that might well be even harder than to decide what to do with
disordered side chains ... .


 With best wishes,
 
  Gerard.

--

 Gerard Bricogne                     g...@globalphasing.com
 Global Phasing Ltd.
 Sheraton House, Castle Park         Tel: +44-(0)1223-353033
 Cambridge CB3 0AX, UK               Fax: +44-(0)1223-366889
>
>
> On Mar 31, 2011, at 10:22 AM, herman.schreu...@sanofi-aventis.com wrote:
>
>> Dear Quyen,
>> I am afraid you won't get any better answers than you got so far. There is 
>> no holy bible telling you what to do with disordered side chains. I fully 
>> agree with James that you should try to get the best possible model, which 
>> best explains your data and that will be your decision. Here are my 2 
>> cents:
>>
>> -If you see alternative positions, you have to build them.
>> -If you do not see alternative positions, I would not replace one fantasy 
>> (some call it most likely) orientation with 2 or 3 fantasy orientations.
>> -I personally belong to the "let the B-factors take care of it" camp, but
>> that is my personal opinion. Leaving side chains out could lead to
>> misinterpretations by slightly less savvy users of our data, especially
>> when charge distributions are being studied. Besides, we know (almost) for
>> sure that the side chain is there, it is only disordered and as we just
>> learned, even slightly less savvy users know what flaming red side chains
>> mean. Even if they may not be mathematically entirely correct, huge
>> B-factors clearly indicate that there is disorder involved.
>> -I would not let occupancies take up the slack since even very savvy users
>> have never heard of them and again, the side chain is fully occupied, only
>> disordered. Of course if you build alternate positions, you have to divide
>> the occupancies amongst them.
>>
>> Best,
>> Herman
>>
>> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of 
>> Quyen Hoang
>> Sent: Thursday, March 31, 2011 3:55 PM
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: Re: [ccp4bb] what to do with disordered side chains
>>
>> We are getting off topic a little bit.
>>
>> Original topic: is it better to not build disordered sidechains or build 
>> them and let B-factors take care of it?
>> Ed's poll got almost a 50:50 split.
>> Question still unanswered.
>>
>> Second topic introduced by Pavel: "Your B-factors are valid within a 
>> harmonic (small) approximation of atomic vibrations. Larger scale motions 
>> you are talking about go beyond the harmonic approximation, and using the 
>> B-factor to model them is abusing the corresponding mathematical model."
>> And that these large scale motions (disorders) are better represented by 
>> "alternative conformations and associated with them occupancies".
>>
>> My question is, how many people here do this?
>> If you're currently doing what Pavel suggested here, how do you decide 
>> where to keep the upper limit of B-factors and what the occupancies are 
>> for each atom (data with resolution of 2.0A or worse)? I mean, do you cap 
>> the B-factor at a reasonable number to represent natural atomic vibrations 
>> (which is very small as Pavel pointed out) and then let the occupancies 
>> pick up the slack? More importantly, what is your reason for doing this?
>>
>> Cheers and thanks for your contribution,
>> Quyen
>>
>>
>> On Mar 30, 2011, at 5:20 PM, Pavel Afonine wrote:
>>
>>> Mark,
>>> alternative conformations and associated with them occupancies are to 
>>> describe the larger scale disorder (the one that goes beyond the 
>>> B-factor's capability to cope with).
>>> Multi-model PDB files is another option.
>>> Best,
>>> Pavel.
>>>
>>

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Roberts, Sue A - (suer)

Regarding suggestions that the PDB or the IUCr should tell us what to do:

IMO -

Neither of the usual solutions - (a) deleting side chains when there is no 
density or (b) letting B factors go where they will - is without problems (this 
is clear from the ongoing discussion).  I would be really unhappy if some 
authority unilaterally imposed either of these solutions on the protein 
crystallographic community.

Sue


Dr. Sue A. Roberts
Dept. of Chemistry and Biochemistry
University of Arizona
1041 E. Lowell St.,  Tucson, AZ 85721
Phone: 520 621 8171
s...@email.arizona.edu
http://www.biochem.arizona.edu/xray

[ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Zbyszek Otwinowski

The B-factor in crystallography represents the convolution (sum) of two
types of uncertainties about the atom (electron cloud) position:

1) dispersion of atom positions in crystal lattice
2) uncertainty of the experimenter's knowledge  about the atom position.

In general, uncertainty need not be described by a Gaussian function.
However, communicating uncertainty using the second moment of its
distribution is a widely accepted practice, with the frequently implied
meaning that it corresponds to a Gaussian probability function. The B-factor
is simply a scaled (by 8 times pi squared) second moment of the uncertainty
distribution.

In the previous, long thread, confusion was generated by the additional
assumption that B-factor also corresponds to a Gaussian probability
distribution and not just to a second moment of any probability
distribution. Crystallographic literature often implies the Gaussian
shape, so there is some justification for such an interpretation, where
the more complex probability distribution is represented by the sum of
displaced Gaussians, where the area under each Gaussian component
corresponds to the occupancy of an alternative conformation.
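
The relation between the multi-Gaussian picture and a single second moment can be sketched numerically. The example below is a one-dimensional illustration under stated assumptions: each conformer is a Gaussian with variance B/(8 pi^2) along one axis, weighted by its occupancy, and the function name and input values are invented here. It computes the second moment of the mixture and the B-factor of the single Gaussian that would carry the same second moment.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2


def mixture_b_factor(components):
    """components: list of (occupancy, position in A, B-factor in A^2) along one axis.

    Returns the occupancy-weighted mean position and the B-factor equivalent
    to the second moment of the whole mixture about that mean.
    """
    total_occ = sum(occ for occ, _, _ in components)
    mean = sum(occ * pos for occ, pos, _ in components) / total_occ
    second_moment = sum(
        occ * (b / EIGHT_PI_SQ + (pos - mean) ** 2) for occ, pos, b in components
    ) / total_occ
    return mean, EIGHT_PI_SQ * second_moment


# Hypothetical case: two half-occupancy conformers 1.5 A apart, each with B = 20 A^2.
mean, b_single = mixture_b_factor([(0.5, 0.0, 20.0), (0.5, 1.5, 20.0)])
print(f"mean position {mean:.2f} A, single-Gaussian B {b_single:.1f} A^2")
```

In this made-up case the single-Gaussian B comes out several times larger than the B of either conformer, which is the sense in which one high B-factor can stand in for unresolved alternative positions.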

For data with a typical resolution for macromolecular crystallography,
such multi-Gaussian description of the atom position's uncertainty is not
practical, as it would lead to instability in the refinement and/or
overfitting. Due to this, a simplified description of the atom's position
uncertainty by just the second moment of probability distribution is the
right approach. For this reason, the PDB format is highly suitable for the
description of positional uncertainties,  the only difference with other
fields being the unusual form of squaring and then scaling up the standard
uncertainty. As this calculation can be easily inverted, there is no loss
of information. However, in teaching one should probably stress more this
unusual form of presenting the standard deviation.
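
The inversion mentioned above is short enough to write out. This is a minimal sketch assuming the isotropic relation B = 8 pi^2 <u^2>, with <u^2> the mean-square displacement along one direction; the function names are chosen here for illustration. It also shows why B carries units of A^2: it is a scaled variance, not a standard deviation.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2  # about 78.96


def b_to_rms_displacement(b_factor):
    """Convert an isotropic B-factor (A^2) to an rms displacement (A)."""
    return math.sqrt(b_factor / EIGHT_PI_SQ)


def rms_displacement_to_b(rms):
    """Convert an rms displacement (A) back to an isotropic B-factor (A^2)."""
    return EIGHT_PI_SQ * rms ** 2


# Illustrative B-factors only.
for b in (20.0, 80.0, 278.9):
    print(f"B = {b:6.1f} A^2  ->  rms displacement = {b_to_rms_displacement(b):.2f} A")
```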

A separate issue is the use of restraints on B-factor values, a subject
that probably needs a longer discussion.

With respect to the previous thread, representing poorly-ordered (so
called 'disordered') side chains by the most likely conformer with
appropriately high B-factors is fully justifiable, and currently is
probably the best solution to a difficult problem.

Zbyszek Otwinowski



>>> - they all know what B is and how to look for regions of high B
>>> (with, say, pymol) and they know not to make firm conclusions about
>>> H-bonds
>>> to flaming red side chains.
>>
>>But this "knowledge" may be quite wrong.  If the flaming red really indicates
>>large vibrational motion then yes, one would not bet on stable H-bonds.
>>But if the flaming red indicates that a well-ordered sidechain was incorrectly
>>modeled at full occupancy when in fact it is only present at half-occupancy
>>then no, the H-bond could be strong but only present in that half-occupancy
>>conformation.  One presumes that the other half-occupancy location (perhaps
>>missing from the model) would have its own H-bonding network.
>>
>
> I beg to differ.  If a side chain has 2 or more positions, one should be a
> bit careful about making firm conclusions based on only one of those, even
> if it isn't clear exactly why one should use caution.  Also, isn't the
> isotropic B we fit at "medium" resolution more of a "spherical cow"
> approximation to physical reality anyway?
>
>   Phoebe
>
>
>


Zbyszek Otwinowski
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.
Dallas, TX 75390-8816
Tel. 214-645-6385
Fax. 214-645-6353

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Peter Keller

On Thu, 2011-03-31 at 11:27 -0400, Quyen Hoang wrote:
> Thank you for your post, Herman.
> Since there is no holy bible to provide guidance, perhaps we should
> hold off the idea of electing a "powerful dictator" to enforce this?
> - at least until we all can come to a consensus on how the "dictator"
> should dictate...

Well, that is partly what we have the IUCr for, isn't it? A couple of
people have referred to the wwPDB in this context, but IMHO the IUCr is
a much better forum to try to reach some kind of decision about these
issues. If the IUCr has a clear policy, the wwPDB can enforce it (like
they did with the deposition of structure factors). If the wwPDB takes a
lead a lot of people will get annoyed at them, when the real problem is
that crystallographic practitioners haven't come to any agreement
amongst themselves.

And yes, I know that balancing questions of scientific correctness with
the needs of more or less naive consumers of the data isn't
straightforward :-)

Regards,
Peter.

-- 
Peter Keller Tel.: +44 (0)1223 353033
Global Phasing Ltd., Fax.: +44 (0)1223 366889
Sheraton House,
Castle Park,
Cambridge CB3 0AX
United Kingdom

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Ed Pozharski

On Thu, 2011-03-31 at 17:04 +0200, herman.schreu...@sanofi-aventis.com
wrote:
> Maybe we should really start using cif files, which allow coordinate
> uncertainties to be specified.

The PDB format has the SIGATM record for that purpose.
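
For readers who have not met the record: SIGATM mirrors the column layout of ATOM, but columns 31-54 hold the standard deviations of x, y and z, and columns 55-66 those of occupancy and B-factor. The sketch below builds one record with made-up numbers and parses it back by column; the template, the parse_sigatm helper and all values are illustrative assumptions, not output from any refinement program.

```python
SIGATM_TEMPLATE = (
    "SIGATM{serial:5d} {name:<4s}{alt:1s}{res:>3s} {chain:1s}{seq:4d}{icode:1s}   "
    "{sx:8.3f}{sy:8.3f}{sz:8.3f}{socc:6.2f}{sb:6.2f}"
)


def parse_sigatm(line):
    """Read the standard deviations from a fixed-column SIGATM record."""
    if not line.startswith("SIGATM"):
        raise ValueError("not a SIGATM record")
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "sig_x": float(line[30:38]),
        "sig_y": float(line[38:46]),
        "sig_z": float(line[46:54]),
        "sig_occ": float(line[54:60]),
        "sig_b": float(line[60:66]),
    }


# Made-up values for a lysine NZ atom, used only to exercise the parser.
record = SIGATM_TEMPLATE.format(
    serial=1, name=" NZ ", alt=" ", res="LYS", chain="A", seq=42, icode=" ",
    sx=0.123, sy=0.098, sz=0.151, socc=0.0, sb=4.25,
)
print(record)
print(parse_sigatm(record))
```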

-- 
"I'd jump in myself, if I weren't so good at whistling."
   Julian, King of Lemurs

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Quyen Hoang

Thank you for your post, Herman.
Since there is no holy bible to provide guidance, perhaps we should  
hold off the idea of electing a "powerful dictator" to enforce this?
- at least until we all can come to a consensus on how the "dictator"  
should dictate...

Cheers,
Quyen

On Mar 31, 2011, at 10:22 AM, herman.schreu...@sanofi-aventis.com wrote:

Dear Quyen,
I am afraid you won't get any better answers than you got so far.  
There is no holy bible telling you what to do with disordered side  
chains. I fully agree with James that you should try to get the best  
possible model, which best explains your data and that will be your  
decision. Here are my 2 cents:

-If you see alternative positions, you have to build them.
-If you do not see alternative positions, I would not replace one  
fantasy (some call it most likely) orientation with 2 or 3 fantasy  
orientations.
-I personally belong to the "let the B-factors take care of it"
camp, but that is my personal opinion. Leaving side chains out could
lead to misinterpretations by slightly less savvy users of our data,
especially when charge distributions are being studied. Besides, we
know (almost) for sure that the side chain is there, it is only
disordered and as we just learned, even slightly less savvy users
know what flaming red side chains mean. Even if they may not be
mathematically entirely correct, huge B-factors clearly indicate
that there is disorder involved.
-I would not let occupancies take up the slack since even very savvy
users have never heard of them and again, the side chain is fully
occupied, only disordered. Of course if you build alternate
positions, you have to divide the occupancies amongst them.
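
The last point, dividing the occupancies among alternate positions, is easy to check mechanically. The following is a rough sketch that assumes each atom is fully present, so the occupancies of its alternate locations should add up to 1; the tuple layout, function names, tolerance and example atoms are all hypothetical.

```python
from collections import defaultdict


def occupancy_totals(atoms):
    """Sum occupancies over alternate locations for each (chain, residue, atom name)."""
    totals = defaultdict(float)
    for chain, res_seq, atom_name, alt_loc, occupancy in atoms:
        totals[(chain, res_seq, atom_name)] += occupancy
    return dict(totals)


def unbalanced_atoms(atoms, tolerance=0.01):
    """Return atoms whose alternate-location occupancies do not add up to 1."""
    return sorted(
        key for key, total in occupancy_totals(atoms).items()
        if abs(total - 1.0) > tolerance
    )


# Hypothetical atoms: (chain, residue number, atom name, altLoc, occupancy).
atoms = [
    ("A", 42, "CG", "A", 0.60),
    ("A", 42, "CG", "B", 0.40),
    ("A", 57, "OG", "A", 0.50),
    ("A", 57, "OG", "B", 0.30),
]
print(unbalanced_atoms(atoms))
```

A total below 1 is not always an error (a partially occupied ligand, for instance), so a check like this is best treated as a prompt to look at the residue, not as a verdict.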

Best,
Herman

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf  
Of Quyen Hoang

Sent: Thursday, March 31, 2011 3:55 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

We are getting off topic a little bit.

Original topic: is it better to not build disordered sidechains or  
build them and let B-factors take care of it?

Ed's poll got almost a 50:50 split.
Question still unanswered.

Second topic introduced by Pavel: "Your B-factors are valid within a  
harmonic (small) approximation of atomic vibrations. Larger scale  
motions you are talking about go beyond the harmonic approximation,  
and using the B-factor to model them is abusing the corresponding  
mathematical model."
And that these large scale motions (disorders) are better  
represented by "alternative conformations and associated with them  
occupancies".

My question is, how many people here do this?
If you're currently doing what Pavel suggested here, how do you  
decide where to keep the upper limit of B-factors and what the  
occupancies are for each atom (data with resolution of 2.0A or  
worse)? I mean, do you cap the B-factor at a reasonable number to  
represent natural atomic vibrations (which is very small as Pavel  
pointed out) and then let the occupancies pick up the slack? More  
importantly, what is your reason for doing this?

Cheers and thanks for your contribution,
Quyen

On Mar 30, 2011, at 5:20 PM, Pavel Afonine wrote:

Mark,
alternative conformations and associated with them occupancies are  
to describe the larger scale disorder (the one that goes beyond the  
B-factor's capability to cope with).

Multi-model PDB files is another option.
Best,
Pavel.

On Wed, Mar 30, 2011 at 2:15 PM, VAN RAAIJ, MARK JOHAN wrote:
yet, apart from (and additionally to) modelling two conformations  
of the side-chain, the B-factor is the only tool we have (now).

Quoting Pavel Afonine:

> Hi  Quyen,
>
>
> (...) And if B-factor is an estimate of thermo-motion (or static disorder),
>> then would it not be reasonable to accept that building the side-chain and
>> let B-factor sky rocket might reflect reality more so than not building it?
>>
>
> NO.  Your B-factors are valid within a harmonic (small) approximation of
> atomic vibrations. Larger scale motions you are talking about go beyond the
> harmonic approximation, and using the B-factor to model them is abusing the
> corresponding mathematical model.
> http://www.phenix-online.org/newsletter/CCN_2010_07.pdf
>
> Pavel.
>

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoléculas
Centro Nacional de Biotecnología - CSIC

c/Darwin 3, Campus Cantoblanco
28049 Madrid
tel. 91 585 4616
email: mjvanra...@cnb.csic.es

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Herman . Schreuder

Dear Jacob,

What do we gain? As Dale pointed out, we already either abuse the occupancy,
abuse the B-factor, or delete the side chain to compensate for our inability to
tell the user that the side chain is disordered. With your proposal, we would
fudge both occupancy and B-factor, which in my eyes is even worse than fudging
just one of the two.

Also, who should decide on the magic number: the all-knowing gurus at the
protein data bank? Maybe we should really start using cif files, which allow
coordinate uncertainties to be specified.

Best regards,
Herman

 

-Original Message-
From: Jacob Keller [mailto:j-kell...@fsm.northwestern.edu] 
Sent: Thursday, March 31, 2011 4:43 PM
To: Schreuder, Herman R&D/DE
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] what to do with disordered side chains

Why not have the "b-factors take care of it" until some magic cutoff number? 
When they reach the cutoff, two things happen:

1. Occupancies are set to zero for those side chains, to represent our lack of 
ability to model the region,

2. B-factors are set to exactly 500, as a "flag" allowing casual b-factor-savvy 
users to identify suspicious regions, since they will probably not see 
occupancies, but *will* see b-factors. Therefore, all 0-occupancy atoms will 
automatically have b-factors = 500. I believe it is true that if the 
occupancies are zero, the b-factors are totally irrelevant for all calculations?

Doesn't this satisfy both parties?

Jacob





On Thu, Mar 31, 2011 at 9:22 AM,   wrote:
> Dear Quyen,
> I am afraid you won't get any better answers than you got so far. 
> There is no holy bible telling you what to do with disordered side 
> chains. I fully agree with James that you should try to get the best 
> possible model, which best explains your data and that will be your decision. 
> Here are my 2 cents:
>
> -If you see alternative positions, you have to build them.
> -If you do not see alternative positions, I would not replace one 
> fantasy (some call it most likely) orientation with 2 or 3 fantasy 
> orientations.
> -I personally belong to the "let the B-factors take care of it" camp,
> but that is my personal opinion. Leaving side chains out could lead to
> misinterpretations by slightly less savvy users of our data, especially
> when charge distributions are being studied. Besides, we know (almost)
> for sure that the side chain is there, it is only disordered and as we
> just learned, even slightly less savvy users know what flaming red side
> chains mean. Even if they may not be mathematically entirely correct,
> huge B-factors clearly indicate that there is disorder involved.
> -I would not let occupancies take up the slack since even very savvy
> users have never heard of them and again, the side chain is fully
> occupied, only disordered. Of course if you build alternate positions,
> you have to divide the occupancies amongst them.
>
> Best,
> Herman
>
> 
> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of 
> Quyen Hoang
> Sent: Thursday, March 31, 2011 3:55 PM
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] what to do with disordered side chains
>
> We are getting off topic a little bit.
> Original topic: is it better to not build disordered sidechains or 
> build them and let B-factors take care of it?
> Ed's poll got almost a 50:50 split.
> Question still unanswered.
> Second topic introduced by Pavel: "Your B-factors are valid within a 
> harmonic (small) approximation of atomic vibrations. Larger scale 
> motions you are talking about go beyond the harmonic approximation, 
> and using the B-factor to model them is abusing the corresponding 
> mathematical model."
> And that these large scale motions (disorders) are better represented 
> by "alternative conformations and associated with them occupancies".
> My question is, how many people here do this?
> If you're currently doing what Pavel suggested here, how do you decide 
> where to keep the upper limit of B-factors and what the occupancies 
> are for each atom (data with resolution of 2.0A or worse)? I mean, do 
> you cap the B-factor at a reasonable number to represent natural 
> atomic vibrations (which is very small as Pavel pointed out) and then 
> let the occupancies pick up the slack? More importantly, what is your reason 
> for doing this?
> Cheers and thanks for your contribution, Quyen
>
> On Mar 30, 2011, at 5:20 PM, Pavel Afonine wrote:
>
> Mark,
> alternative conformations and associated with them occupancies are to 
> describe the larger scale disorder (the one that goes beyond the 
> B-factor's capability to cope with).
> Multi-model PDB files is another option.
> Best,
> Pavel.
>
>

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Jacob Keller

Well, I guess I was thinking to make the b-factor such a preposterous
value that no one would possibly believe it. Setting occupancies to
zero effectively places a stumbling block, because people see the
residues and think they are actually supported by data. So to
counter-balance this, I thought putting up a high-b-factor flag would
prevent people from tripping over the stumbling block. Look, you could
even set the b-factor to 1 if you want--just something so people
totally discount those coordinates.

Jacob

On Thu, Mar 31, 2011 at 9:55 AM, Nat Echols  wrote:
> On Thu, Mar 31, 2011 at 7:42 AM, Jacob Keller
>  wrote:
>>
>> Why not have the "b-factors take care of it" until some magic cutoff
>> number? When they reach the cutoff, two things happen:
>>
>> 1. Occupancies are set to zero for those side chains, to represent our
>> lack of ability to model the region,
>>
>> 2. B-factors are set to exactly 500, as a "flag" allowing casual
>> b-factor-savvy users to identify suspicious regions, since they will
>> probably not see occupancies, but *will* see b-factors. Therefore, all
>> 0-occupancy atoms will automatically have b-factors = 500. I believe
>> it is true that if the occupancies are zero, the b-factors are totally
>> irrelevant for all calculations?
>>
>> Doesn't this satisfy both parties?
>
> No, because now you're not only presenting the user with made-up
> coordinates, you're giving them a made-up B-factor as well, so there is
> effectively no property of those atoms that is based on experimental data
> rather than subjective criteria.  Regardless of any  problems inherent in
> letting the B-factors take care of all forms of disorder, they are
> nonetheless a refined parameter.
> -Nat



-- 
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Phoebe Rice

>> - they all know what B is and how to look for regions of high B 
>> (with, say, pymol) and they know not to make firm conclusions about H-bonds 
>> to flaming red side chains.
>
>But this "knowledge" may be quite wrong.  If the flaming red really indicates 
>large vibrational motion then yes, one would not bet on stable H-bonds.
>But if the flaming red indicates that a well-ordered sidechain was incorrectly
>modeled at full occupancy when in fact it is only present at half-occupancy
>then no, the H-bond could be strong but only present in that half-occupancy
>conformation.  One presumes that the other half-occupancy location (perhaps
>missing from the model) would have its own H-bonding network.
>

I beg to differ.  If a side chain has 2 or more positions, one should be a bit 
careful about making firm conclusions based on only one of those, even if it 
isn't clear exactly why one should use caution.  Also, isn't the isotropic B we 
fit at "medium" resolution more of a "spherical cow" approximation to physical 
reality anyway?

  Phoebe

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Nat Echols

On Thu, Mar 31, 2011 at 7:42 AM, Jacob Keller <
j-kell...@fsm.northwestern.edu> wrote:

> Why not have the "b-factors take care of it" until some magic cutoff
> number? When they reach the cutoff, two things happen:
>
> 1. Occupancies are set to zero for those side chains, to represent our
> lack of ability to model the region,
>
> 2. B-factors are set to exactly 500, as a "flag" allowing casual
> b-factor-savvy users to identify suspicious regions, since they will
> probably not see occupancies, but *will* see b-factors. Therefore, all
> 0-occupancy atoms will automatically have b-factors = 500. I believe
> it is true that if the occupancies are zero, the b-factors are totally
> irrelevant for all calculations?
>
> Doesn't this satisfy both parties?

No, because now you're not only presenting the user with made-up
coordinates, you're giving them a made-up B-factor as well, so there is
effectively no property of those atoms that is based on experimental data
rather than subjective criteria.  Regardless of any  problems inherent in
letting the B-factors take care of all forms of disorder, they are
nonetheless a refined parameter.

-Nat

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Jacob Keller

Why not have the "b-factors take care of it" until some magic cutoff
number? When they reach the cutoff, two things happen:

1. Occupancies are set to zero for those side chains, to represent our
lack of ability to model the region,

2. B-factors are set to exactly 500, as a "flag" allowing casual
b-factor-savvy users to identify suspicious regions, since they will
probably not see occupancies, but *will* see b-factors. Therefore, all
0-occupancy atoms will automatically have b-factors = 500. I believe
it is true that if the occupancies are zero, the b-factors are totally
irrelevant for all calculations?

Doesn't this satisfy both parties?

Jacob
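
The two-step proposal above can be sketched in a few lines. This is only an illustration under stated assumptions: it reads fixed-column PDB ATOM records, it applies the test per atom rather than per side chain, and the cutoff of 150 is a made-up placeholder because no cutoff value is named in this thread. The function name is hypothetical.

```python
# Backbone atoms are left untouched; only side-chain atoms are flagged.
BACKBONE = {"N", "CA", "C", "O", "OXT"}


def flag_side_chain_atoms(pdb_lines, b_cutoff=150.0, flag_b=500.0):
    """Zero the occupancy and set B to flag_b for side-chain atoms with B >= b_cutoff.

    b_cutoff is a hypothetical value; flag_b = 500 is the flag proposed above.
    """
    flagged = []
    for line in pdb_lines:
        if line.startswith("ATOM  ") and len(line) >= 66:
            atom_name = line[12:16].strip()      # PDB columns 13-16
            b_factor = float(line[60:66])        # PDB columns 61-66
            if atom_name not in BACKBONE and b_factor >= b_cutoff:
                # Overwrite occupancy (columns 55-60) and B (columns 61-66).
                line = line[:54] + f"{0.0:6.2f}{flag_b:6.2f}" + line[66:]
        flagged.append(line)
    return flagged
```

Note that after this step neither the occupancy nor the B-factor of a flagged atom is a refined value any more.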





On Thu, Mar 31, 2011 at 9:22 AM,   wrote:
> Dear Quyen,
> I am afraid you won't get any better answers than you got so far. There is
> no holy bible telling you what to do with disordered side chains. I fully
> agree with James that you should try to get the best possible model, which
> best explains your data and that will be your decision. Here are my 2 cents:
>
> -If you see alternative positions, you have to build them.
> -If you do not see alternative positions, I would not replace one fantasy
> (some call it most likely) orientation with 2 or 3 fantasy orientations.
> -I personally belong to the "let the B-factors take care of it" camp, but
> that is my personal opinion. Leaving side chains out could lead to
> misinterpretations by slightly less savvy users of our data, especially when
> charge distributions are being studied. Besides, we know (almost) for sure
> that the side chain is there, it is only disordered and as we just learned,
> even slightly less savvy users know what flaming red side chains mean. Even
> if they may not be mathematically entirely correct, huge B-factors clearly
> indicate that there is disorder involved.
> -I would not let occupancies take up the slack since even very savvy users
> have never heard of them and again, the side chain is fully occupied, only
> disordered. Of course if you build alternate positions, you have to divide
> the occupancies amongst them.
>
> Best,
> Herman
>
> 
> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Quyen
> Hoang
> Sent: Thursday, March 31, 2011 3:55 PM
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] what to do with disordered side chains
>
> We are getting off topic a little bit.
> Original topic: is it better to not build disordered sidechains or build
> them and let B-factors take care of it?
> Ed's poll got almost a 50:50 split.
> Question still unanswered.
> Second topic introduced by Pavel: "Your B-factors are valid within a
> harmonic (small) approximation of atomic vibrations. Larger scale motions
> you are talking about go beyond the harmonic approximation, and using the
> B-factor to model them is abusing the corresponding mathematical model."
> And that these large scale motions (disorders) are better represented by
> "alternative conformations and associated with them occupancies".
> My question is, how many people here do this?
> If you're currently doing what Pavel suggested here, how do you decide where
> to keep the upper limit of B-factors and what the occupancies are for each
> atom (data with resolution of 2.0A or worse)? I mean, do you cap the
> B-factor at a reasonable number to represent natural atomic vibrations
> (which is very small as Pavel pointed out) and then let the occupancies pick
> up the slack? More importantly, what is your reason for doing this?
> Cheers and thanks for your contribution,
> Quyen
>
> On Mar 30, 2011, at 5:20 PM, Pavel Afonine wrote:
>
> Mark,
> alternative conformations and associated with them occupancies are to
> describe the larger scale disorder (the one that goes beyond the B-factor's
> capability to cope with).
> Multi-model PDB files is another option.
> Best,
> Pavel.
>
>
> On Wed, Mar 30, 2011 at 2:15 PM, VAN RAAIJ , MARK JOHAN
>  wrote:
>>
>> yet, apart from (and additionally to) modelling two conformations of the
>> side-chain, the B-factor is the only tool we have (now).
>> Quoting Pavel Afonine:
>>
>> > Hi  Quyen,
>> >
>> >
>> > (...) And if B-factor is an estimate of thermo-motion (or static
>> > disorder),
>> >> then would it not be reasonable to accept that building the side-chain
>> >> and
>> >> let B-factor sky rocket might reflect reality more so than not building
>> >> it?
>> >>
>> >
>> > NO.  Your B-factors are valid within a harmonic (small) approximation of
>> > atomic vibrations. Larger scale motions you are talking about go b

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Herman . Schreuder

Dear Quyen,
I am afraid you won't get any better answers than you got so far. There is no 
holy bible telling you what to do with disordered side chains. I fully agree 
with James that you should try to get the best possible model, which best 
explains your data and that will be your decision. Here are my 2 cents:

-If you see alternative positions, you have to build them. 
-If you do not see alternative positions, I would not replace one fantasy (some 
call it most likely) orientation with 2 or 3 fantasy orientations.
-I personally belong to the "let the B-factors take care of it" camp, but that 
is my personal opinion. Leaving side chains out could lead to 
misinterpretations by slightly less savvy users of our data, especially when 
charge distributions are being studied. Besides, we know (almost) for sure that 
the side chain is there, it is only disordered and as we just learned, even 
slightly less savvy users know what flaming red side chains mean. Even if they 
may not be mathematically entirely correct, huge B-factors clearly indicate 
that there is disorder involved.
-I would not let occupancies take up the slack since even very savvy users have 
never heard of them and again, the side chain is fully occupied, only 
disordered. Of course if you build alternate positions, you have to divide the 
occupancies amongst them.

Best,
Herman
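
The last point, dividing the occupancies amongst alternate positions, lends itself to a quick consistency check. The sketch below assumes fixed-column PDB ATOM records and assumes that "fully occupied" means the alternate-location occupancies of each atom should add up to 1.0; the function names and the tolerance are illustrative, not taken from any program.

```python
from collections import defaultdict


def altloc_occupancy_sums(pdb_lines):
    """Sum occupancies over alternate locations for each (chain, residue, atom)."""
    sums = defaultdict(float)
    for line in pdb_lines:
        if not line.startswith("ATOM  ") or len(line) < 60:
            continue
        if line[16] == " ":                      # column 17: altLoc; blank = single position
            continue
        # Chain ID, residue number plus insertion code, atom name.
        key = (line[21], line[22:27], line[12:16].strip())
        sums[key] += float(line[54:60])          # columns 55-60: occupancy
    return dict(sums)


def unbalanced_altlocs(pdb_lines, tol=0.01):
    """Return atoms whose alternate-position occupancies do not sum to 1.0."""
    return {key: total
            for key, total in altloc_occupancy_sums(pdb_lines).items()
            if abs(total - 1.0) > tol}
```

An empty result means every atom modelled in alternate positions has its occupancy fully shared out among them.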

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of 
Quyen Hoang
Sent: Thursday, March 31, 2011 3:55 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

We are getting off topic a little bit.

Original topic: is it better to not build disordered sidechains or 
build them and let B-factors take care of it?
Ed's poll got almost a 50:50 split.
Question still unanswered.

Second topic introduced by Pavel: "Your B-factors are valid within a 
harmonic (small) approximation of atomic vibrations. Larger scale motions you 
are talking about go beyond the harmonic approximation, and using the B-factor 
to model them is abusing the corresponding mathematical model." 
And that these large scale motions (disorders) are better represented 
by "alternative conformations and associated with them occupancies".

My question is, how many people here do this?
If you're currently doing what Pavel suggested here, how do you decide 
where to keep the upper limit of B-factors and what the occupancies are for 
each atom (data with resolution of 2.0A or worse)? I mean, do you cap the 
B-factor at a reasonable number to represent natural atomic vibrations (which 
is very small as Pavel pointed out) and then let the occupancies pick up the 
slack? More importantly, what is your reason for doing this?

Cheers and thanks for your contribution,
Quyen

On Mar 30, 2011, at 5:20 PM, Pavel Afonine wrote:

Mark, 
alternative conformations and associated with them occupancies 
are to describe the larger scale disorder (the one that goes beyond the 
B-factor's capability to cope with). 
Multi-model PDB files is another option.
Best,
Pavel.

On Wed, Mar 30, 2011 at 2:15 PM, VAN RAAIJ , MARK JOHAN wrote:

yet, apart from (and additionally to) modelling two 
conformations of the side-chain, the B-factor is the only tool we have (now). 

Quoting Pavel Afonine:

> Hi  Quyen,
>
>
> (...) And if B-factor is an estimate of thermo-motion (or static disorder),
>> then would it not be reasonable to accept that building the side-chain and
>> let B-factor sky rocket might reflect reality more so than not building it?
>>
>
> NO.  Your B-factors are valid within a harmonic (small) approximation of
> atomic vibrations. Larger scale motions you are talking about go beyond the
> harmonic approximation, and using the B-factor to model them is abusing the
> corresponding mathematical model.
> http://www.phenix-online.org/newsletter/CCN_2010_07.pdf
>
> Pavel.
>

Mark J van Raaij
Laboratorio M-4

Dpto de Estructura de Macromoléculas

Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Quyen Hoang

We are getting off topic a little bit.

Original topic: is it better to not build disordered sidechains or  
build them and let B-factors take care of it?

Ed's poll got almost a 50:50 split.
Question still unanswered.

Second topic introduced by Pavel: "Your B-factors are valid within a  
harmonic (small) approximation of atomic vibrations. Larger scale  
motions you are talking about go beyond the harmonic approximation,  
and using the B-factor to model them is abusing the corresponding  
mathematical model."
And that these large scale motions (disorders) are better represented  
by "alternative conformations and associated with them occupancies".

My question is, how many people here do this?
If you're currently doing what Pavel suggested here, how do you decide  
where to keep the upper limit of B-factors and what the occupancies  
are for each atom (data with resolution of 2.0A or worse)? I mean, do  
you cap the B-factor at a reasonable number to represent natural  
atomic vibrations (which is very small as Pavel pointed out) and then  
let the occupancies pick up the slack? More importantly, what is your  
reason for doing this?

Cheers and thanks for your contribution,
Quyen

On Mar 30, 2011, at 5:20 PM, Pavel Afonine wrote:

Mark,
alternative conformations and associated with them occupancies are  
to describe the larger scale disorder (the one that goes beyond the  
B-factor's capability to cope with).

Multi-model PDB files is another option.
Best,
Pavel.

On Wed, Mar 30, 2011 at 2:15 PM, VAN RAAIJ , MARK JOHAN wrote:
yet, apart from (and additionally to) modelling two conformations of  
the side-chain, the B-factor is the only tool we have (now).

Quoting Pavel Afonine:

> Hi  Quyen,
>
>
> (...) And if B-factor is an estimate of thermo-motion (or static disorder),
>> then would it not be reasonable to accept that building the side-chain and
>> let B-factor sky rocket might reflect reality more so than not building it?
>>
>
> NO.  Your B-factors are valid within a harmonic (small) approximation of
> atomic vibrations. Larger scale motions you are talking about go beyond the
> harmonic approximation, and using the B-factor to model them is abusing the
> corresponding mathematical model.
> http://www.phenix-online.org/newsletter/CCN_2010_07.pdf
>
> Pavel.
>

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoléculas
Centro Nacional de Biotecnología - CSIC

c/Darwin 3, Campus Cantoblanco
28049 Madrid
tel. 91 585 4616
email: mjvanra...@cnb.csic.es

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Pavel Afonine

Mark,
alternative conformations and associated with them occupancies are to
describe the larger scale disorder (the one that goes beyond the B-factor's
capability to cope with).
Multi-model PDB files is another option.
Best,
Pavel.


On Wed, Mar 30, 2011 at 2:15 PM, VAN RAAIJ , MARK JOHAN <
mjvanra...@cnb.csic.es> wrote:

> yet, apart from (and additionally to) modelling two conformations of the
> side-chain, the B-factor is the only tool we have (now).
>
> Quoting Pavel Afonine:
>
> > Hi  Quyen,
> >
> >
> > (...) And if B-factor is an estimate of thermo-motion (or static disorder),
> >> then would it not be reasonable to accept that building the side-chain and
> >> let B-factor sky rocket might reflect reality more so than not building it?
> >>
> >
> > NO.  Your B-factors are valid within a harmonic (small) approximation of
> > atomic vibrations. Larger scale motions you are talking about go beyond the
> > harmonic approximation, and using the B-factor to model them is abusing the
> > corresponding mathematical model.
> > http://www.phenix-online.org/newsletter/CCN_2010_07.pdf
> >
> > Pavel.
> >
>
> Mark J van Raaij
> Laboratorio M-4
> Dpto de Estructura de Macromoléculas
> Centro Nacional de Biotecnología - CSIC
>
> c/Darwin 3, Campus Cantoblanco
> 28049 Madrid
> tel. 91 585 4616
> email: mjvanra...@cnb.csic.es
>
>

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread VAN RAAIJ , MARK JOHAN

yet, apart from (and additionally to) modelling two conformations of the 
side-chain, the B-factor is the only tool we have (now). 

Quoting Pavel Afonine:

> Hi  Quyen,
>
>
> (...) And if B-factor is an estimate of thermo-motion (or static disorder),
>> then would it not be reasonable to accept that building the side-chain and
>> let B-factor sky rocket might reflect reality more so than not building it?
>>
>
> NO.  Your B-factors are valid within a harmonic (small) approximation of
> atomic vibrations. Larger scale motions you are talking about go beyond the
> harmonic approximation, and using the B-factor to model them is abusing the
> corresponding mathematical model.
> http://www.phenix-online.org/newsletter/CCN_2010_07.pdf
>
> Pavel.
>

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoléculas
Centro Nacional de Biotecnología - CSIC
c/Darwin 3, Campus Cantoblanco
28049 Madrid
tel. 91 585 4616
email: mjvanra...@cnb.csic.es

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Ethan Merritt

On Wednesday, March 30, 2011 11:04:30 am James Holton wrote:
> perhaps a better name for the "disordered side chain problem" would be 
> "dark density"?  This name would place it properly amongst "dark 
> matter", "dark energy" and other fudge factors introduced to try and 
> explain why our "standard model" is not consistent with observation?  

Funny you should mention that.
I have a partial answer to the problem - Stay Tuned!

Ethan

-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Pavel Afonine

Hi  Quyen,


(...) And if B-factor is an estimate of thermo-motion (or static disorder),
> then would it not be reasonable to accept that building the side-chain and
> let B-factor sky rocket might reflect reality more so than not building it?
>

NO.  Your B-factors are valid within a harmonic (small) approximation of
atomic vibrations. Larger scale motions you are talking about go beyond the
harmonic approximation, and using the B-factor to model them is abusing the
corresponding mathematical model.
http://www.phenix-online.org/newsletter/CCN_2010_07.pdf

Pavel.
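
To put numbers on "small", one can convert a B-factor into the displacement it implies. The sketch below uses the standard isotropic relation B = 8*pi^2*<u^2>, where <u^2> is the mean-square displacement along one direction; that formula is brought in here for illustration and is not quoted from this thread, and the sample B values are arbitrary.

```python
import math


def rms_displacement(b_factor):
    """Return sqrt(<u^2>) in Angstrom for an isotropic B-factor in Angstrom^2.

    Uses the isotropic relation B = 8 * pi^2 * <u^2>.
    """
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# Arbitrary example values spanning well-ordered to "flag" territory.
for b in (20.0, 100.0, 500.0):
    print(f"B = {b:5.1f} A^2 -> rms displacement of about {rms_displacement(b):.2f} A")
```

A B-factor of 500 corresponds to an rms displacement of roughly 2.5 A, which is larger than a typical covalent bond length, so such an atom is smeared over a region bigger than its distance to its bonded neighbours.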

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Mark J van Raaij

 in the past, 
>>> and maybe it's time for a powerful dictator at the PDB to create the law...
>>> 
>>> Filip Van Petegem
>>> 
>>> 
>>> 
>>> On Wed, Mar 30, 2011 at 8:37 AM, Mark J van Raaij  
>>> wrote:
>>> perhaps the IUCr and/or PDB (Gerard K?) should issue some guidelines along 
>>> these lines?
>>> And oblige us all to follow them?
>>> Mark J van Raaij
>>> Laboratorio M-4
>>> Dpto de Estructura de Macromoleculas
>>> Centro Nacional de Biotecnologia - CSIC
>>> c/Darwin 3, Campus Cantoblanco
>>> E-28049 Madrid, Spain
>>> tel. (+34) 91 585 4616
>>> http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1
>>> 
>>> 
>>> 
>>> On 30 Mar 2011, at 17:29, Phoebe Rice wrote:
>>> 
>>> > I've now polled 4 fairly savvy "end users" of crystal structures and 
>>> > there seems to be a consensus:
>>> >
>>> > - they all know what B is and how to look for regions of high B (with, 
>>> > say, pymol) and they know not to make firm conclusions about H-bonds to 
>>> > flaming red side chains.
>>> > - None of them would ever think to look at occupancy and they don't know 
>>> > how anyway.
>>> > - they expect that loops with disordered backbones would not be included 
>>> > in the models, and can figure out truncated or fake-ala side chains with 
>>> > some additional effort, but that option makes viewing surfaces and 
>>> > e-stats more of a pain.
>>> >
>>> >  Phoebe
>>> >
>>> > =
>>> > Phoebe A. Rice
>>> > Dept. of Biochemistry & Molecular Biology
>>> > The University of Chicago
>>> > phone 773 834 1723
>>> > http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
>>> > http://www.rsc.org/shop/books/2008/9780854042722.asp
>>> >
>>> >
>>> >  Original message 
>>> >> Date: Tue, 29 Mar 2011 17:43:49 -0400
>>> >> From: CCP4 bulletin board  (on behalf of Ed 
>>> >> Pozharski )
>>> >> Subject: [ccp4bb] what to do with disordered side chains
>>> >> To: CCP4BB@JISCMAIL.AC.UK
>>> >>
>>> >> The results of the online survey on what to do with disordered side
>>> >> chains (from total of 240 responses):
>>> >>
>>> >> Delete the atoms                                       43%
>>> >> Let refinement take care of it by inflating B-factors  41%
>>> >> Set occupancy to zero                                  12%
>>> >> Other                                                   4%
>>> >>
>>> >> "Other" suggestions were:
>>> >>
>>> >> - Place atoms in most likely spot based on rotamer and contacts and
>>> >> indicate high positional sigmas on ATMSIG records
>>> >> - To invent refinement that will spread these residues over many rotamers
>>> >> as this is what actually happened
>>> >> - Delete the atoms but retain the original amino acid name
>>> >> - choose the most common rotamer (B-factors don't "inflate", they just
>>> >> rise slightly)
>>> >> - Depends. if the disordered region is uninteresting, delete atoms.
>>> >> Otherwise, try to model it in one or more disordered model (and then
>>> >> state it clearly in the pdb file)
>>> >> - In case that no density is in the map, model several conformations of
>>> >> the missing segment and insert it into the PDB file with zero
>>> >> occupancies. It is equivalent to what the NMR people do.
>>> >> - Model it in and compare the MD simulations with SAXS
>>> >> - I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
>>> >> - Let the refinement inflate B-factors, then set occupancy to zero in
>>> >> the last round.
>>> >>
>>> >> Thanks to all for participation,
>>> >>
>>> >> Ed.
>>> >>
>>> >> --
>>> >> "I'd jump in myself, if I weren't so good at whistling."
>>> >>  Julian, King of Lemurs
>>> 
>>> 
>>> 
>>> -- 
>>> Filip Van Petegem, PhD
>>> Assistant Professor
>>> The University of British Columbia
>>> Dept. of Biochemistry and Molecular Biology
>>> 2350 Health Sciences Mall - Rm 2.356
>>> Vancouver, V6T 1Z3
>>> 
>>> phone: +1 604 827 4267
>>> email: filip.vanpete...@gmail.com
>>> http://crg.ubc.ca/VanPetegem/
>> 
>

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Dale Tronrud

>> there are no absolute guidelines simply because there isn't any
>>> consensus among crystallographers... (from what we can gather from
>>> this set of emails...). On the other hand, this discussion has flared
>>> up many times in the past, and maybe it's time for a powerful
>>> dictator at the PDB to create the law...
>>>
>>> Filip Van Petegem
>>>
>>>
>>>
>>> On Wed, Mar 30, 2011 at 8:37 AM, Mark J van Raaij
>>> <mjvanra...@cnb.csic.es> wrote:
>>>
>>> perhaps the IUCr and/or PDB (Gerard K?) should issue some
>>> guidelines along these lines?
>>> And oblige us all to follow them?
>>> Mark J van Raaij
>>> Laboratorio M-4
>>> Dpto de Estructura de Macromoleculas
>>> Centro Nacional de Biotecnologia - CSIC
>>> c/Darwin 3, Campus Cantoblanco
>>> E-28049 Madrid, Spain
>>> tel. (+34) 91 585 4616 
>>> 
>>> http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1
>>>
>>>
>>>
>>> On 30 Mar 2011, at 17:29, Phoebe Rice wrote:
>>>
>>> > I've now polled 4 fairly savvy "end users" of crystal structures and
>>> > there seems to be a consensus:
>>> >
>>> > - they all know what B is and how to look for regions of high B (with,
>>> > say, pymol) and they know not to make firm conclusions about H-bonds to
>>> > flaming red side chains.
>>> > - None of them would ever think to look at occupancy and they don't know
>>> > how anyway.
>>> > - they expect that loops with disordered backbones would not be included
>>> > in the models, and can figure out truncated or fake-ala side chains with
>>> > some additional effort, but that option makes viewing surfaces and
>>> > e-stats more of a pain.
>>> >
>>> >  Phoebe
>>> >
>>> > =
>>> > Phoebe A. Rice
>>> > Dept. of Biochemistry & Molecular Biology
>>> > The University of Chicago
>>> > phone 773 834 1723
>>> > http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
>>> > http://www.rsc.org/shop/books/2008/9780854042722.asp
>>> >
>>> >
>>> >  Original message 
>>> >> Date: Tue, 29 Mar 2011 17:43:49 -0400
>>> >> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> (on behalf of Ed
>>> >> Pozharski <epozh...@umaryland.edu>)
>>> >> Subject: [ccp4bb] what to do with disordered side chains
>>> >> To: CCP4BB@JISCMAIL.AC.UK
>>> >>
>>> >> The results of the online survey on what to do with disordered side
>>> >> chains (from total of 240 responses):
>>> >>
>>> >> Delete the atoms                                       43%
>>> >> Let refinement take care of it by inflating B-factors  41%
>>> >> Set occupancy to zero                                  12%
>>> >> Other                                                   4%
>>> >>
>>> >> "Other" suggestions were:
>>> >>
>>> >> - Place atoms in most likely spot based on rotamer and contacts and
>>> >> indicate high positional sigmas on ATMSIG records
>>> >> - To invent refinement that will spread these residues over many rotamers
>>> >> as this is what actually happened
>>> >> - Delete the atoms but retain the original amino acid name
>>> >> - choose the most common rotamer (B-factors don't "inflate", they just
>>> >> rise slightly)
>>> >> - Depends. if the disordered region is uninteresting, delete atoms.
>>> >> Otherwise, try to model it in one or more disordered model (and then
>>> >> state it clearly in the pdb file)
>>> >> - In case that no density is in the map, model several conformations of
>>> >> the missing segment and insert it into the PDB file with zero
>>> >> occupancies. It is equivalent to what the NMR people do.
>>> >> - Model it in and compare the MD simulations with SAXS
>>> >> - I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
>>> >> - Let the refinement inflate B-factors, then set occupancy to zero in
>>> >> the last round.
>>> >> Thanks to all for participation,
>>> >>
>>> >> Ed.
>>> >>
>>> >> --
>>> >> "I'd jump in myself, if I weren't so good at whistling."
>>> >>  Julian, King of Lemurs
>>>
>>>
>>>
>>>
>>> -- 
>>> Filip Van Petegem, PhD
>>> Assistant Professor
>>> The University of British Columbia
>>> Dept. of Biochemistry and Molecular Biology
>>> 2350 Health Sciences Mall - Rm 2.356
>>> Vancouver, V6T 1Z3
>>>
>>> phone: +1 604 827 4267
>>> email: filip.vanpete...@gmail.com
>>> http://crg.ubc.ca/VanPetegem/
>>

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Quyen Hoang

ancy and they  
don't know how anyway.
> - they expect that loops with disordered backbones would not be
> included in the models, and can figure out truncated or fake-ala
> side chains with some additional effort, but that option makes
> viewing surfaces and e-stats more of a pain.

>
>  Phoebe
>
> =
> Phoebe A. Rice
> Dept. of Biochemistry & Molecular Biology
> The University of Chicago
> phone 773 834 1723
> http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> http://www.rsc.org/shop/books/2008/9780854042722.asp
>
>
>  Original message 
>> Date: Tue, 29 Mar 2011 17:43:49 -0400
>> From: CCP4 bulletin board (on behalf of Ed Pozharski)
>> Subject: [ccp4bb] what to do with disordered side chains
>> To: CCP4BB@JISCMAIL.AC.UK
>>
>> The results of the online survey on what to do with disordered side
>> chains (from total of 240 responses):
>>
>> Delete the atoms                                       43%
>> Let refinement take care of it by inflating B-factors  41%
>> Set occupancy to zero                                  12%
>> Other                                                   4%
>>
>> "Other" suggestions were:
>>
>> - Place atoms in most likely spot based on rotamer and contacts and
>> indicate high positional sigmas on ATMSIG records
>> - To invent refinement that will spread these residues over many rotamers
>> as this is what actually happened
>> - Delete the atoms but retain the original amino acid name
>> - choose the most common rotamer (B-factors don't "inflate", they just
>> rise slightly)
>> - Depends. if the disordered region is uninteresting, delete atoms.
>> Otherwise, try to model it in one or more disordered model (and then
>> state it clearly in the pdb file)
>> - In case that no density is in the map, model several conformations of
>> the missing segment and insert it into the PDB file with zero
>> occupancies. It is equivalent to what the NMR people do.
>> - Model it in and compare the MD simulations with SAXS
>> - I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
>> - Let the refinement inflate B-factors, then set occupancy to zero in
>> the last round.
>>
>> Thanks to all for participation,
>>
>> Ed.
>>
>> --
>> "I'd jump in myself, if I weren't so good at whistling."
>>  Julian, King of Lemurs



--
Filip Van Petegem, PhD
Assistant Professor
The University of British Columbia
Dept. of Biochemistry and Molecular Biology
2350 Health Sciences Mall - Rm 2.356
Vancouver, V6T 1Z3

phone: +1 604 827 4267
email: filip.vanpete...@gmail.com
http://crg.ubc.ca/VanPetegem/

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread James Holton

haps the IUCr and/or PDB (Gerard K?) should issue some guidelines along 
these lines?
And oblige us all to follow them?
Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3, Campus Cantoblanco
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1



On 30 Mar 2011, at 17:29, Phoebe Rice wrote:


I've now polled 4 fairly savvy "end users" of crystal structures and there 
seems to be a consensus:

- they all know what B is and how to look for regions of high B (with, say, 
pymol) and they know not to make firm conclusions about H-bonds to flaming red 
side chains.
- None of them would ever think to look at occupancy and they don't know how 
anyway.
- they expect that loops with disordered backbones would not be included in the 
models, and can figure out truncated or fake-ala side chains with some 
additional effort, but that option makes viewing surfaces and e-stats more of a 
pain.

  Phoebe

=
Phoebe A. Rice
Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723
http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
http://www.rsc.org/shop/books/2008/9780854042722.asp


 Original message 

Date: Tue, 29 Mar 2011 17:43:49 -0400
From: CCP4 bulletin board (on behalf of Ed Pozharski)
Subject: [ccp4bb] what to do with disordered side chains
To: CCP4BB@JISCMAIL.AC.UK

The results of the online survey on what to do with disordered side
chains (from total of 240 responses):

Delete the atoms                                       43%
Let refinement take care of it by inflating B-factors  41%
Set occupancy to zero                                  12%
Other                                                   4%

"Other" suggestions were:

- Place atoms in most likely spot based on rotamer and contacts and
indicate high positional sigmas on ATMSIG records
- To invent refinement that will spread these residues over many rotamers
as this is what actually happened
- Delete the atoms but retain the original amino acid name
- choose the most common rotamer (B-factors don't "inflate", they just
rise slightly)
- Depends. if the disordered region is uninteresting, delete atoms.
Otherwise, try to model it in one or more disordered model (and then
state it clearly in the pdb file)
- In case that no density is in the map, model several conformations of
the missing segment and insert it into the PDB file with zero
occupancies. It is equivalent to what the NMR people do.
- Model it in and compare the MD simulations with SAXS
- I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
- Let the refinement inflate B-factors, then set occupancy to zero in
the last round.

Thanks to all for participation,

Ed.

--
"I'd jump in myself, if I weren't so good at whistling."
  Julian, King of Lemurs



--
Filip Van Petegem, PhD
Assistant Professor
The University of British Columbia
Dept. of Biochemistry and Molecular Biology
2350 Health Sciences Mall - Rm 2.356
Vancouver, V6T 1Z3

phone: +1 604 827 4267
email: filip.vanpete...@gmail.com
http://crg.ubc.ca/VanPetegem/

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Frank von Delft

ons
about H-bonds to flaming red side chains.
> - None of them would ever think to look at occupancy and they
> don't know how anyway.
> - they expect that loops with disordered backbones would not be
> included in the models, and can figure out truncated or fake-ala
> side chains with some additional effort, but that option makes
> viewing surfaces and e-stats more of a pain.
>
>  Phoebe
>
> =
> Phoebe A. Rice
> Dept. of Biochemistry & Molecular Biology
> The University of Chicago
> phone 773 834 1723 
> http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> http://www.rsc.org/shop/books/2008/9780854042722.asp
>
>
>  Original message 
>> Date: Tue, 29 Mar 2011 17:43:49 -0400
>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> (on behalf of Ed
>> Pozharski <epozh...@umaryland.edu>)
>> Subject: [ccp4bb] what to do with disordered side chains
>> To: CCP4BB@JISCMAIL.AC.UK
>>
>> The results of the online survey on what to do with disordered side
>> chains (from total of 240 responses):
>>
>> Delete the atoms                                       43%
>> Let refinement take care of it by inflating B-factors  41%
>> Set occupancy to zero                                  12%
>> Other                                                   4%
>>
>> "Other" suggestions were:
>>
>> - Place atoms in most likely spot based on rotamer and contacts and
>> indicate high positional sigmas on ATMSIG records
>> - To invent refinement that will spread these residues over many rotamers
>> as this is what actually happened
>> - Delete the atoms but retain the original amino acid name
>> - choose the most common rotamer (B-factors don't "inflate", they just
>> rise slightly)
>> - Depends. if the disordered region is uninteresting, delete atoms.
>> Otherwise, try to model it in one or more disordered model (and then
>> state it clearly in the pdb file)
>> - In case that no density is in the map, model several conformations of
>> the missing segment and insert it into the PDB file with zero
>> occupancies. It is equivalent to what the NMR people do.
>> - Model it in and compare the MD simulations with SAXS
>> - I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
>> - Let the refinement inflate B-factors, then set occupancy to zero in
>> the last round.
>>
>> Thanks to all for participation,
>>
>> Ed.
>>
>> --
>> "I'd jump in myself, if I weren't so good at whistling."
>>  Julian, King of Lemurs




--
Filip Van Petegem, PhD
Assistant Professor
The University of British Columbia
Dept. of Biochemistry and Molecular Biology
2350 Health Sciences Mall - Rm 2.356
Vancouver, V6T 1Z3

phone: +1 604 827 4267
email: filip.vanpete...@gmail.com
http://crg.ubc.ca/VanPetegem/

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Mark J van Raaij

w...
>> 
>> Filip Van Petegem
>> 
>> 
>> 
>> On Wed, Mar 30, 2011 at 8:37 AM, Mark J van Raaij  
>> wrote:
>> perhaps the IUCr and/or PDB (Gerard K?) should issue some guidelines along 
>> these lines?
>> And oblige us all to follow them?
>> Mark J van Raaij
>> Laboratorio M-4
>> Dpto de Estructura de Macromoleculas
>> Centro Nacional de Biotecnologia - CSIC
>> c/Darwin 3, Campus Cantoblanco
>> E-28049 Madrid, Spain
>> tel. (+34) 91 585 4616
>> http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1
>> 
>> 
>> 
>> On 30 Mar 2011, at 17:29, Phoebe Rice wrote:
>> 
>> > I've now polled 4 fairly savvy "end users" of crystal structures and there 
>> > seems to be a consensus:
>> >
>> > - they all know what B is and how to look for regions of high B (with, 
>> > say, pymol) and they know not to make firm conclusions about H-bonds to 
>> > flaming red side chains.
>> > - None of them would ever think to look at occupancy and they don't know 
>> > how anyway.
>> > - they expect that loops with disordered backbones would not be included 
>> > in the models, and can figure out truncated or fake-ala side chains with 
>> > some additional effort, but that option makes viewing surfaces and e-stats
>> > more of a pain.
>> >
>> >  Phoebe
>> >
>> > =
>> > Phoebe A. Rice
>> > Dept. of Biochemistry & Molecular Biology
>> > The University of Chicago
>> > phone 773 834 1723
>> > http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
>> > http://www.rsc.org/shop/books/2008/9780854042722.asp
>> >
>> >
>> >  Original message 
>> >> Date: Tue, 29 Mar 2011 17:43:49 -0400
>> >> From: CCP4 bulletin board  (on behalf of Ed 
>> >> Pozharski )
>> >> Subject: [ccp4bb] what to do with disordered side chains
>> >> To: CCP4BB@JISCMAIL.AC.UK
>> >>
>> >> The results of the online survey on what to do with disordered side
>> >> chains (from total of 240 responses):
>> >>
>> >> Delete the atoms                                         43%
>> >> Let refinement take care of it by inflating B-factors    41%
>> >> Set occupancy to zero                                    12%
>> >> Other                                                     4%
>> >>
>> >> "Other" suggestions were:
>> >>
>> >> - Place atoms in the most likely spot based on rotamer and contacts and
>> >> indicate high positional sigmas on ATMSIG records
>> >> - Invent a refinement that will spread these residues over many rotamers,
>> >> as this is what actually happened
>> >> - Delete the atoms but retain the original amino acid name
>> >> - Choose the most common rotamer (B-factors don't "inflate", they just
>> >> rise slightly)
>> >> - Depends. If the disordered region is uninteresting, delete atoms.
>> >> Otherwise, try to model it in one or more disordered models (and then
>> >> state it clearly in the PDB file)
>> >> - If there is no density in the map, model several conformations of
>> >> the missing segment and insert them into the PDB file with zero
>> >> occupancies. This is equivalent to what the NMR people do.
>> >> - Model it in and compare the MD simulations with SAXS
>> >> - I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
>> >> - Let the refinement inflate B-factors, then set occupancy to zero in
>> >> the last round.
>> >>
>> >> Thanks to all for participation,
>> >>
>> >> Ed.
>> >>
>> >> --
>> >> "I'd jump in myself, if I weren't so good at whistling."
>> >>  Julian, King of Lemurs
>> 
>> 
>> 
>> -- 
>> Filip Van Petegem, PhD
>> Assistant Professor
>> The University of British Columbia
>> Dept. of Biochemistry and Molecular Biology
>> 2350 Health Sciences Mall - Rm 2.356
>> Vancouver, V6T 1Z3
>> 
>> phone: +1 604 827 4267
>> email: filip.vanpete...@gmail.com
>> http://crg.ubc.ca/VanPetegem/
>

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread James Holton

I'm afraid this is not a problem that can be solved by "standardization".

Fundamentally, if you are a scientist who has collected some data (be it 
diffraction spot intensities, cell counts, or substrate concentration vs 
time), and you have built a "model" to explain that data (be it a 
constellation of atoms in a unit cell, exponential population growth, or 
a microscopic reaction mechanism), I think it is generally expected that 
your model explain the data "to within experimental error".  
Unfortunately, this is never the case in macromolecular crystallography, 
where the model-data disagreement (Fobs-Fcalc) is ~4-5x bigger than the 
"error bars" (sigma(F)).
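
As a rough illustration of the comparison above, the following Python sketch computes the ratio of the mean |Fobs-Fcalc| to the mean sigma(F). The function name and the toy numbers are invented for illustration, and Fcalc is assumed to be already on the same scale as Fobs; this is not taken from any refinement program.

```python
def misfit_ratio(f_obs, f_calc, sig_f):
    """Return mean |Fobs - Fcalc| divided by mean sigma(F).

    All three sequences must have the same, non-zero length.
    Fcalc is assumed to be scaled to Fobs already (an assumption of
    this sketch; real programs apply their own scaling first).
    """
    if not f_obs or not (len(f_obs) == len(f_calc) == len(sig_f)):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_diff = sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / len(f_obs)
    mean_sig = sum(sig_f) / len(sig_f)
    return mean_diff / mean_sig


if __name__ == "__main__":
    # Toy numbers, not real diffraction data.
    f_obs = [100.0, 200.0]
    f_calc = [90.0, 220.0]
    sig_f = [3.0, 3.0]
    print(misfit_ratio(f_obs, f_calc, sig_f))
```

A ratio near 1 would mean the model explains the data to within the error bars; the claim above is that macromolecular models typically sit well above that.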

Now, there is nothing shameful about an incomplete model, especially 
when thousands of very intelligent people working over half a century 
have not been able to come up with a better way to build one.  In fact, 
perhaps a better name for the "disordered side chain problem" would be 
"dark density"?  This name would place it properly amongst "dark 
matter", "dark energy" and other fudge factors introduced to try and 
explain why our "standard model" is not consistent with observation?  
That is, "dark density" is the stuff we can't see, but nonetheless must 
be there somewhere.

Whatever it is, I personally do hold a vain belief that perhaps someday 
soon the problem of "dark density" will be solved, and that presently 
instituting a "policy" requiring that all macromolecular models from 
this day forward remain at least as incomplete as yesterday's models is 
not a very good idea.  I say: if you think there is "something there" 
then you should build it in, especially if it is important to the 
conclusions you are trying to make.  You can defend your model the same 
way you would defend any other scientific model: by using established 
statistics to show that it agrees with the data better than an 
"alternative model" (like leaving it out).  It is YOUR model, after 
all!  Only you are responsible for how "right" it is.

I do appreciate that students and other novices may have a harder time 
defining "surfaces" and measuring hydrogen bond lengths in these pesky 
"floppy regions", but perhaps their education would be served better by 
learning the truth sooner rather than later?

-James Holton
MAD Scientist

On 3/30/2011 9:26 AM, Filip Van Petegem wrote:

Hello Mark,

I absolutely agree with this.  The worst thing is when everybody is 
following their own personal rules, and there are no major guidelines 
for end-users to figure out how to interpret those parts.  I assume 
there are no absolute guidelines simply because there isn't any 
consensus among crystallographers... (from what we can gather from 
this set of emails...). On the other hand, this discussion has flared 
up many times in the past, and maybe it's time for a powerful dictator 
at the PDB to create the law...

Filip Van Petegem

On Wed, Mar 30, 2011 at 8:37 AM, Mark J van Raaij
<mjvanra...@cnb.csic.es> wrote:

perhaps the IUCr and/or PDB (Gerard K?) should issue some
guidelines along these lines?
And oblige us all to follow them?
Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3, Campus Cantoblanco
E-28049 Madrid, Spain
tel. (+34) 91 585 4616 
http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1

On 30 Mar 2011, at 17:29, Phoebe Rice wrote:

> I've now polled 4 fairly savvy "end users" of crystal structures
> and there seems to be a consensus:
>
> - they all know what B is and how to look for regions of high B
> (with, say, pymol) and they know not to make firm conclusions
> about H-bonds to flaming red side chains.
> - None of them would ever think to look at occupancy and they
> don't know how anyway.
> - they expect that loops with disordered backbones would not be
> included in the models, and can figure out truncated or fake-ala
> side chains with some additional effort, but that option makes
> viewing surfaces and e-stats more of a pain.
>
>  Phoebe
>
> =
> Phoebe A. Rice
> Dept. of Biochemistry & Molecular Biology
> The University of Chicago
> phone 773 834 1723 
> http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> http://www.rsc.org/shop/books/2008/9780854042722.asp
>
>
>  Original message 
>> Date: Tue, 29 Mar 2011 17:43:49 -0400
>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> (on behalf of Ed Pozharski
>> <epoz...

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Ethan Merritt

> The University of Chicago
> phone 773 834 1723
> http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> http://www.rsc.org/shop/books/2008/9780854042722.asp
> 
> 
> ---- Original message ----
> >Date: Tue, 29 Mar 2011 17:43:49 -0400
> >From: CCP4 bulletin board  (on behalf of Ed Pozharski 
> >)
> >Subject: [ccp4bb] what to do with disordered side chains  
> >To: CCP4BB@JISCMAIL.AC.UK
> >
> >The results of the online survey on what to do with disordered side
> >chains (from total of 240 responses):
> >
> >Delete the atoms                                         43%
> >Let refinement take care of it by inflating B-factors    41%
> >Set occupancy to zero                                    12%
> >Other                                                     4%
> >
> >"Other" suggestions were:
> >
> >- Place atoms in the most likely spot based on rotamer and contacts and
> >indicate high positional sigmas on ATMSIG records
> >- Invent a refinement that will spread these residues over many rotamers,
> >as this is what actually happened
> >- Delete the atoms but retain the original amino acid name
> >- Choose the most common rotamer (B-factors don't "inflate", they just
> >rise slightly)
> >- Depends. If the disordered region is uninteresting, delete atoms.
> >Otherwise, try to model it in one or more disordered models (and then
> >state it clearly in the PDB file)
> >- If there is no density in the map, model several conformations of
> >the missing segment and insert them into the PDB file with zero
> >occupancies. This is equivalent to what the NMR people do.
> >- Model it in and compare the MD simulations with SAXS
> >- I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
> >- Let the refinement inflate B-factors, then set occupancy to zero in
> >the last round.
> >
> >Thanks to all for participation,
> >
> >Ed.
> >
> 

-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Filip Van Petegem

Hello Mark,

I absolutely agree with this.  The worst thing is when everybody is
following their own personal rules, and there are no major guidelines for
end-users to figure out how to interpret those parts.  I assume there are no
absolute guidelines simply because there isn't any consensus among
crystallographers... (from what we can gather from this set of emails...).
On the other hand, this discussion has flared up many times in the past, and
maybe it's time for a powerful dictator at the PDB to create the law...

Filip Van Petegem



On Wed, Mar 30, 2011 at 8:37 AM, Mark J van Raaij wrote:

> perhaps the IUCr and/or PDB (Gerard K?) should issue some guidelines along
> these lines?
> And oblige us all to follow them?
> Mark J van Raaij
> Laboratorio M-4
> Dpto de Estructura de Macromoleculas
> Centro Nacional de Biotecnologia - CSIC
> c/Darwin 3, Campus Cantoblanco
> E-28049 Madrid, Spain
> tel. (+34) 91 585 4616
>
> http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1
>
>
>
> On 30 Mar 2011, at 17:29, Phoebe Rice wrote:
>
> > I've now polled 4 fairly savvy "end users" of crystal structures and
> > there seems to be a consensus:
> >
> > - they all know what B is and how to look for regions of high B (with,
> > say, pymol) and they know not to make firm conclusions about H-bonds to
> > flaming red side chains.
> > - None of them would ever think to look at occupancy and they don't know
> > how anyway.
> > - they expect that loops with disordered backbones would not be included
> > in the models, and can figure out truncated or fake-ala side chains with
> > some additional effort, but that option makes viewing surfaces and e-stats
> > more of a pain.
> >
> >  Phoebe
> >
> > =
> > Phoebe A. Rice
> > Dept. of Biochemistry & Molecular Biology
> > The University of Chicago
> > phone 773 834 1723
> > http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> > http://www.rsc.org/shop/books/2008/9780854042722.asp
> >
> >
> >  Original message 
> >> Date: Tue, 29 Mar 2011 17:43:49 -0400
> >> From: CCP4 bulletin board (on behalf of Ed Pozharski)
> >> Subject: [ccp4bb] what to do with disordered side chains
> >> To: CCP4BB@JISCMAIL.AC.UK
> >>
> >> The results of the online survey on what to do with disordered side
> >> chains (from total of 240 responses):
> >>
> >> Delete the atoms                                         43%
> >> Let refinement take care of it by inflating B-factors    41%
> >> Set occupancy to zero                                    12%
> >> Other                                                     4%
> >>
> >> "Other" suggestions were:
> >>
> >> - Place atoms in the most likely spot based on rotamer and contacts and
> >> indicate high positional sigmas on ATMSIG records
> >> - Invent a refinement that will spread these residues over many rotamers,
> >> as this is what actually happened
> >> - Delete the atoms but retain the original amino acid name
> >> - Choose the most common rotamer (B-factors don't "inflate", they just
> >> rise slightly)
> >> - Depends. If the disordered region is uninteresting, delete atoms.
> >> Otherwise, try to model it in one or more disordered models (and then
> >> state it clearly in the PDB file)
> >> - If there is no density in the map, model several conformations of
> >> the missing segment and insert them into the PDB file with zero
> >> occupancies. This is equivalent to what the NMR people do.
> >> - Model it in and compare the MD simulations with SAXS
> >> - I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
> >> - Let the refinement inflate B-factors, then set occupancy to zero in
> >> the last round.
> >>
> >> Thanks to all for participation,
> >>
> >> Ed.
> >>
> >> --
> >> "I'd jump in myself, if I weren't so good at whistling."
> >>  Julian, King of Lemurs
>



-- 
Filip Van Petegem, PhD
Assistant Professor
The University of British Columbia
Dept. of Biochemistry and Molecular Biology
2350 Health Sciences Mall - Rm 2.356
Vancouver, V6T 1Z3

phone: +1 604 827 4267
email: filip.vanpete...@gmail.com
http://crg.ubc.ca/VanPetegem/

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Jacob Keller

What about setting the occupancy to 0 *and* setting the B-factors
to some special arbitrary number, say, 500? Then people would easily
pick up on the side chains being dubious, and the refinement would not
be affected by them.
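
A minimal sketch of what that edit could look like on fixed-column PDB ATOM/HETATM records, assuming the standard layout (occupancy in columns 55-60, B-factor in columns 61-66). The function name, the backbone atom set, and the per-residue selection are hypothetical choices for illustration, not part of any existing program.

```python
# Atoms treated as backbone and therefore left untouched (an assumption
# of this sketch; adjust as needed, e.g. to keep CB as well).
BACKBONE = {"N", "CA", "C", "O", "OXT"}


def flag_side_chain(pdb_lines, chain, resseq, occ=0.0, bfac=500.0):
    """Return a copy of pdb_lines in which the side-chain atoms of one
    residue have their occupancy and B-factor fields overwritten."""
    out = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM  ", "HETATM")) and len(line) >= 66
        if is_atom:
            name = line[12:16].strip()       # atom name, columns 13-16
            same_residue = (line[21] == chain          # chain ID, column 22
                            and int(line[22:26]) == resseq)  # columns 23-26
            if same_residue and name not in BACKBONE:
                # Occupancy and B-factor are both 6.2f fields; 500.00 just
                # fits, but any value >= 1000 would overflow the column.
                line = f"{line[:54]}{occ:6.2f}{bfac:6.2f}{line[66:]}"
        out.append(line)
    return out
```

Called as `flag_side_chain(lines, "A", 10)`, this would leave N, CA, C and O of residue 10 in chain A alone and mark every other atom of that residue with occupancy 0.00 and B = 500.00.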

Jacob



On Wed, Mar 30, 2011 at 10:29 AM, Phoebe Rice  wrote:
> I've now polled 4 fairly savvy "end users" of crystal structures and there 
> seems to be a consensus:
>
> - they all know what B is and how to look for regions of high B (with, say, 
> pymol) and they know not to make firm conclusions about H-bonds to flaming 
> red side chains.
> - None of them would ever think to look at occupancy and they don't know how 
> anyway.
> - they expect that loops with disordered backbones would not be included in 
> the models, and can figure out truncated or fake-ala side chains with some 
> additional effort, but that option makes viewing surfaces and e-stats more of
> a pain.
>
>  Phoebe
>
> =
> Phoebe A. Rice
> Dept. of Biochemistry & Molecular Biology
> The University of Chicago
> phone 773 834 1723
> http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> http://www.rsc.org/shop/books/2008/9780854042722.asp
>
>
>  Original message 
>>Date: Tue, 29 Mar 2011 17:43:49 -0400
>>From: CCP4 bulletin board  (on behalf of Ed Pozharski 
>>)
>>Subject: [ccp4bb] what to do with disordered side chains
>>To: CCP4BB@JISCMAIL.AC.UK
>>
>>The results of the online survey on what to do with disordered side
>>chains (from total of 240 responses):
>>
>>Delete the atoms                                         43%
>>Let refinement take care of it by inflating B-factors    41%
>>Set occupancy to zero                                    12%
>>Other                                                     4%
>>
>>"Other" suggestions were:
>>
>>- Place atoms in the most likely spot based on rotamer and contacts and
>>indicate high positional sigmas on ATMSIG records
>>- Invent a refinement that will spread these residues over many rotamers,
>>as this is what actually happened
>>- Delete the atoms but retain the original amino acid name
>>- Choose the most common rotamer (B-factors don't "inflate", they just
>>rise slightly)
>>- Depends. If the disordered region is uninteresting, delete atoms.
>>Otherwise, try to model it in one or more disordered models (and then
>>state it clearly in the PDB file)
>>- If there is no density in the map, model several conformations of
>>the missing segment and insert them into the PDB file with zero
>>occupancies. This is equivalent to what the NMR people do.
>>- Model it in and compare the MD simulations with SAXS
>>- I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
>>- Let the refinement inflate B-factors, then set occupancy to zero in
>>the last round.
>>
>>Thanks to all for participation,
>>
>>Ed.
>>
>>--
>>"I'd jump in myself, if I weren't so good at whistling."
>>                               Julian, King of Lemurs
>



-- 
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Mark J van Raaij

perhaps the IUCr and/or PDB (Gerard K?) should issue some guidelines along 
these lines?
And oblige us all to follow them?
Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3, Campus Cantoblanco
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1



On 30 Mar 2011, at 17:29, Phoebe Rice wrote:

> I've now polled 4 fairly savvy "end users" of crystal structures and there 
> seems to be a consensus:
> 
> - they all know what B is and how to look for regions of high B (with, say, 
> pymol) and they know not to make firm conclusions about H-bonds to flaming 
> red side chains.
> - None of them would ever think to look at occupancy and they don't know how 
> anyway.
> - they expect that loops with disordered backbones would not be included in 
> the models, and can figure out truncated or fake-ala side chains with some 
> additional effort, but that option makes viewing surfaces and e-stats more of
> a pain.
> 
>  Phoebe
> 
> =
> Phoebe A. Rice
> Dept. of Biochemistry & Molecular Biology
> The University of Chicago
> phone 773 834 1723
> http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> http://www.rsc.org/shop/books/2008/9780854042722.asp
> 
> 
>  Original message 
>> Date: Tue, 29 Mar 2011 17:43:49 -0400
>> From: CCP4 bulletin board  (on behalf of Ed Pozharski 
>> )
>> Subject: [ccp4bb] what to do with disordered side chains  
>> To: CCP4BB@JISCMAIL.AC.UK
>> 
>> The results of the online survey on what to do with disordered side
>> chains (from total of 240 responses):
>> 
>> Delete the atoms                                         43%
>> Let refinement take care of it by inflating B-factors    41%
>> Set occupancy to zero                                    12%
>> Other                                                     4%
>>
>> "Other" suggestions were:
>>
>> - Place atoms in the most likely spot based on rotamer and contacts and
>> indicate high positional sigmas on ATMSIG records
>> - Invent a refinement that will spread these residues over many rotamers,
>> as this is what actually happened
>> - Delete the atoms but retain the original amino acid name
>> - Choose the most common rotamer (B-factors don't "inflate", they just
>> rise slightly)
>> - Depends. If the disordered region is uninteresting, delete atoms.
>> Otherwise, try to model it in one or more disordered models (and then
>> state it clearly in the PDB file)
>> - If there is no density in the map, model several conformations of
>> the missing segment and insert them into the PDB file with zero
>> occupancies. This is equivalent to what the NMR people do.
>> - Model it in and compare the MD simulations with SAXS
>> - I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
>> - Let the refinement inflate B-factors, then set occupancy to zero in
>> the last round.
>> 
>> Thanks to all for participation,
>> 
>> Ed.
>> 
>> -- 
>> "I'd jump in myself, if I weren't so good at whistling."
>>  Julian, King of Lemurs

Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Phoebe Rice

I've now polled 4 fairly savvy "end users" of crystal structures and there 
seems to be a consensus:

- they all know what B is and how to look for regions of high B (with, say, 
pymol) and they know not to make firm conclusions about H-bonds to flaming red 
side chains.
- None of them would ever think to look at occupancy and they don't know how 
anyway.
- they expect that loops with disordered backbones would not be included in the 
models, and can figure out truncated or fake-ala side chains with some 
additional effort, but that option makes viewing surfaces and e-stats more of a
pain.
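
For end users who do not know how to look at occupancy, a small script can do the looking. The sketch below lists atoms with zero occupancy or a high B-factor from fixed-column PDB records. The function name and the cutoff of 80 are arbitrary assumptions for illustration, not an agreed threshold.

```python
def dubious_atoms(pdb_lines, b_cutoff=80.0):
    """List atoms whose occupancy is zero or whose B-factor is at or
    above b_cutoff.

    Returns tuples of (chain, residue number, residue name, atom name,
    occupancy, B-factor), read from the standard PDB columns.
    """
    hits = []
    for line in pdb_lines:
        if not line.startswith(("ATOM  ", "HETATM")) or len(line) < 66:
            continue
        occ = float(line[54:60])   # occupancy, columns 55-60
        b = float(line[60:66])     # B-factor, columns 61-66
        if occ == 0.0 or b >= b_cutoff:
            hits.append((line[21], int(line[22:26]), line[17:20],
                         line[12:16].strip(), occ, b))
    return hits


if __name__ == "__main__":
    import sys
    # Usage: python script.py model.pdb
    with open(sys.argv[1]) as handle:
        for hit in dubious_atoms(handle):
            print(*hit)
```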

  Phoebe

=
Phoebe A. Rice
Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723
http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
http://www.rsc.org/shop/books/2008/9780854042722.asp


 Original message 
>Date: Tue, 29 Mar 2011 17:43:49 -0400
>From: CCP4 bulletin board (on behalf of Ed Pozharski)
>Subject: [ccp4bb] what to do with disordered side chains  
>To: CCP4BB@JISCMAIL.AC.UK
>
>The results of the online survey on what to do with disordered side
>chains (from total of 240 responses):
>
>Delete the atoms                                         43%
>Let refinement take care of it by inflating B-factors    41%
>Set occupancy to zero                                    12%
>Other                                                     4%
>
>"Other" suggestions were:
>
>- Place atoms in the most likely spot based on rotamer and contacts and
>indicate high positional sigmas on ATMSIG records
>- Invent a refinement that will spread these residues over many rotamers,
>as this is what actually happened
>- Delete the atoms but retain the original amino acid name
>- Choose the most common rotamer (B-factors don't "inflate", they just
>rise slightly)
>- Depends. If the disordered region is uninteresting, delete atoms.
>Otherwise, try to model it in one or more disordered models (and then
>state it clearly in the PDB file)
>- If there is no density in the map, model several conformations of
>the missing segment and insert them into the PDB file with zero
>occupancies. This is equivalent to what the NMR people do.
>- Model it in and compare the MD simulations with SAXS
>- I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
>- Let the refinement inflate B-factors, then set occupancy to zero in
>the last round.
>
>Thanks to all for participation,
>
>Ed.
>
>-- 
>"I'd jump in myself, if I weren't so good at whistling."
>   Julian, King of Lemurs

[ccp4bb] what to do with disordered side chains

2011-03-29 Thread Ed Pozharski

The results of the online survey on what to do with disordered side
chains (from total of 240 responses):

Delete the atoms                                         43%
Let refinement take care of it by inflating B-factors    41%
Set occupancy to zero                                    12%
Other                                                     4%
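
For a sense of scale, the percentages can be turned back into approximate head counts out of the 240 responses. This is only a sketch: the function name is made up, and because the reported percentages are rounded, each count may be off by a response or so.

```python
def approx_counts(shares_percent, total):
    """Convert rounded percentage shares into approximate response counts."""
    return {option: round(pct * total / 100)
            for option, pct in shares_percent.items()}


survey = {
    "Delete the atoms": 43,
    "Let refinement take care of it by inflating B-factors": 41,
    "Set occupancy to zero": 12,
    "Other": 4,
}

for option, count in approx_counts(survey, 240).items():
    print(f"{count:4d}  {option}")
```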

"Other" suggestions were:

- Place atoms in the most likely spot based on rotamer and contacts and
indicate high positional sigmas on ATMSIG records
- Invent a refinement that will spread these residues over many rotamers,
as this is what actually happened
- Delete the atoms but retain the original amino acid name
- Choose the most common rotamer (B-factors don't "inflate", they just
rise slightly)
- Depends. If the disordered region is uninteresting, delete atoms.
Otherwise, try to model it in one or more disordered models (and then
state it clearly in the PDB file)
- If there is no density in the map, model several conformations of
the missing segment and insert them into the PDB file with zero
occupancies. This is equivalent to what the NMR people do.
- Model it in and compare the MD simulations with SAXS
- I would assume Dale Tronrud's suggestion is the best. Sigatm labels.
- Let the refinement inflate B-factors, then set occupancy to zero in
the last round.

Thanks to all for participation,

Ed.

-- 
"I'd jump in myself, if I weren't so good at whistling."
   Julian, King of Lemurs
