Re: [ccp4bb] Deposition of riding H

2012-05-14 Thread Ed Pozharski

On Sat, 2012-05-12 at 19:28 +0100, Yuri Pompeu wrote:
 Dear community,
 I am probably disturbing a sleeping bear

definitely so

 Reading the thread on hydrogen deposition with the model, I came across 
 several arguments that make sense on their own, but when put together are 
 puzzling and don't seem to converge to an answer.

this is true for most other recurring threads

 -Some argued that depositing riding hydrogens with the model may imply that 
 your data had enough information for you to include hydrogen atoms in the 
 final model.

Unless there is a remark that explicitly states that these are riding
hydrogens, but nobody reads those

 This is definitely a problem, especially when dealing with inexperienced 
 users who may think the model is more accurate than it really is.

they always do and they have no idea how accurate the model actually is

 -There seemed to be consensus that when software uses hydrogen restraints, it 
 can be beneficial geometrically and can also make your model a better 
 description of your x-ray data.

I think nobody disputes that, although the benefit may vary from
structure to structure

 Based on these two main arguments, many would agree that hydrogens should be 
 included throughout refinement but not deposited.

I do agree and I won't deposit them myself, but then what others choose
to deposit is really their choice.

 So this brings me to the last point that was also mentioned in the old thread. If 
 you used riding hydrogens throughout refinement and arrived at a final model 
 that you believe best describes your x-ray data to a certain level of 
 accuracy (Rvalues, geometry, map CC, etc...) would you not be invalidating 
 the whole refinement process by going in and removing the hydrogen atoms 
 right before deposition?

Not really.  You report that you used riding hydrogens and you report
the program you used to generate them.  In theory, anyone can dig up the
appropriate version and reproduce your results.

 So how would one avoid this Catch-22?

I don't think it's strictly a catch-22 situation.  The issue is that
depending on what the structural model is used for, different forms of
the pdb file may become most useful.  The only situation I can imagine
when having riding hydrogens is beneficial is for algorithm development
and perhaps verification of how much differences in riding hydrogen
treatment contribute to differences in things like R-values.  Both are
quite esoteric tasks and you already provide sufficient information
(vide supra).

Cheers,

Ed.

-- 
Hurry up before we all come back to our senses!
   Julian, King of Lemurs

Re: [ccp4bb] Deposition of riding H

2012-05-12 Thread Ethan Merritt

On Saturday, 12 May 2012, Yuri Pompeu wrote:
 If you used riding hydrogens throughout refinement and arrived at a final 
 model that you believe best describes your x-ray data to a certain level of 
 accuracy (Rvalues, geometry, map CC, etc...) would you not be invalidating 
 the whole refinement process by going in and removing the hydrogen atoms 
 right before deposition?

My view:

You are not removing hydrogen atoms at all. You are stating that the model
being deposited includes riding hydrogens.  The consumer of your model can
regenerate the individual hydrogen coordinates from that information if
needed, just as refmac does when you start a new refinement cycle with the
riding hydrogen model selected.  You don't need to output the individual 
hydrogen coordinates between cycles, or at deposition time, because they
are adequately described by the riding hydrogen model.

You might as well ask why do we remove all copies of the molecules in
the crystal except for those in a single asymmetric unit?
They are not really removed; they are implicit in the statement of
the crystallographic symmetry.

Ethan
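
Editorial illustration of the point above, that riding hydrogen coordinates can be regenerated from the heavy atoms alone. This is a minimal sketch, not what refmac or any other program actually does: it places a backbone amide H on N along the bisector pointing away from the preceding C and the CA. The function name `place_amide_h` and the default distance of 0.86 Ang (the X-ray N-H value quoted elsewhere in this thread) are assumptions made for the example.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def place_amide_h(c_prev, n, ca, bond_length=0.86):
    """Regenerate a riding amide H from the three heavy atoms around N.

    The H is placed in the C(i-1)/N/CA plane, on the bisector that points
    away from both neighbours. bond_length is in Angstrom (assumed value).
    Fails with a division by zero if C(i-1), N and CA are collinear.
    """
    u1 = _unit([a - b for a, b in zip(n, c_prev)])  # direction C(i-1) -> N
    u2 = _unit([a - b for a, b in zip(n, ca)])      # direction CA -> N
    direction = _unit([a + b for a, b in zip(u1, u2)])
    return [p + bond_length * d for p, d in zip(n, direction)]


# Hypothetical, roughly planar peptide geometry with N at the origin.
h = place_amide_h([-0.665, 1.152, 0.0], [0.0, 0.0, 0.0], [-0.73, -1.264, 0.0])
```

Because the position is a pure function of the heavy-atom coordinates and one tabulated distance, nothing is lost by leaving it out of the file, provided the riding model and the program are reported.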

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-20 Thread Quyen Hoang


Hi Nicholas,

Thank you for your reply.



snip

it seems that we are trying to deposit one model to satisfy two
different purposes - one for model validation and the other for model
interpretation (use in docking etc), and what's good for one purpose
might not be necessarily good for the other.

/snip

This has been discussed before on this list, but allow me to repeat it:

You would have expected that the crystallographers' aim would be to
deposit the model that maximises the product (likelihood * prior).
Clearly, this is not what we do, mainly because (a) the calculation of
likelihood is only based on a subset of the 'data' that are obtained from
an X-ray diffraction experiment (for example, we ignore diffuse scattering
as Ian pointed-out), (b) we consciously avoid 'prior' because this would
make the models 'subjective', meaning that better informed people would
deposit (for the same data) different models than the less well informed,
(c) the format of the PDB does not offer much room for 'creative
interpretations' of the electron density maps [for example, you can't have
discrete disorder on the backbone (or has this changed ?)]. I sense that
what is being deposited is not the 'best model' in any conceivable way,
but the model that 'best' accounts for the final 2mFo-DFc map within the
limitations of the program used for the final refinement.


I don't quite understand your point. We currently deposit electron  
densities and movies, I don't see how depositing an energy minimized  
structure is so difficult. It doesn't need to be on the same pdb file  
as the model used in refinement nor does it need to be deposited into  
the PDB server, but even if it does, is it not possible to have it as  
a new Chain or new atom type in the current pdb file format?




ps. May I say parenthetically that making the deposited models dependent
on their intended usage, would possibly qualify as 'fraud' ;-)


I don't quite understand this either. When I prepare a protein model  
for simulation, I would remove all alternative conformations, add  
hydrogens, and then minimize the structure. If I make such a minimized  
structure available for others to use with full disclosure, how would  
that constitute fraud? I was going to start offering minimized  
models on our future structures on our lab website, but if that  
constitutes fraud, then I might have to rethink.


I don't know enough to argue with anyone here and that's not the  
intention of my posts - I am just trying to help figure out a way to  
resolve a significant problem that will likely resurface down the  
road. It would be helpful if the more experienced people here can  
start a discussion of 'how to resolve' the problems exposed by this  
thread so far - assuming that you agree that it's a problem worth your  
time.


Cheers,
Quyen

__
Quyen Hoang, Ph.D
Assistant Professor
Department of Biochemistry and Molecular Biology,
Stark Neurosciences Research Institute
Indiana University School of Medicine
635 Barnhill Drive, Room MS0013D
Indianapolis, Indiana 46202-5122

Phone: 317-274-4371
Fax: 317-274-4686
email: qqho...@iupui.edu



--

  Dr Nicholas M. Glykos, Department of Molecular Biology
 and Genetics, Democritus University of Thrace, University Campus,
  Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620,
Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-19 Thread Nicholas M Glykos

Hi Ethan,

  mainly because (a) the calculation of likelihood is only based on a 
  subset of the 'data' that are obtained from an X-ray diffraction 
  experiment (for example, we ignore diffuse scattering as Ian 
  pointed-out),
 
 I do not think that is a valid criticism.  In any field of science 
 one might hypothesize that conducting a different kind of experiment
 and fitting it in accordance with a different theory would produce
 a different model.  But that is only a hypothetical;  it does not
 invalidate the analysis of the experiment you did do based on the
 data you did collect.

For the example I mentioned (diffuse scattering), the experiment would be 
identical. Although using only a subset of the available information may not 
invalidate the analysis performed, it is still not the best that can be 
done with the data in hand.


  (b) we consciously avoid 'prior' because this would make the models 
  'subjective', meaning that better informed people would deposit (for 
  the same data) different models than the less well informed,
 
 I don't know of anyone who consciously avoids using their prior 
 knowledge to inform their current work.  But yes, people with more 
 experience may in the end deposit better models than people with little 
 experience.  That's why it is valuable to have automated tools like 
 Molprobity to check a proposed model against established prior 
 expectations.  It's also one way this bulletin board is valuable, because 
 it allows those with less experience to ask advice from those with more 
 experience.

Most people would like to think that the models they deposit correspond to 
an 'objective' representation of the experimentally accessible physical 
reality. The validation tools, mainly by enforcing a uniformity of 
interpretation, discourage (and not encourage) the incorporation in the 
model of prior knowledge about the problem at hand, and thus, offer to 
their users the safety of an 'objectively validated model'.



  (c) the format of the PDB does not offer much room for 'creative 
  interpretations' of the electron density maps [for example, you can't 
  have discrete disorder on the backbone (or has this changed ?)].
 
 Could you expand on this point?  
 I am not aware of any restriction on multiple backbone conformations,
 now or ever.   It is true that our refinement programs have not always
 been very well suited to refine such a model, but that is not a fault
 of the PDB format.

I stand corrected on that. It was probably just me :-)



  I sense that what is being deposited is not the 'best model' in any 
  conceivable way, but the model that 'best' accounts for the final 
  2mFo-DFc map within the limitations of the program used for the final 
  refinement.
 
 That would be true if the refinement is conducted in real space.
 However, it is nearly universal to do the final refinement in
 reciprocal space.

The emphasis of what I said was clearly on model building, and not on the 
refinement methodology. The reference to the refinement program was again 
model-centric (ranging from the treatment of hydrogens, to the bulk 
solvent model used).


Best regards,
Nicholas



Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-18 Thread Nicholas M Glykos

snip
 it seems that we are trying to deposit one model to satisfy two 
 different purposes - one for model validation and the other for model 
 interpretation (use in docking etc), and what's good for one purpose 
 might not be necessarily good for the other.
/snip

This has been discussed before on this list, but allow me to repeat it: 
You would have expected that the crystallographers' aim would be to 
deposit the model that maximises the product (likelihood * prior). 
Clearly, this is not what we do, mainly because (a) the calculation of 
likelihood is only based on a subset of the 'data' that are obtained from 
an X-ray diffraction experiment (for example, we ignore diffuse scattering 
as Ian pointed-out), (b) we consciously avoid 'prior' because this would 
make the models 'subjective', meaning that better informed people would 
deposit (for the same data) different models than the less well informed, 
(c) the format of the PDB does not offer much room for 'creative 
interpretations' of the electron density maps [for example, you can't have 
discrete disorder on the backbone (or has this changed ?)]. I sense that 
what is being deposited is not the 'best model' in any conceivable way, 
but the model that 'best' accounts for the final 2mFo-DFc map within the 
limitations of the program used for the final refinement.

My two cents,
Nicholas

ps. May I say parenthetically that making the deposited models dependent 
on their intended usage, would possibly qualify as 'fraud' ;-)



Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-18 Thread Ethan Merritt

On Saturday 18 September 2010, Nicholas M Glykos wrote:
 snip
  it seems that we are trying to deposit one model to satisfy two 
  different purposes - one for model validation and the other for model 
  interpretation (use in docking etc), and what's good for one purpose 
  might not be necessarily good for the other.
 /snip
 
 This has been discussed before on this list, but allow me to repeat it: 
 You would have expected that the crystallographers' aim would be to 
 deposit the model that maximises the product (likelihood * prior). 
 Clearly, this is not what we do, 

I guess I have more faith that we do in fact aim for that.
Our data, programs, models, and insight are imperfect,
but we do our best with what we have.

 mainly because (a) the calculation of 
 likelihood is only based on a subset of the 'data' that are obtained from 
 an X-ray diffraction experiment (for example, we ignore diffuse scattering 
 as Ian pointed-out), 

I do not think that is a valid criticism.  In any field of science 
one might hypothesize that conducting a different kind of experiment
and fitting it in accordance with a different theory would produce
a different model.  But that is only a hypothetical;  it does not
invalidate the analysis of the experiment you did do based on the
data you did collect.

 (b) we consciously avoid 'prior' because this would 
 make the models 'subjective', meaning that better informed people would 
 deposit (for the same data) different models than the less well informed, 

I don't know of anyone who consciously avoids using their prior
knowledge to inform their current work.  But yes, people with more
experience may in the end deposit better models than people with 
little experience.  That's why it is valuable to have automated tools
like Molprobity to check a proposed model against established prior
expectations.  It's also one way this bulletin board is valuable, because
it allows those with less experience to ask advice from those with
more experience.

 (c) the format of the PDB does not offer much room for 'creative 
 interpretations' of the electron density maps [for example, you can't have 
 discrete disorder on the backbone (or has this changed ?)]. 

Could you expand on this point?  
I am not aware of any restriction on multiple backbone conformations,
now or ever.   It is true that our refinement programs have not always
been very well suited to refine such a model, but that is not a fault
of the PDB format.

 I sense that 
 what is being deposited is not the 'best model' in any conceivable way, 
 but the model that 'best' accounts for the final 2mFo-DFc map within the 
 limitations of the program used for the final refinement.

That would be true if the refinement is conducted in real space.
However, it is nearly universal to do the final refinement in
reciprocal space.

If a maximum likelihood residual is used, the aim is to achieve the
best model in the generally accepted formal sense of being
the set of model parameter values that provide the most likely explanation
for the observed data.  The priors are imposed as restraints;
the partial residual R_crystallographic(Fo, Fc) encompasses the agreement
with the observed data.
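
Editorial note: the relation between "likelihood * prior" and a restrained refinement target can be written out as a short derivation. The symbols below are assumptions introduced for illustration (p for the model parameters, pi for the prior, d_i for a restrained geometric quantity with ideal value d_i^0 and standard deviation sigma_i), and the Gaussian form of the prior is likewise an assumption, not something stated in the thread.

```latex
\begin{align}
P(\mathbf{p} \mid F_o) &\propto L(F_o \mid \mathbf{p}) \, \pi(\mathbf{p}) \\
-\ln P(\mathbf{p} \mid F_o) &= -\ln L(F_o \mid \mathbf{p}) - \ln \pi(\mathbf{p}) + \text{const} \\
-\ln \pi(\mathbf{p}) &= \sum_i \frac{\bigl(d_i(\mathbf{p}) - d_i^{0}\bigr)^2}{2\sigma_i^2} + \text{const}
\quad \text{(independent Gaussian priors)} \\
T(\mathbf{p}) &= \underbrace{-\ln L(F_o \mid \mathbf{p})}_{\text{X-ray term}}
 + \underbrace{\sum_i \frac{\bigl(d_i(\mathbf{p}) - d_i^{0}\bigr)^2}{2\sigma_i^2}}_{\text{restraint term}}
\end{align}
```

Under these assumptions, minimising T is the same as maximising the product (likelihood * prior): the restraint sum plays the role of the prior and the X-ray term carries the agreement with the observed data.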

 My two cents,
 Nicholas

And mine in return :-) 
Ethan

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-17 Thread Dirk Kostrewa


 Hi Pavel,


Am 16.09.10 17:56, schrieb Pavel Afonine:

 Hi Dirk,

so, wouldn't the deposition of the final model's Fcalc, Phic (and 
their weights) along with the final coordinates be the best solution? 
The final Fcalc are our best model and can be used to reproduce the 
final statistics (which would remove the sfcheck annoyance) and to 
reproduce the final electron density maps, and the coordinates can be 
used for whatever purpose they are needed, irrespective of adding 
riding hydrogens or not.


it is a great idea and if you look in PDB deposited structure factors 
there is a number of them (but certainly not the majority) that are 
accompanied by Fcalc. However, a few things to keep in mind:


- Imagine a (not very uncommon, unfortunately) situation when someone 
obtains the final model and Fcalc, and then, right before the PDB 
deposition, does a final check in Coot and moves/removes a few atoms 
(a few waters, for instance) here and there. Or maybe does a 
real-space fit of a residue. Or removes H, if present. Or renames a 
ligand at the request of PDB staff and accidentally changes an atom 
parameter(s). All this in turn will invalidate the R-factors and make 
the previously calculated Fcalc inconsistent with such a manipulated model.
So, the bottom line is: having a model that you can use to reproduce 
the reported statistics is important (for validation and database 
sanity at least, if someone believes that such minor things wouldn't 
impair the biological interpretation - the ultimate goal of protein 
structures).

But this is exactly what one shouldn't do: manipulate the structure 
after the final refinement! And if you manipulate it for a good reason, 
do a last final refinement after that, before depositing coordinates 
and structure factors. Then there will be no problems, as far as I can see.


Best regards,

Dirk

--

***
Dirk Kostrewa
Gene Center Munich, A5.07
Department of Biochemistry
Ludwig-Maximilians-Universität München
Feodor-Lynen-Str. 25
D-81377 Munich
Germany
Phone:  +49-89-2180-76845
Fax:+49-89-2180-76999
E-mail: kostr...@genzentrum.lmu.de
WWW:www.genzentrum.lmu.de
***
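
Editorial illustration of why an edit made after the final refinement invalidates the reported statistics. The conventional R-factor compares observed amplitudes with amplitudes calculated from the model, so any change to the model changes Fcalc and therefore R. The sketch below is a minimal version under stated assumptions: the function name `r_factor` is hypothetical, and the single overall scale factor stands in for the scaling and bulk-solvent treatment that real refinement programs apply.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - k*|Fc| |) / sum(|Fo|).

    k is a single overall scale (sum|Fo| / sum|Fc|), a simplification
    chosen for this example; real programs fit scale and solvent terms.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    sum_obs = sum(abs(fo) for fo in f_obs)
    sum_calc = sum(abs(fc) for fc in f_calc)
    if sum_obs == 0 or sum_calc == 0:
        raise ValueError("amplitudes must not all be zero")
    k = sum_obs / sum_calc
    diff = sum(abs(abs(fo) - k * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return diff / sum_obs


# Made-up amplitudes: Fcalc from the refined model, then from an edited one.
f_obs = [100.0, 200.0, 300.0]
r_refined = r_factor(f_obs, [100.0, 200.0, 300.0])
r_edited = r_factor(f_obs, [110.0, 190.0, 300.0])
```

With these made-up numbers the edited model no longer reproduces the R-factor of the refined one, which is the inconsistency described above when Fcalc or statistics from before the edit are deposited alongside coordinates from after it.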

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-17 Thread Pavel Afonine


 Dirk,

- Imagine a (not very uncommon, unfortunately) situation when someone 
obtains the final model and Fcalc, and then, right before the PDB 
deposition, does a final check in Coot and moves/removes a few atoms 
(a few waters, for instance) here and there. Or maybe does a 
real-space fit of a residue. Or removes H, if present. Or renames a 
ligand at the request of PDB staff and accidentally changes an atom 
parameter(s). All this in turn will invalidate the R-factors and make 
the previously calculated Fcalc inconsistent with such a manipulated model.
So, the bottom line is: having a model that you can use to reproduce 
the reported statistics is important (for validation and database 
sanity at least, if someone believes that such minor things 
wouldn't impair the biological interpretation - the ultimate goal of 
protein structures).

But this is exactly what one shouldn't do: manipulate the structure 
after the final refinement! And if you manipulate it for a good 
reason, do a last final refinement after that, before depositing 
coordinates and structure factors. Then there will be no problems, as 
far as I can see.


I apologize if what I wrote doesn't read clearly - this is exactly what 
I'm saying, in this particular reply and across the whole discussion. 
Note, I used the word "unfortunately" above. Anyway, saying it again: 
what I mentioned is based on my (and not only my - see relevant papers) 
observations from running validation tools through the whole PDB and making 
note of such manipulated structures. It is a matter of fact that there 
are some intentionally or unintentionally manipulated models; it is very 
bad, it is unfortunate, and obviously I'm strictly against it. I'm 
against it to such a degree that I even bothered to write a paper on 
this matter, which I mentioned on this thread already:


J. Appl. Cryst. 2010, 43, 669-67.

Therefore it is important to have a model that you can use to reproduce 
the reported statistics (for validation, at least), although having 
Fcalc around wouldn't hurt.


Sorry again, if I wasn't clear in my previous reply.

All the best!
Pavel.

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-17 Thread Quyen Hoang

As a relatively inexperienced scientist, I find this discussion  
fascinating.
I wonder if NMR and EM people are also worried about depositing enough  
modeled info to allow back calculation of data.


Regarding the original discussion of whether to deposit riding  
hydrogens used in the refinement, it seems that we are trying to  
deposit one model to satisfy two different purposes - one for model  
validation and the other for model interpretation (use in docking  
etc), and what's good for one purpose might not be necessarily good  
for the other.
I wonder if it would help to deposit two different models; one  
precisely reflects the model used in refinement and the other an  
energy minimized model with predicted hydrogens and alternative  
conformations removed?


Cheers,
Quyen



On Sep 17, 2010, at 8:28 AM, Ian Tickle wrote:


Oh, goodness, I see: even here, we would need clear rules what the
calculated structure factors are, which weights were used, which bulk
solvent correction was applied ... a maze, too!


Fortunately the X-ray and restraint weights/target values are not an
issue here: varying them changes the refined model parameters of
course, but they do not appear in the structure factor formula, so
don't need to be specified in the mathematical model to obtain the
Fcalcs.  You would of course need to know all the weights and target
values (as well as the SF formula) to reproduce the refinement to get
the deposited model.

But could future programs really re-calculate the same structure factors
from the deposited model? Because of the expected development of more
advanced methods and algorithms, I have my doubts ... *sigh*


Yes, if the deposited mathematical model is completely specified in
terms of the SF formula used and the values of *all* the parameters
that go into it, then in principle future versions of software using
more advanced models will be able to reproduce the exact Fcalcs.  This
assumes that the advanced models will use the same 'core' formula but
with additional terms and adjustable parameters, so that the simple
model can be obtained from the advanced one by constraining the extra
parameters to fixed values.  However if the simple model is not
'nested' inside the more advanced model in this way, then no it will
not be possible to reproduce the Fcalcs.

However as I implied, the main issue is that we're rather lax at fully
specifying our models (both formulae and parameters): obviously if in
future you don't have all the information you need to reproduce the
calculation then you have no hope of getting the same Fcalcs!

Cheers

-- Ian

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-17 Thread Kendall Nettles

Very interesting discussion. I wonder if the inexperienced user of PDB really 
exists? I don't know anyone off-hand who would really make use of information 
from hydrogen positions but not understand the issues. Although I hear they 
have been sighted in the Everglades  http://en.wikipedia.org/wiki/Skunk_ape

Kendall

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Dirk Kostrewa


 Dear Ian and contributors to this interesting thread,

(please, scroll down a little bit)

Am 15.09.10 23:34, schrieb Ian Tickle:

I should just like to point out that the main source of the
disagreement here seems to be that people have very different ideas
about what a 'model' is or should be.  Strictly a model is a purely
mathematical construct, in this case it consists of the appropriate
equation for the calculated structure factor and the best-fit values
of the various parameters (scattering factors, atomic positions,
occupancies, B factors, TLS parameters etc.) that appear in it. A
mathematical model is inevitably going to be an imperfect
representation of reality, but hopefully it's the best one we can come
up with, in the sense of best explaining the data without significant
overfitting.

The problem arises because many users of the PDB, and I suspect many
contributors to this BB, particularly non-crystallographers, don't see
it like that, because they view a PDB file as a physical model, i.e.
not as the best fit to the data (assuming that the
non-crystallographers even know what the data are!), but the closest
representation of reality.  The difference between the N-H bond
lengths that Ed referred to illustrates the distinction between the
mathematical and the physical model.  The mathematical model requires
that the bond length is 0.86 Ang because that value gives the best fit
of the assumed spherical scattering factor of H to the deformation
density of the X-H covalent bond.  The physical model requires that it
be 1.00 Ang because that is the internuclear distance found by
spectroscopic methods and predicted by QM calculations.  The same goes
for B factors and TLS: to a large extent they are a mathematical
construct whose purpose is to provide an optimal fit to the data.  The
connection of Bs and TLS with reality is tenuous at best, nevertheless
people obviously would like to have a physical interpretation such as
rigid-body correlated motion.  The fact that Bragg scattering provides
no information about correlated motion (you need to measure the
diffuse scattering for that) doesn't seem to deter them!

I have no doubt in my mind that it is the mathematical model that
should be published, because hopefully it's the best available
interpretation of the data.  Whether that involves publishing the
riding H atoms explicitly, or alternatively the formulae and
parameters that were used to calculate their positions I don't mind,
as long as I can faithfully reproduce the Fcalcs to check the validity
of the model.  Then users of the PDB are free to *interpret* the
mathematical models as physical models in an appropriate manner (e.g.
by adjusting the bond lengths to H), and crystallographers have the
untainted mathematical models needed to reproduce the Fcalcs.
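
Editorial illustration of the "interpretation" step described above, in which a user converts the mathematical model into a physical one by adjusting the bond lengths to H. This is a minimal sketch under stated assumptions: the function name `rescale_xh_bond` is hypothetical, and the two distances (0.86 Ang for the X-ray model, 1.00 Ang for the internuclear N-H distance) are the values quoted in the message above. The hydrogen is simply moved along the existing X-H direction.

```python
import math


def rescale_xh_bond(heavy, hydrogen, target_length=1.00):
    """Move H along the existing X-H direction so that |X-H| = target_length.

    heavy and hydrogen are (x, y, z) coordinates in Angstrom. The heavy
    atom is not moved, so the deposited heavy-atom model stays untouched.
    """
    vec = [h - x for x, h in zip(heavy, hydrogen)]
    current = math.sqrt(sum(c * c for c in vec))
    if current == 0.0:
        raise ValueError("heavy atom and hydrogen coincide")
    scale = target_length / current
    return [x + scale * c for x, c in zip(heavy, vec)]


# An N-H pair at the X-ray distance of 0.86 Ang, stretched to 1.00 Ang.
h_physical = rescale_xh_bond([0.0, 0.0, 0.0], [0.86, 0.0, 0.0])
```

The operation only needs the deposited X-H direction and a tabulated target distance, so it can be applied by the consumer of the model without touching the coordinates used to reproduce the Fcalcs.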


so, wouldn't the deposition of the final model's Fcalc, Phic (and 
their weights) along with the final coordinates be the best solution? 
The final Fcalc are our best model and can be used to reproduce the 
final statistics (which would remove the sfcheck annoyance) and to 
reproduce the final electron density maps, and the coordinates can be 
used for whatever purpose they are needed, irrespective of adding 
riding hydrogens or not.


Best regards,

Dirk.


Re: [ccp4bb] Deposition of riding H

2010-09-16 Thread Sanishvili, Ruslan

Hi Pavel,
 
Note that in the ultra-high resolution structure of aldose reductase 
http://www.ncbi.nlm.nih.gov/pubmed/15146478
we didn't see all (or most) hydrogens. So, the converse question one could ask 
is why we didn't see all of them? Was it only because of higher B-factors  or 
because some of them were stripped during data collection?
 
Note that in my original message I said they are, in most cases, still 
assumed. Ultra-high resolution structures are exactly what I meant by the few 
cases when some of the positions are not assumed, so thanks for pointing that 
out.
 
It's not all or nothing - some hydrogens will be stripped and some won't. But 
since we don't know which ones are gone, depositing the coordinates of all of 
them may be misleading. It can be particularly dangerous for structure-based 
functional interpretations because several publications suggest that active 
sites are among the first to suffer from radiation damage. And aren't the 
functional interpretations the ultimate goal of protein structures?
 
Cheers,
N.



From: Pavel Afonine [mailto:pafon...@lbl.gov]
Sent: Wed 9/15/2010 5:56 PM
To: Sanishvili, Ruslan
Cc: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Deposition of riding H



  Hi Nukri,

thanks for the paper (I haven't read the paper yet), I definitely missed
this one!

Interesting though, if we assume that they get stripped off during data
collection, how you could see so many hydrogen atoms in Fo-Fc residual
maps for Aldose Reductase structure at 0.66A?

 B. Guillot, C. Jelsch, N. Muzet, C. Lecomte, E. Howard, B.
Chevrier, A. Mitschler, A. Podjarny, A. Cousson, R. Sanishvili & A.
Joachimiak (2000). Multipolar refinement of aldose reductase at
subatomic resolution. Acta Cryst. A56, s199.

 E. I. Howard, R. Cachau, A. Mitschler, P. Barth, B. Chevrier, V.
Lamour, A. Joachimiak, R. Sanishvili, M. Van Zandt, D. Moras & A.
Podjarny (2000). Crystallization of Aldose Reductase leading to Single
Wavelength (0.66 Å) and MAD (0.9 Å) subatomic resolution studies. Acta
Cryst. A56, s57.

 A. D. Podjarny, A. Mitschler, I. Hazemann, T. Petrova, F. Ruiz, E.
Howard, C. Darmanin, R. Chung, T. R. Schneider, R. Sanishvili, C.
Schulze-Briesse, T. Tomizaki, M. Van Zandt, M. Oka, A. Joachimiak & O.
El-Kabbani (2005). Inhibitor binding to aldose reductase studied at
subatomic resolution. Acta Cryst. A61, c122.

Pavel.


On 9/15/10 3:34 PM, Sanishvili, Ruslan wrote:
 Hi All,

 I have not read all messages in the trace so my apologies if somebody
 already pointed out what I have to say.

 There is a lot of talk about how this or that software treats the riding
 hydrogens. What to do with the fact that however these hydrogens are
 treated in calculations, they are, in most cases, still assumed? Meents
 et al http://scripts.iucr.org/cgi-bin/paper?xh0004 showed that proteins
 are stripped of hydrogens during X-ray data collection. So, IMHO it is a
 good argument against depositing the H coordinates in PDB.
 Cheers,
 N.


 Ruslan Sanishvili (Nukri), Ph.D.

 GM/CA-CAT
 Biosciences Division, ANL
 9700 S. Cass Ave.
 Argonne, IL 60439

 Tel: (630)252-0665
 Fax: (630)252-0667
 rsanishv...@anl.gov

 -Original Message-
 From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of
 George M. Sheldrick
 Sent: Wednesday, September 15, 2010 5:14 PM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Deposition of riding H

 Pavel: In my original email I very carefully gave credit for
 IMPLEMENTING the TLS concept. Of course the ideas and some
 programs had been around long before, but it was the
 IMPLEMENTATION IN REFMAC that resulted in TLS becoming
 widely used. I had actually considered putting it into SHELXL
 but had not done so for two reasons (a) I was too lazy and
 (b) I missed an essential trick that REFMAC introduced, namely
 the combination of TLS with an additive isotropic B-value for
 each atom.

 Dale: You are quite correct that AFIX 137 breaks my argument
 about not depositing (SHELX) hydrogen atoms because they can
 be recalculated with no loss of experimental information.
 However to be fair, if you generate the first .ins file using
 SHELXPRO (the recommended procedure) you will get AFIX 33
 that doesn't have this problem. For Pavel and others unfamiliar
 with SHELXL, AFIX 33 is a pure riding model with a staggered
 methyl group but AFIX 137 assumes local threefold symmetry,
 finds the initial torsion angle by a three-fold averaged fit
 to the difference density and then refines the torsion angle
 in the following cycles. Since this torsion angle is not
 given explicitly in the output files, if AFIX 137 hydrogens
 are not deposited, they cannot be regenerated except by a full
 refinement against the experimental data.
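
The difference between the two treatments can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SHELXL's actual code: the function names, the 0.97 A bond length and the 109.5 degree angle are placeholders chosen for the example. The point is that a staggered methyl group is fully determined by the heavy atoms, whereas a refined methyl group needs one extra number (the torsion angle) that has to be stored somewhere.

```python
import math

def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """Place atom d so that |d-c| = bond, angle(b,c,d) = angle_deg and
    torsion(a,b,c,d) = torsion_deg. Assumes a, b, c are not collinear."""
    def sub(u, v):
        return [u[i] - v[i] for i in range(3)]
    def cross(u, v):
        return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]
    def unit(u):
        n = math.sqrt(sum(x * x for x in u))
        return [x / n for x in u]
    theta = math.radians(angle_deg)
    phi = math.radians(torsion_deg)
    bc = unit(sub(c, b))                 # direction of the b->c bond
    n = unit(cross(sub(b, a), bc))       # normal of the a,b,c plane
    m = cross(n, bc)                     # completes the local frame
    local = [-bond * math.cos(theta),
             bond * math.sin(theta) * math.cos(phi),
             bond * math.sin(theta) * math.sin(phi)]
    return [c[i] + local[0]*bc[i] + local[1]*m[i] + local[2]*n[i] for i in range(3)]

def methyl_hydrogens(y, x, c, torsion_deg=60.0, bond=0.97, angle_deg=109.5):
    """Three methyl H atoms on carbon c, which is bonded to x; y is a
    neighbour of x used as the torsion reference. The default torsion of
    60 degrees gives a staggered group; any other value mimics a group
    whose torsion was refined against the data (illustrative values only)."""
    return [place_atom(y, x, c, bond, angle_deg, torsion_deg + k * 120.0)
            for k in range(3)]
```

With the default torsion the hydrogens can be regenerated from the deposited heavy atoms alone; with a refined torsion the same call only reproduces them if that angle is known.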

 George

 Prof. George M. Sheldrick FRS
 Dept. Structural Chemistry,
 University of Goettingen,
 Tammannstr. 4,
 D37077 Goettingen, Germany
 Tel. +49-551-39-3021 or -3068
 Fax. +49-551-39-22582


 On Wed, 15 Sep 2010

Re: [ccp4bb] Deposition of riding H- Are they or are they not? Additional experiments are needed

2010-09-16 Thread Felix Frolow

Well, maybe they (the hydrogens) are there, maybe they are not (it also depends 
on location). They, or something else, also boil sometimes.
I also understand from some other publications, such as 
doi:10.1107/S090904509002192 (cyclosporine), that hydrogen abstraction is 
irreversible. Is this supported by post-mortem mass spectrometry in the case of 
cyclosporine and aldose reductase?
Whatever is left of the irradiated crystals, molecules with or without 
hydrogens, can be checked in a mass spectrometer.
BTW, for part of my early life I practiced small-molecule X-ray crystallography, 
which is by definition ultra-high resolution. When, in critical cases, we wished 
to know where the hydrogens were and whether they were there at all, we 
exchanged them for deuterium in large crystals and performed neutron diffraction.
One major advantage of neutron diffraction over X-ray diffraction is that the 
latter is rather insensitive to the presence of hydrogen (H) in a structure, 
whereas the nuclei 1H and 2H (i.e. deuterium, D) are strong scatterers of 
neutrons. This means that the position of deuterium in a crystal structure and 
its thermal motion can be determined far more precisely with neutrons.

Dr Felix Frolow   
Professor of Structural Biology and Biotechnology
Department of Molecular Microbiology
and Biotechnology
Tel Aviv University 69978, Israel

Acta Crystallographica F, co-editor

e-mail: mbfro...@post.tau.ac.il
Tel:  ++972-3640-8723
Fax: ++972-3640-9407
Cellular: 0547 459 608

On Sep 16, 2010, at 15:45 , Sanishvili, Ruslan wrote:

 Hi Pavel,
 
 Note that in the ultra-high resolution structure of aldose reductase 
 http://www.ncbi.nlm.nih.gov/pubmed/15146478
 we didn't see all (or most) hydrogens. So, the converse question one could 
 ask is why we didn't see all of them? Was it only because of higher B-factors 
  or because some of them were stripped during data collection?
 
 Note that in my original message I said they are, in most cases, still 
 assumed. Ultra-high resolution structures are exactly what I meant under few 
 cases when some of the positions are not assumed, so thanks for pointing that 
 out.
 
 It's not all or nothing - some hydrogens will be stripped and some won't. But 
 since we don't know which ones are gone, depositing the coordinates of all of 
 them may be misleading. It can be particularly dangerous for structure-based 
 functional interpretations because several publications suggest that active 
 sites are one of the first ones to suffer from radiation damage. And aren't 
 the functional interpretations the ultimate goal of protein structures?
 
 Cheers,
 N.
 
 
 
 From: Pavel Afonine [mailto:pafon...@lbl.gov]
 Sent: Wed 9/15/2010 5:56 PM
 To: Sanishvili, Ruslan
 Cc: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Deposition of riding H
 
 
 
  Hi Nukri,
 
 thanks for the paper (I haven't read the paper yet), I definitely missed
 this one!
 
 Interesting though, if we assume that they get stripped off during data
 collection, how you could see so many hydrogen atoms in Fo-Fc residual
 maps for Aldose Reductase structure at 0.66A?
 
 B. Guillot, C. Jelsch, N. Muzet, C. Lecomte, E. Howard, B.
 Chevrier, A. Mitschler, A. Podjarny, A. Cousson, R. Sanishvili & A.
 Joachimiak (2000). Multipolar refinement of aldose reductase at
 subatomic resolution. Acta Cryst. A56, s199.
 
 E. I. Howard, R. Cachau, A. Mitschler, P. Barth, B. Chevrier, V.
 Lamour, A. Joachimiak, R. Sanishvili, M. Van Zandt, D. Moras & A.
 Podjarny (2000). Crystallization of Aldose Reductase leading to Single
 Wavelength (0.66 Å) and MAD (0.9 Å) subatomic resolution studies. Acta
 Cryst. A56, s57.
 
 A. D. Podjarny, A. Mitschler, I. Hazemann, T. Petrova, F. Ruiz, E.
 Howard, C. Darmanin, R. Chung, T. R. Schneider, R. Sanishvili, C.
 Schulze-Briesse, T. Tomizaki, M. Van Zandt, M. Oka, A. Joachimiak & O.
 El-Kabbani (2005). Inhibitor binding to aldose reductase studied at
 subatomic resolution. Acta Cryst. A61, c122.
 
 Pavel.
 
 

Re: [ccp4bb] Deposition of riding H

2010-09-16 Thread Pavel Afonine


 Hi Nukri,


Note that in the ultra-high resolution structure of aldose reductase 
http://www.ncbi.nlm.nih.gov/pubmed/15146478
we didn't see all (or most) hydrogens. So, the converse question one could ask 
is why we didn't see all of them? Was it only because of higher B-factors  or 
because some of them were stripped during data collection?


yes, we saw ~54% of them - I used to work on this at some point too ( 
Blakeley MP, Ruiz F, Cachau R, Hazemann I, Meilleur F, Mitschler A, 
Ginell S, Afonine P, Ventura ON, Cousido-Siah A, et al. Quantum model of 
catalysis based on a mobile proton revealed by subatomic x-ray and 
neutron diffraction studies of h-aldose reductase. Proc Natl Acad Sci U 
S A.2008;105(6):1844--1848.)


My impression at that point was that we did not see the rest partly 
because the model was not good enough (in terms of seeing fine 
details). What I mean is that improving the model from R-factor ~10% to ~9% 
resulted in ~10% more visible H atoms. When I then refined the 
model down to ~7% using the Interatomic Scatterers model (to account for 
deformation density), the number of observable H atoms increased from the 
published 54% to ~68% or so (writing from memory). So, 
hypothetically, I guess, if we could refine it down to some lower 
R-factor we would then see even more H atoms (and the rest, if we 
finally don't see them, would probably be those that are gone). 
Resolution and B-factors are necessary but not sufficient to see H atoms; 
the overall noise level is key too.


All the best!
Pavel.

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Ethan Merritt

On Thursday 16 September 2010 01:25:12 am Dirk Kostrewa wrote:
 
 so, wouldn't the deposition of the final model's Fcalc, Phic (and 
 their weights) along with the final coordinates be the best solution? 
 The final Fcalc are our best model and can be used to reproduce the 
 final statistics (which would remove the sfcheck annoyance) and to 
 reproduce the final electron density maps, and the coordinates can be 
 used for what ever purpose they are needed, irrespective of adding 
 riding hydrogens or not.

Now I'm confused.  Isn't that already the recommended, if not required,
practice?

Ethan

-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Pavel Afonine


 Hi Dirk,

so, wouldn't the deposition of the final model's Fcalc, Phic (and 
their weights) along with the final coordinates be the best solution? 
The final Fcalc are our best model and can be used to reproduce the 
final statistics (which would remove the sfcheck annoyance) and to 
reproduce the final electron density maps, and the coordinates can be 
used for what ever purpose they are needed, irrespective of adding 
riding hydrogens or not.


It is a great idea, and if you look at the structure factors deposited in 
the PDB, a number of them (though certainly not the majority) are 
accompanied by Fcalc. However, there are a few things to keep in mind:


- Imagine a (not very uncommon, unfortunately) situation in which someone 
obtains the final model and Fcalc and then, right before the PDB 
deposition, does a final check in Coot and moves or removes a few atoms (a 
few waters, for instance) here and there. Or maybe does a real-space fit 
of a residue. Or removes H, if present. Or renames a ligand at the request 
of PDB staff and accidentally changes an atom parameter. All of this in 
turn invalidates the R-factors and makes the previously calculated Fcalc 
inconsistent with the manipulated model.
So, the bottom line is: having a model that you can use to reproduce the 
reported statistics is important (for validation and database sanity at 
least, even if someone believes that such minor things wouldn't impair the 
biological interpretation, the ultimate goal of protein structures).


- To reproduce the most commonly used electron density maps, such as 
2mFo-DFc and mFo-DFc, you would also need to deposit the coefficients m and 
D or, alternatively, have a program and the free-R flags handy to compute m 
and D yourself.


- If Fcalc were required, you would have to make sure that it is actually the 
total structure factor, Fmodel = scales*(Fcalc_atoms + F_bulk_solvent), 
with all other appropriate scales included. This is easy to check, though, 
by computing the R-factor and comparing it with the reported number.
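
The check in the last point can be sketched as follows. This is a minimal illustration, not what any refinement program actually does: the function names are invented for the example, a single overall least-squares scale stands in for the resolution-dependent and anisotropic scales real programs use, and structure factors are plain Python complex numbers.

```python
def total_model(f_calc_atoms, f_bulk_solvent, scale=1.0):
    """Fmodel = scale * (Fcalc_atoms + F_bulk_solvent), per reflection (complex)."""
    return [scale * (fa + fb) for fa, fb in zip(f_calc_atoms, f_bulk_solvent)]

def r_factor(f_obs, f_model):
    """R = sum| Fobs - k*|Fmodel| | / sum Fobs, with one overall scale k
    obtained by least squares (a simplification made for this sketch)."""
    amps = [abs(f) for f in f_model]
    k = sum(o * a for o, a in zip(f_obs, amps)) / sum(a * a for a in amps)
    return sum(abs(o - k * a) for o, a in zip(f_obs, amps)) / sum(f_obs)

# Toy numbers, made up for the illustration.
f_atoms = [5 + 0j, 10 + 0j, 15 + 0j]
f_bulk = [1 + 0j, -2 + 0j, 0j]
f_obs = [6.0, 8.0, 15.0]

r_total = r_factor(f_obs, total_model(f_atoms, f_bulk))   # full Fmodel
r_atoms_only = r_factor(f_obs, f_atoms)                   # bulk solvent left out
```

If the deposited column reproduces the reported R-factor it is presumably the total Fmodel; if the number comes out noticeably higher, something (bulk solvent, scales, or a late edit to the model) is missing.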


All the best!
Pavel.

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Dr. Mark Mayer


Ethan wrote


I believe that deposition of Fc Phic FOM should be required.
Certainly it should be the recommended practice.



For the same series of structures I just deposited, which started the 
riding H discussion, my mtz file had Fc Phic FOM + other data put 
out by Phenix - Pavel can elaborate. RCSB stripped almost all of this 
and the processed file has only:


HKL, Flag,  Fc, SigmaF and FOC :{

What's a structural biologist to do?


--

Mark

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Eric Larson


Hi Mark,

I assume you deposited the mtz?  This is what Ethan was referring to - the PDB 
does not do well at retaining all the relevant columns when an mtz file is 
submitted.  However, if you convert your mtz to cif yourself, make sure it 
has all the columns you would like to include, and then submit this cif file to 
the PDB, all the information is retained.

Eric  
__

Eric Larson, PhD
Biomolecular Structure Center
Department of Biochemistry
Box 357742
University of Washington
Seattle, WA 98195


Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Ethan Merritt

On Thursday 16 September 2010 09:56:14 am Dr. Mark Mayer wrote:
 Ethan wrote
 
 I believe that deposition of Fc Phic FOM should be required.
 Certainly it should be the recommended practice.
 
 
 For the same series of structures I just deposited, which started the 
 riding H discussion, my mtz file had Fc Phic FOM + other data put 
 out by Phenix - Pavel can elaborate. RCSB stripped almost all of this 
 and the processed file has only:
 
 HKL, Flag,  Fc, SigmaF and FOC :{

Huh?  That's not a cif fragment. What file are you looking at?
In my experience the PDB feeds back to you a cif format structure factor
file with a name like   rcsb054058-sf.cif
Near the top of that file you should find a description of the data
columns. The columns present depend on what you fed it, of course.

loop_
_refln.crystal_id
_refln.wavelength_id
_refln.scale_group_code
_refln.status
_refln.index_h
_refln.index_k
_refln.index_l
_refln.F_meas_au
_refln.F_meas_sigma_au
_refln.intensity_meas
_refln.intensity_sigma
_refln.F_calc
_refln.fom
_refln.phase_meas


Caveat:  
I have never tried to deposit a structure factor file from phenix; 
maybe that triggers some other processing pathway. Does anyone here know?

I would say that the simple, and almost guaranteed to work, procedure
is to do the cif conversion yourself and deposit the cif file.

I noted in another message that the auto-conversion script on
the PDB deposition site has a tendency to lose columns.
That's why it is better to do the conversion yourself.
I can't say that they _never_ lose columns in an uploaded cif file.
I have had that happen, but only once and quite a while ago.
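
Checking which columns actually came back can be scripted. The following is a rough sketch, assuming the processed file is a simple text cif with one tag per line in the loop_ header, as in the fragment above; the function names are made up for the example and this is not a full cif parser.

```python
def refln_columns(cif_text):
    """Return the _refln.* tags listed in loop_ headers of a cif text."""
    columns = []
    in_loop_header = False
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line == "loop_":
            in_loop_header = True
            continue
        if in_loop_header:
            if line.startswith("_"):
                if line.startswith("_refln."):
                    columns.append(line.split()[0])
            elif line and not line.startswith("#"):
                # First data row: the header of this loop is over.
                in_loop_header = False
    return columns

def missing_columns(cif_text, wanted):
    """Tags from `wanted` that are absent from the file."""
    present = set(refln_columns(cif_text))
    return [tag for tag in wanted if tag not in present]
```

Calling missing_columns(text, ["_refln.F_calc", "_refln.fom", "_refln.phase_meas"]) on the returned file would then list whatever was dropped along the way.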


 What's a structural biologist to do?

The empiricist's approach.
Experiment till you find a procedure that works, then stick to it :-)

-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Tim Gruene

On Thu, Sep 16, 2010 at 10:19:14AM -0700, Ethan Merritt wrote:
 [...] 
  What's a structural biologist to do?
 
 The empiricist's approach.
 Experiment till you find a procedure that works, then stick to it :-)

... or the social approach: communicate with the person at the PDB responsible
for your deposition. So far that has worked great for me (plaudits to the people at
the PDB(e)).

Tim

 
 -- 
 Ethan A Merritt
 Biomolecular Structure Center,  K-428 Health Sciences Bldg
 University of Washington, Seattle 98195-7742

-- 
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A




[ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Dr. Mark Mayer


Huh?  That's not a cif fragment. What file are you looking at?
In my experience the PDB feeds back to you a cif format structure factor
file with a name like   rcsb054058-sf.cif
Near the top of that file you should find a description of the data
columns. The columns present depend on what you fed it, of course.



Come on guys - give me a break ... all I posted was just a list of 
the columns in the sf file - here's a cut and paste of what rcsb 
actually generated


rcsb061284-sf.cif

data_r3om0sf
#
_audit.revision_id  1_0
_audit.creation_date  ?
_audit.update_record  'Initial release'

loop_
_refln.wavelength_id
_refln.crystal_id
_refln.scale_group_code
_refln.index_h
_refln.index_k
_refln.index_l
_refln.status
_refln.F_meas_au
_refln.F_meas_sigma_au
_refln.fom
1 1 1   0   0    8 o 203.0   6.3  0.99
1 1 1   0   0   10 o 281.5   8.7  0.86

Below is mtzdmp of what I actually deposited (as MTZ)


 Col Sort     Min       Max    Num      %        Mean      Mean    Resolution   Type Column
 num order                   Missing complete              abs.    Low    High       label

   1 ASC        0        46      0   100.00      17.7      17.7   31.88   1.40   H  H
   2 NONE       0        72      0   100.00      27.4      27.4   31.88   1.40   H  K
   3 NONE       0        81      0   100.00      30.5      30.5   31.88   1.40   H  L
   4 NONE     3.3    2160.3      0   100.00    162.89    162.89   31.88   1.40   F  FOBS
   5 NONE     0.9      60.0      0   100.00      5.36      5.36   31.88   1.40   Q  SIGFOBS
   6 NONE     0.0       1.0      0   100.00      0.05      0.05   31.88   1.40   I  R_FREE_FLAGS
   7 NONE     0.1    2253.6      0   100.00    157.73    157.73   31.88   1.40   F  FMODEL
   8 NONE  -180.0     180.0      0   100.00      2.65     90.13   31.88   1.40   P  PHIFMODEL
   9 NONE     0.0    5823.1      0   100.00    219.29    219.29   31.88   1.40   F  FCALC
  10 NONE  -180.0     180.0      0   100.00      3.24     90.09   31.88   1.40   P  PHIFCALC
  11 NONE     0.0   15330.0      0   100.00    141.04    141.04   31.88   1.40   F  FMASK
  12 NONE  -180.0     180.0      0   100.00      4.29     90.74   31.88   1.40   P  PHIFMASK
  13 NONE     0.0    6909.4      0   100.00     15.42     15.42   31.88   1.40   F  FBULK
  14 NONE  -180.0     180.0      0   100.00      4.29     90.74   31.88   1.40   P  PHIFBULK
  15 NONE   0.803     1.199      0   100.00     1.004     1.004   31.88   1.40   W  FB_CART
  16 NONE   0.001     1.000      0   100.00     0.877     0.877   31.88   1.40   W  FOM
  17 NONE   0.576     0.754      0   100.00     0.705     0.705   31.88   1.40   W  ALPHA
  18 NONE  277.388               0   100.00  5655.391  5655.391   31.88   1.40   W  BETA



--

Mark

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-16 Thread Ethan Merritt

On Thursday 16 September 2010 10:34:14 am Dr. Mark Mayer wrote:
 Huh?  That's not a cif fragment. What file are you looking at?
 In my experience the PDB feeds back to you a cif format structure factor
 file with a name like   rcsb054058-sf.cif
 Near the top of that file you should find a description of the data
 columns. The columns present depend on what you fed it, of course.
 
 
 Come on guys - give me a break ... all I posted was just a list of 
 the columns in the sf file

I sincerely apologize.  
Believe it or not, I mistook your emoticon for part of a file syntax
that I was not familiar with.

 HKL, Flag,  Fc, SigmaF and FOC :{

I thought that colon + curly bracket was some funky data delimiter.

Ethan

-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Pavel Afonine


 Hello,

a few points to balance the discussion:

- if you refined your structure without H then it's obviously ok to 
deposit it without H;


- if you refined your structure with H, then you should deposit it with 
H, as your refinement software outputs it (if your software uses H but 
removes it automatically for you, then at least it's not your 
responsibility). Any post-refinement manipulation of the final refined model 
is bad, since it tends to invalidate the reported statistics (R-factors, 
for instance), as illustrated in this paper (section 3.1.5: J. 
Appl. Cryst. (2010). 43, 669-676). Indeed, let's not add more 
inconsistencies to the database out of fear that insufficiently 
trained people may misinterpret it.


- removing H, if really needed, is a matter of one trivial command, but 
adding them back the exact same way they were originally is less 
straightforward.


- I agree that the X-H distances used in refinement and in validation 
are slightly different, although I'm not sure how much of difference 
that would make for validation.
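
To make the asymmetry in the third point concrete: removing H from a PDB-format file really is a few lines. The sketch below is an illustration only (the function names are invented, and the atom-name fallback is a heuristic that can misfire on heavy-atom names such as HG or HO when the element column is empty). Putting the same hydrogens back requires the generating program, its version and its parameters.

```python
def is_hydrogen(line):
    """True if an ATOM/HETATM/ANISOU record describes H or D."""
    element = line[76:78].strip().upper()      # element symbol, columns 77-78
    if element:
        return element in ("H", "D")
    # Heuristic fallback for files without an element column (assumption:
    # hydrogen names start with H or D after any leading digits).
    name = line[12:16].strip()                 # atom name, columns 13-16
    return name.lstrip("0123456789").startswith(("H", "D"))

def strip_hydrogens(pdb_text):
    """Return the PDB text without hydrogen records; nothing is renumbered."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM", "ANISOU")) and is_hydrogen(line):
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

Note that the statistics reported in the header were computed with the hydrogens present, so the stripped file no longer reproduces them exactly.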


Pavel.


On 9/14/10 10:38 PM, Ed Pozharski wrote:

Mark,

On Tue, 2010-09-14 at 13:34 -0400, Dr. Mark Mayer wrote:

Where does the crystallographic community stand
on deposition of coordinates with riding
hydrogens?

Surely the community is divided on this.  Arguments could be made both
ways.  Personally, I think that riding hydrogens can be recalculated if
necessary using the same algorithms/parameters employed in refinement.
It is true that different programs may use different parameter sets, and
reproducing exactly the same set of riding hydrogens may be difficult
without knowing exactly which version was used and being able to unearth
that old version of the software.  This may preclude one from getting
exactly the same riding hydrogen positions (how large the difference
would be I honestly don't know).  But really, who cares?  What is the
benefit of knowing exactly where this or that riding hydrogen was?
Maybe there is some benefit to such a comparison in method development,
but I would think it's rather limited.

I wholeheartedly agree with Ethan (even though that is not strictly what
he said :) that some minor benefit here is completely negated by the
danger of perception that somehow models tell us where hydrogens are.
It is bad enough that, in my estimate, roughly 10% of atomic coordinates
in the PDB are unwarranted as they come from disordered residues with
exact spatial positions unsupported by electron density.  Let's not add
more things that PDB users may over-interpret.

Cheers,

Ed.

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Tim Gruene

Dear Pavel,

On Wed, Sep 15, 2010 at 07:57:09AM -0700, Pavel Afonine wrote:
  Hello,

 a few points to balance the discussion:
Your points sound more like a summary than a contribution to the
discussion, which might confuse inexperienced readers of this thread, especially
if they did not follow it completely. So here, let me counter-balance ;-)


 - if you refined your structure without H then it's obviously ok to  
 deposit it without H;
I do not disagree but would like to add that I believe riding atoms usually
improve the refinement even at poor resolution, so one should not refine without
them at the final stage in the first place.


 - if you refined your structure with H, then you should deposit it with  
 H, as your refinement software outputs it (if your software uses H but  
 removes it automatically for you - then at least it's not your  
 responsibility). Any post-refinement manipulation on final refined model  
 is bad since it tends to invalidate the reported statistics (R-factors,  
 for instance), which is illustrated in this paper (section 3.1.5: J.  
 Appl. Cryst. (2010). 43, 669-676). Indeed, let's not add more  
 inconsistencies to the database because of a fear that insufficiently  
 trained people may misinterpret it.
I disagree: since the H atoms are (usually) in riding positions and are used,
e.g., for anti-bumping restraints, they should be considered (software-dependent,
as George pointed out) restraints rather than part of the actual model in
terms of coordinates. 


 - removing H, if really needed, is a matter of one trivial command, but  
 adding them back the exact same way they were originally is less  
 straightforward.
I despise the word 'trivial' and as much as there is a 'useless use of cat'
there is probably also an 'unnecessary use of trivial'.

Cheers, Tim


 - I agree that the X-H distances used in refinement and in validation  
 are slightly different, although I'm not sure how much of difference  
 that would make for validation.

 Pavel.



-- 
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A




Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Dale Tronrud

   While I am sympathetic to Ethan's and George's arguments, what
is missing in the world as it stands is a section in PDB files that
encodes the parameters and rules used to generate the riding hydrogen
atoms for that particular model.  George has his favorite hydrogen
atoms to build, his favorite bond lengths for placing them (and good
arguments for his selections) and one could, I suppose, look them
up in the documentation for Shelxl, but they should be encoded in
the PDB file to allow automatic regeneration of the hydrogen atoms.

   An explicit listing of the rules for generation is particularly
needed since all these matters can be, and often are, modified by the
user.  I know that in my refinements I manually move the hydrogen
from one nitrogen to the other in a couple Histidine side chains,
and have created my own rules for hydrogen generation in co-factors.

   CIF tags will have to be agreed upon (and that's always a fun
job) that would allow the description of the details of the various
hydrogen atom generation schemes that are in use, or may be used
in the future.   It would also be handy to have a reference implementation,
available under some forgiving license, that would materialize the
hydrogen atoms given the PDB header information, and would reproduce
the exact model refined, for any of the refinement programs.

   This is a worthwhile goal, but a tall order.  Until this
infrastructure is in place I think the hydrogen atoms have to be
included in the PDB file.  Otherwise it's the same as saying that
I've refined TLS ADP's but not saying what the TLS parameters were
nor listing the atoms in each TLS group.

Dale Tronrud

P.S. George: Do you think hydrogen atoms generated by the HFIX 137
command should be deposited?  They are placed based on the electron
density map with the dihedral angle of the methyl group becoming a
parameter of the model -- a parameter not recorded anywhere other
than in the hydrogen atom locations.


On 09/14/10 12:41, George M. Sheldrick wrote:
 
 Even though SHELXL refinements often involve resolutions of 1.5A or 
 better, I discourage SHELXL users from depositing their hydrogen 
 coordinates. There are three reasons:
 
 1. The C-H, N-H and O-H distances required to give the best fit to 
 the electron density are significantly shorter than those required 
 for molecular modeling and tests on non-bonded interactions (or
 located by neutron diffraction). It is ESSENTIAL to recalculate 
 the hydrogens at longer distances before using MolProbity and other 
 validation software. 
 
 2. There is considerable confusion concerning the names to be assigned
 to the hydrogens. This is not made easier by the application of a
 chirality test to -CH2- groups!
 
 3. O-H hydrogens are particularly difficult to 'see' and the geometrical
 calculation of their positions is often ambiguous. The same applies
 to the protonation states of histidines and carboxylic acids. In 
 addition such hydrogen positions are often disordered.
 
 For refinement I recommend including C-H and N-H but not O-H hydrogens.
 For very high resolution structures this reduces Rfree by 0.5-1.0% and
 clearly improves the model. At all resolutions the antibumping 
 restraints involving hydrogens are useful. 
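
The recalculation mentioned in point 1 amounts to moving each hydrogen along its X-H bond. A minimal sketch, assuming the parent atom of each hydrogen is already known; the function name is invented, and the 0.97 A and 1.09 A values in the usage note are illustrative placeholders rather than the distances used by any particular program:

```python
import math

def rescale_bond(parent, hydrogen, new_length):
    """Move `hydrogen` along the parent->hydrogen direction so that the
    bond has length `new_length` (Angstrom). The direction is unchanged."""
    v = [h - p for p, h in zip(parent, hydrogen)]
    old_length = math.sqrt(sum(x * x for x in v))
    if old_length == 0.0:
        raise ValueError("parent and hydrogen coincide")
    factor = new_length / old_length
    return [p + x * factor for p, x in zip(parent, v)]
```

For example, rescale_bond(c_xyz, h_xyz, 1.09) would stretch a hydrogen placed at an X-ray-style distance of about 0.97 A out to a longer, neutron-style distance before a clash analysis.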
 
 George
 
 Prof. George M. Sheldrick FRS
 Dept. Structural Chemistry,
 University of Goettingen,
 Tammannstr. 4,
 D37077 Goettingen, Germany
 Tel. +49-551-39-3021 or -3068
 Fax. +49-551-39-22582
 
 
 On Tue, 14 Sep 2010, Dr. Mark Mayer wrote:
 
 Here's one for the community, which I'll post to both Phenix and CCP4 BBs.

 Where does the crystallographic community stand on deposition of coordinates
 with riding hydrogens?
 Explicit H are required for calculating all-atom clash scores with
 MolProbity, and their use frequently gives better geometry (especially at
 low resolution). Phenix uses explicit riding H for refinement and outputs
 them in the refined PDB. Refmac also uses riding H but does not output H
 coordinates.

 While depositing a series of structures refined at 1.4 - 2.75 A with Phenix,
 I got the following email from the RCSB, who asked that I resupply coordinates
 without H for two of the structures. Since we can't see H even at 1.4 Å, I
 don't understand why an arbitrary cutoff of 1.5 Å was chosen, nor why
 explicit H atoms used in refinement and geometry validation should be
 stripped from the file.

 FROM RCSB

 We encourage depositors not to use hydrogens in the final PDB file for
 the low resolution structures (> 1.5 A). Please provide an updated PDB
 file. We request you to use processed PDB file as a starting point for
 making any corrections to the coordinates and/or re-refinement.
 --

 Mark

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Ethan Merritt

On Wednesday 15 September 2010, Pavel Afonine wrote:
 - if you refined your structure with H, then you should deposit it with 
 H, as your refinement software outputs it 

As I see it, refining your structure in the presence of riding hydrogens
is not the same thing as refining hydrogen positions in your structure.
Let's exclude those rare cases of the latter from discussion.

Tim Gruene wrote:
 since the H atoms are (usually) in a riding position and used
 e.g. for anti-bumping restraints, they should be considered as (software
 dependent, as George pointed out) restraints rather than the actual model in
 terms of coordinates.

I agree. The use of a riding hydrogen model is better viewed as a 
refinement restraint than as a refinement of actual hydrogen positions.

Ethan

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Tim Gruene

Dear Dale,

The PDB format is, as far as I can see, incapable of containing all the
information that you can store in an .ins file used for SHELXL refinement, so
the only way one could recreate the model would be to also deposit the .ins file
anyhow, which solves the problem of the riding hydrogens altogether (and much
more).

PDBe (pdbe.org), for example, allows auxiliary files to be uploaded, and in my
opinion uploading the final .ins file (not the .res file!) should be made
mandatory in the case of SHELXL refinement.

Since Coot has now become utterly convenient even for SHELXL refinement, there
is no reason not to deposit the .ins file ([flame] and the PDB file
probably for legacy reasons [/flame]).

Tim

On Wed, Sep 15, 2010 at 09:14:51AM -0700, Dale Tronrud wrote:
While I am sympathetic to Ethan's and George's arguments, what
 is missing in the world as it stands is a section in PDB files that
 encode the parameters and rules used to generate the riding hydrogen
 atoms for that particular model.  George has his favorite hydrogen
 atoms to build, his favorite bond lengths for placing them (and good
 arguments for his selections) and one could, I suppose, look them
 up in the documentation for Shelxl, but they should be encoded in
 the PDB file to allow automatic regeneration of the hydrogen atoms.
 
An explicit listing of the rules for generation is particularly
 needed since all these matters can, and often are, modified by the
 user.  I know that in my refinements I manually move the hydrogen
 from one nitrogen to the other in a couple Histidine side chains,
 and have created my own rules for hydrogen generation in co-factors.
 
CIF tags will have to be agreed upon (and that's always a fun
 job) that would allow the description of the details of the various
 hydrogen atom generation schemes that are in use, or may be used
 in the future.   It would also be handy to have a reference implementation,
 available under some forgiving license, that would materialize the
 hydrogen atoms given the PDB header information, and would reproduce
 the exact model refined, for any of the refinement programs.
 
This is a worthwhile goal, but a tall order.  Until this
 infrastructure is in place I think the hydrogen atoms have to be
 included in the PDB file.  Otherwise it's the same as saying that
 I've refined TLS ADP's but not saying what the TLS parameters were
 nor listing the atoms in each TLS group.
 
 Dale Tronrud
 
 P.S. George: Do you think hydrogen atoms generated by the HFIX 137
 command should be deposited?  They are placed based on the electron
 density map with the dihedral angle of the methyl group becoming a
 parameter of the model -- a parameter not recorded anywhere other
 than in the hydrogen atom locations.
 
 
 On 09/14/10 12:41, George M. Sheldrick wrote:
  
  Even though SHELXL refinements often involve resolutions of 1.5A or 
  better, I discourage SHELXL users from depositing their hydrogen 
  coordinates. There are three reasons:
  
  1. The C-H, N-H and O-H distances required to give the best fit to 
  the electron density are significantly shorter than those required 
  for molecular modeling and tests on non-bonded interactions (or
  located by neutron diffraction). It is ESSENTIAL to recalculate 
  the hydrogens at longer distances before using MolProbity and other 
  validation software. 
  
  2. There is considerable confusion concerning the names to be assigned
  to the hydrogens. This is not made easier by the application of a
  chirality test to -CH2- groups!
  
  3. O-H hydrogens are particularly difficult to 'see' and the geometrical
  calculation of their positions is often ambiguous. The same applies
  to the protonation states of histidines and carboxylic acids. In 
  addition such hydrogen positions are often disordered.
  
  For refinement I recommend including C-H and N-H but not O-H hydrogens.
  For very high resolution structures this reduces Rfree by 0.5-1.0% and
  clearly improves the model. At all resolutions the antibumping 
  restraints involving hydrogens are useful. 
  
  George
  
  Prof. George M. Sheldrick FRS
  Dept. Structural Chemistry,
  University of Goettingen,
  Tammannstr. 4,
  D37077 Goettingen, Germany
  Tel. +49-551-39-3021 or -3068
  Fax. +49-551-39-22582
  
  
  On Tue, 14 Sep 2010, Dr. Mark Mayer wrote:
  
  Here's one for the community, which I'll post to both Phenix and CCP4 BBs.
 
  Where does the crystallographic community stand on deposition of coordinates
  with riding hydrogens?
  Explicit H are required for calculating all atom clash scores with Molprobity,
  and their use frequently gives better geometry (especially at low resolution).
  Phenix uses explicit riding H for refinement, and outputs these in the refined
  PDB. Refmac also uses riding H but does not output H coordinates.
 
  While depositing a series of structures refined at 1.4 - 2.75 A with Phenix
  got the following email from the RCSB, who

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Ed Pozharski

On Wed, 2010-09-15 at 07:57 -0700, Pavel Afonine wrote:
 if you refined your structure with H, then you should deposit it with 
 H

sure.  But the structure is not *refined with hydrogens* when they are
in predicted positions.  Following the same logic one could suggest that
electron density should be deposited, since we can approximate it.  I
think it's useful to limit the information presented in a pdb-file to
what was actually refined + specific instructions on how the refinement
was done.  

 Any post-refinement manipulation on final refined model 
 is bad since it tends to invalidate the reported statistics 
...
 Indeed, let's not add more 
 inconsistencies to the database because of a fear that insufficiently 
 trained people may misinterpret it. 

I wouldn't call it a post-refinement manipulation, as nothing was really
changed (afaiu, in most cases riding hydrogens are placed automatically
by the program and not manipulated by user).  On a digressing point, you
might be underestimating the problem of misinterpretation by
insufficiently trained people.  

-- 
I'd jump in myself, if I weren't so good at whistling.
   Julian, King of Lemurs

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Pavel Afonine


 Hi Tim,


The pdbe.org, for example allows to upload auxiliary files and in my opinion the
uploading of the final .ins-file (not the .res-file!) should be made mandatory
in the case of shelxl refinement.

Since coot has now become utterly convenient even for shelxl refinement, there
is no reason one should not deposit the .ins-file ([flame] and the PDB-file
probably for legacy reasons [/flame]).


I have always wondered but never had a good occasion to ask (my Shelxl 
knowledge is limited and may be outdated, so I apologize in advance if my 
questions are too naive; also I realize that I'm asking a non-CCP4 
question on CCP4bb, for which I apologize again):


- how does the .ins file encode the information about NCS groups used in 
refinement (atom selection for NCS groups, restraint weights for 
different groups, etc.)?


- how does the .ins file encode the information about TLS (again, atom 
selections for TLS groups, TLS matrices, etc.)? Related, does it have a 
concept of having TLS and other components to the total atomic 
displacement parameter (ADP)?


- If I recall it correctly, to fix (=not refine) a certain parameter 
(say occupancy or B-factor) in Shelxl you need to add a number 10 to it. 
Is it true? IMHO, this might lead to confusion if such a file gets 
deposited to PDB.


All the best!
Pavel.

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Felix Frolow

Pavel,
Shelxl works in the correct coordinates - fractional...
Many things are easier in fractional coordinates. Are you sure that Phenix does 
not go orthogonal - fractional - orthogonal in internal calculations?
When a parameter is fixed in fractional coordinates it does not produce 
confusion. Shelxl also converts fractional - orthogonal (AKA PDB), which is also
correct. The constraint is not transferred there. BTW Shelxl knows symmetry very 
well and will constrain atoms that occupy symmetry elements.
Shortly Shelxl knows crystallography best.
When you see the number of lines in the Shelxl Fortran code (do not kill Fortran 
too early) you will be surprised. There are not so many of them.
No graphical user interface yet, but COOT is of great help.

Dr Felix Frolow   
Professor of Structural Biology and Biotechnology
Department of Molecular Microbiology
and Biotechnology
Tel Aviv University 69978, Israel

Acta Crystallographica F, co-editor

e-mail: mbfro...@post.tau.ac.il
Tel:  ++972-3640-8723
Fax: ++972-3640-9407
Cellular: 0547 459 608

On Sep 15, 2010, at 19:11 , Pavel Afonine wrote:

 Hi Tim,
 
 The pdbe.org, for example allows to upload auxiliary files and in my opinion 
 the
 uploading of the final .ins-file (not the .res-file!) should be made 
 mandatory
 in the case of shelxl refinement.
 
 Since coot has now become utterly convenient even for shelxl refinement, 
 there
 is no reason one should not deposit the .ins-file ([flame] and the PDB-file
 probably for legacy reasons [/flame]).
 
 I was always wondering but never had a good occasion to ask (my Shelxl 
 knowledge is limited and may be outdated so I apology in advance if my 
 questions are too dummy; also I realize that I'm asking a non-CCP4 question 
 on CCP4bb for which I apology again):
 
 - how .ins file encodes the information about NCS groups used in refinement 
 (atom selection for NCS groups, restraint weights for different groups, etc?
 
 - how .ins file encodes the information about TLS (again, atom selections for 
 TLS groups, TLS matrices, etc)? Related, does it have a concept of having TLS 
 and other components to the total atomic displacement parameter (ADP)?
 
 - If I recall it correctly, to fix (=not refine) a certain parameter (say 
 occupancy or B-factor) in Shelxl you need to add a number 10 to it. Is it 
 true? IMHO, this might lead to confusion if such a file gets deposited to PDB.
 
 All the best!
 Pavel.

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Thomas Womack

On 15 Sep 2010, at 18:04, Ed Pozharski wrote:

 On Wed, 2010-09-15 at 07:57 -0700, Pavel Afonine wrote:
 if you refined your structure with H, then you should deposit it with 
 H
 
 sure.  But the structure is not *refined with hydrogens* when they are
 in predicted positions.  Following the same logic one could suggest that
 electron density should be deposited, since we can approximate it.

And I notice that a fair number of groups do deposit electron density - at 
least, they deposit PHIC and sometimes even HL coefficients in the sf.cif file. 
 HL coefficients in the sf.cif file can get badly corrupted in the deposition 
process, but they definitely show willing.

 I think it's useful to limit the information presented in a pdb-file to
 what was actually refined + specific instructions on how the refinement
 was done.

I suppose I come to this from a background where every deposition is a fresh 
new test-case for new refinement software; it's only lack of download bandwidth 
and CPU power that makes me not want to start from the images.

I like the idea that what you deposit is the output of a well-defined 
refinement; which means that you need to deposit the instructions for doing the 
refinement, and the model you used as input.  There's a perfectly good PDB 
protocol for multi-MODEL files.  Nobody does such depositions, I think the PDB 
would complain if you tried, and there's the problem of endless regression.

I would be very happy if every PDB deposition with 'METHOD: MOLECULAR 
REPLACEMENT' had an extra MODEL in it containing the input to the molrep tool, 
and some REMARK lines describing how molrep was used; I would not complain if 
this was made compulsory for depositions which nowadays say 'STARTING MODEL: 
NULL'.  26 of the 130 depositions with method MOLECULAR REPLACEMENT this week 
have starting model NULL, as well as seven depositions with method FOURIER 
SYNTHESIS and starting model NULL.

(why do MAD and SAD depositions still have a STARTING MODEL field?)

(while we're on the subject of riding hydrogens, I would invite people to 
admire the conformations of the hydrogens in such places as the C-alpha of 
residues A45 and A57 of deposition 2x5n - it's clearly a software bug rather 
than any mistake on the part of the authors, but nonetheless striking)

Tom

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Pavel Afonine


 Dear Felix,


Shortly Shelxl knows crystallography best.


I have no doubts about this. My questions were, though:


- how .ins file encodes the information about NCS groups used in refinement 
(atom selection for NCS groups, restraint weights for different groups, etc?

- how .ins file encodes the information about TLS (again, atom selections for 
TLS groups, TLS matrices, etc)? Related, does it have a concept of having TLS 
and other components to the total atomic displacement parameter (ADP)?

- If I recall it correctly, to fix (=not refine) a certain parameter (say 
occupancy or B-factor) in Shelxl you need to add a number 10 to it. Is it true? 
IMHO, this might lead to confusion if such a file gets deposited to PDB.


Best,
Pavel.

Re: [ccp4bb] Deposition of riding H + what to deposit in addition to the pdb

2010-09-15 Thread Ed Pozharski

On Wed, 2010-09-15 at 09:14 -0700, Dale Tronrud wrote:
 I know that in my refinements I manually move the hydrogen
 from one nitrogen to the other in a couple Histidine side chains,
 and have created my own rules for hydrogen generation in co-factors.
 

Excellent point.  And I believe in this case you are perfectly justified
to either place a comment about this in the header or indeed deposit
hydrogens.  But I suspect that this is not what happens in most cases
with, say, 2A refinement using refmac.  The program is simply used to
autogenerate the riding hydrogens, thus making the whole thing perfectly
reproducible (with caveats).  It may be seen as misleading when one
deposits these hydrogens and they appear to have the same standing as
other atoms which were actually refined.

On a related issue, I believe a policy change is long overdue: all
the input files, e.g. command scripts/cif-files for ligands/.ins files
etc., should be attached to a PDB deposition.

-- 
I'd jump in myself, if I weren't so good at whistling.
   Julian, King of Lemurs

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Pavel Afonine


 Dear Ed,


Any post-refinement manipulation on final refined model
is bad since it tends to invalidate the reported statistics

...

Indeed, let's not add more
inconsistencies to the database because of a fear that insufficiently
trained people may misinterpret it.

I wouldn't call it a post-refinement manipulation, as nothing was really
changed (afaiu, in most cases riding hydrogens are placed automatically
by the program and not manipulated by user).  On a digressing point, you
might be underestimating the problem of misinterpretation by
insufficiently trained people.

...

   This may preclude one from getting
exactly the same riding hydrogen positions (how large that difference
would be I honestly don't know).  But really, who cares?


I wouldn't dare call a model manipulation that typically changes the 
R-factor by 0.5 ... ~2% "nothing".   Although you may be right - 
who cares?


I think the "misinterpretation by insufficiently trained people" 
problem should not be propagated to the database, affecting the quality 
of deposited material. This is what I meant.


Pavel.

Re: [ccp4bb] Deposition of riding H + what to deposit in addition to the pdb

2010-09-15 Thread Alastair Fyfe

The pdbe.org, for example allows to upload auxiliary files and in my opinion the
uploading of the final .ins-file (not the .res-file!) should be made mandatory
in the case of shelxl refinement.

all the input files, e.g. command scripts/cif-files for ligands/.ins files etc.
should be attached to a PDB deposition.

Having access to all input files required to reproduce (modulo 
program/library version) the final/published refinement would be most 
helpful.

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread George M. Sheldrick

Dear Pavel,

May I suggest that you take a look at the SHELX manual:
http://shelx.uni-ac.gwdg.de/SHELX/shelx.pdf
before sending your SHELX questions to CCP4bb? You might 
even find some good ideas for implementing in phenix_refine! 

For example,
if you look up 'non-crystallographic symmetry' in the index 
you will discover that SHELXL applies NCS in the form of
restraints, not constraints, which has the advantage that it
can be applied locally and in combination with all other
restraints and constraints involving the same atoms. However 
you will not find TLS in the index, because the credit for
implementing this very useful concept should be given to 
Martin Winn, Garib and Ethan, long after the current version 
of SHELXL (and its manual) were released in 1997. And because 
SHELXL only reads one instruction file (*.ins) and one 
reflection file (*.hkl) but no other data files or libraries, 
and FORTRAN will always be FORTRAN, the deposition of these 
two files would be sufficient to define the refinement for 
posterity.

Best wishes, George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582


On Wed, 15 Sep 2010, Pavel Afonine wrote:

  Dear Felix,
 
  Shortly Shelxl knows crystallography best.
 
 I have no doubts about this. My questions were, though:
 
   - how .ins file encodes the information about NCS groups used in
   refinement (atom selection for NCS groups, restraint weights for different
   groups, etc?
  
   - how .ins file encodes the information about TLS (again, atom selections
   for TLS groups, TLS matrices, etc)? Related, does it have a concept of
   having TLS and other components to the total atomic displacement parameter
   (ADP)?
  
   - If I recall it correctly, to fix (=not refine) a certain parameter (say
   occupancy or B-factor) in Shelxl you need to add a number 10 to it. Is it
   true? IMHO, this might lead to confusion if such a file gets deposited to
   PDB.
 
 Best,
 Pavel.

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Pavel Afonine


 Dear George,

a small correction if I may:


However
you will not find TLS in the index, because the credit for
implementing this very useful concept should be given to
Martin Winn, Garib and Ethan, long after the current version
of SHELXL (and its manual) were released in 1997.


Acta Cryst. (*1985*). A41, 426-433
Restrained structure-factor least-squares refinement of protein 
structures using a vector processing computer

I. Haneef, D. S. Moss, M. J. Stanford and N. Borkakoti

*Abstract:* A least-squares refinement program /RESTRAIN/ has been 
developed, which is capable of refining macromolecular structures using 
structure amplitudes, phases from isomorphous replacement or anomalous 
scattering and pseudo-energy restraints. In addition to positional 
parameters and isotropic temperature factors, anisotropic mean-square 
displacements may be refined either as individual atomic *U* tensors or 
as *TLS* tensors applied to groups of atoms. Anharmonic effects may be 
handled by coupling together occupancies to enable the electron density 
of an atomic group to be distributed over more than one subsite. A novel 
way of restraining groups of atoms to be planar has been developed that 
does not require dummy atoms and does not restrain the plane to lie in 
its current orientation.


One can find other, earlier programs, but they are small molecule specific.

Regards,
Pavel.

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Ed Pozharski

On Wed, 2010-09-15 at 10:50 -0700, Pavel Afonine wrote:
 I wouldn't dare call a model manipulation that typically changes the
 R-factor by 0.5 ... ~2% "nothing".   Although you may be right - who cares?

It's not a manipulation because no parameters were manipulated in the
model.  Don't you agree that using the riding model does not add
additional refinable parameters?

But your insistence has awakened my curiosity.  So I looked at hydrogens
as produced by phenix.refine for a 1.8A structure I randomly picked.
Just as George has pointed out, the covalent bonds are too short.  For
instance, when hydrogens are added, the average N-H distance is
1.1(5), but upon refinement the value is down to 0.85998(4).  I
won't even begin discussing the fact that some of these hydrogens added
to K,Y,S etc are placed in positions that are not justified by data (not
in definitely wrong positions either, it's just that there is no
evidence to support a particular torsion angle).  And that it is
unlikely that every histidine in the structure is fully protonated.
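
A check of this kind can be sketched in a few lines of Python.  This is not
the procedure used for the numbers above; it is only an illustration.  It
assumes standard fixed-column PDB ATOM records, backbone amide hydrogens
named H, and it ignores alternate conformations.  The function name is
hypothetical.

```python
import math

def backbone_nh_distances(pdb_lines):
    """Return N-H distances (in Angstrom) for residues having both N and H."""
    atoms = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()          # atom name, columns 13-16
        if name not in ("N", "H"):
            continue
        key = (line[21], line[22:27])       # chain ID, residue number + iCode
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.setdefault(key, {})[name] = xyz
    return [math.dist(a["N"], a["H"])
            for a in atoms.values() if "N" in a and "H" in a]
```

Averaging the returned list before and after refinement would give the kind
of comparison described above.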

Do you see the problem?  I fully understand your desire to be able to
reproduce the R-factors (although I don't necessarily share it), but if
I decide to deposit this model with hydrogens, am I essentially stating
that N-H bond is magically shortened to ~0.86A?  Sure, it is driver's
(PDB user's) responsibility to know the meaning of the red light (riding
hydrogens), but wouldn't depositing riding hydrogens be equivalent to
putting 70 mph sign at the ramp, just because all the cops know that
it's not the actual safe speed?  And then tell the accident victim that
there was a fine print in the rule book?  I think this situation is
particularly problematic given that these days some enter the field the
same way many people (at least so it seems here in Baltimore) get their
driver's licenses, i.e. without ever learning the rules?

Cheers,

Ed.

-- 
I'd jump in myself, if I weren't so good at whistling.
   Julian, King of Lemurs

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Pavel Afonine


 Dear Ed,

On 9/15/10 12:54 PM, Ed Pozharski wrote:

On Wed, 2010-09-15 at 10:50 -0700, Pavel Afonine wrote:

I wouldn't dare call a model manipulation that typically changes the
R-factor by 0.5 ... ~2% "nothing".   Although you may be right - who cares?

It's not a manipulation because no parameters were manipulated in the
model.


I can't agree with this, sorry. A change to a model content (especially 
the one that changes Fcalc) is a model manipulation.


Pavel.

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Phil Jeffrey


On 9/15/10 3:54 PM, Ed Pozharski wrote:


Don't you agree that using the riding model does not add
additional refinable parameters?

(snip)

instance, when hydrogens are added, the average N-H distance is
1.1(5), but upon refinement the value is down to 0.85998(4).  I


So the riding hydrogen model is imperfect.  At least with phenix.refine 
you can measure it, unlike the default behavior of REFMAC.  (But you can 
tell it to write hydrogens out, I believe).


Obviously this question is not one amenable to a simple answer.  In some 
sense (as per George) riding hydrogens are merely a restraint.  In some 
other sense they are fundamentally a part of the model - they have very 
directional properties via bumping restraints that most certainly alter 
the atomic model for the heavy atoms in a very direct way via collision. 
 Since the nature of these atoms - locationally specific - differs from 
the more amorphous extended atom restraints (CH3E for methyl in CNS 
etc) it could make sense to include them in the model at deposition.


As far as I know we do not delete atoms from the final model that 
contribute to scattering and geometric restraints under any other 
circumstances, except perhaps in the nearly-as-contentious "how do I 
model my disordered side-chain" case.  Also not amenable to a simple answer.


Both approaches (REFMAC-esque and PHENIX-esque) have their merits.
I doubt I'm the only person here conflicted over what to do about it.
However this thread appears to have reached the point where not much new 
ground is being broken.


Phil Jeffrey
Princeton

Re: [ccp4bb] Deposition of riding H [was: Deposition of riding H]

2010-09-15 Thread Roberto Steiner


Dear Pavel,

Stressing Ethan's point about TLS refinement becoming practical with  
Refmac implementation, Winn et al. (2001) Acta D, 57, 122-133 (I know  
you like references) states:


Derivatives of the residual with respect to the TLS parameters are  
expanded in terms of the derivatives with respect to individual  
anisotropic U values, which in turn are calculated using a fast  
Fourier transform technique. TLS refinement is therefore fast and can  
be used routinely.


Best wishes
Roberto


On 15 Sep 2010, at 19:34, Pavel Afonine wrote:


Dear George,

a small correction if I may:


However
you will not find TLS in the index, because the credit for
implementing this very useful concept should be given to
Martin Winn, Garib and Ethan, long after the current version
of SHELXL (and its manual) were released in 1997.


Acta Cryst. (1985). A41, 426-433
Restrained structure-factor least-squares refinement of protein  
structures using a vector processing computer

I. Haneef, D. S. Moss, M. J. Stanford and N. Borkakoti

Abstract: A least-squares refinement program RESTRAIN has been  
developed, which is capable of refining macromolecular structures  
using structure amplitudes, phases from isomorphous replacement  
or anomalous scattering and pseudo-energy restraints. In addition to  
positional parameters and isotropic temperature factors, anisotropic  
mean-square displacements may be refined either as individual atomic  
U tensors or as TLS tensors applied to groups of atoms. Anharmonic  
effects may be handled by coupling together occupancies to enable  
the electron density of an atomic group to be distributed over more  
than one subsite. A novel way of restraining groups of atoms to be  
planar has been developed that does not require dummy atoms and does  
not restrain the plane to lie in its current orientation.


One can find other, earlier programs, but they are small molecule  
specific.


Regards,
Pavel.


---
Dr. Roberto Steiner
Randall Division of Cell and Molecular Biophysics
New Hunt's House
King's College London
Guy's Campus
London, SE1 1UL
Phone +44 (0)20-7848-8216
Fax   +44 (0)20-7848-6435
e-mail roberto.stei...@kcl.ac.uk

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Dr. Mark Mayer


However this thread appears to have reached the point where not much new
ground is being broken.


As the person who started this thread I'll second Phil Jeffrey's comment.

I chose to continue my depositions with riding H, and the rcsb agreed 
to accept the coordinates. It's been great hearing the experts weigh 
in on this. I've learned a lot, and clearly there is no consensus. 
As one of the vast majority of crystallographers dependent on all the 
hard work that program developers undertake to support structural 
biology, I'm happy to follow advice given by the developers of the 
various programs I use, and for Phenix the current advice is to 
deposit with riding H.


--
 Mark

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Ed Pozharski

On Wed, 2010-09-15 at 13:13 -0700, Pavel Afonine wrote:
 I can't agree with this, sorry. A change to a model content
 (especially 
 the one that changes Fcalc) is a model manipulation.
 
That is not what I asked.  Do you agree that using the riding model does
not add additional refinable parameters?


-- 
I'd jump in myself, if I weren't so good at whistling.
   Julian, King of Lemurs

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Ian Tickle

I should just like to point out that the main source of the
disagreement here seems to be that people have very different ideas
about what a 'model' is or should be.  Strictly a model is a purely
mathematical construct, in this case it consists of the appropriate
equation for the calculated structure factor and the best-fit values
of the various parameters (scattering factors, atomic positions,
occupancies, B factors, TLS parameters etc.) that appear in it. A
mathematical model is inevitably going to be an imperfect
representation of reality, but hopefully it's the best one we can come
up with, in the sense of best explaining the data without significant
overfitting.

The problem arises because many users of the PDB, and I suspect many
contributors to this BB, particularly non-crystallographers, don't see
it like that, because they view a PDB file as a physical model, i.e.
not as the best fit to the data (assuming that the
non-crystallographers even know what the data are!), but the closest
representation of reality.  The difference between the N-H bond
lengths that Ed referred to illustrates the distinction between the
mathematical and the physical model.  The mathematical model requires
that the bond length is 0.86 Ang because that value gives the best fit
of the assumed spherical scattering factor of H to the deformation
density of the X-H covalent bond.  The physical model requires that it
be 1.00 Ang because that is the internuclear distance found by
spectroscopic methods and predicted by QM calculations.  The same goes
for B factors and TLS: to a large extent they are a mathematical
construct whose purpose is to provide an optimal fit to the data.  The
connection of Bs and TLS with reality is tenuous at best, nevertheless
people obviously would like to have a physical interpretation such as
rigid-body correlated motion.  The fact that Bragg scattering provides
no information about correlated motion (you need to measure the
diffuse scattering for that) doesn't seem to deter them!

I have no doubt in my mind that it is the mathematical model that
should be published, because hopefully it's the best available
interpretation of the data.  Whether that involves publishing the
riding H atoms explicitly, or alternatively the formulae and
parameters that were used to calculate their positions I don't mind,
as long as I can faithfully reproduce the Fcalcs to check the validity
of the model.  Then users of the PDB are free to *interpret* the
mathematical models as physical models in an appropriate manner (e.g.
by adjusting the bond lengths to H), and crystallographers have the
untainted mathematical models needed to reproduce the Fcalcs.
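
The bond-length adjustment mentioned above can be sketched as follows: keep
the heavy atom fixed and slide the hydrogen along the existing X-H direction
until the bond has the desired length.  This is a minimal illustration,
assuming Cartesian coordinates in Angstrom.  The function name is
hypothetical, and 0.86 and 1.00 Ang are simply the N-H figures quoted in this
thread.

```python
import math

def rescale_bond(parent, hydrogen, target_length):
    """Move hydrogen along the parent->hydrogen direction to target_length."""
    vec = [h - p for p, h in zip(parent, hydrogen)]
    length = math.sqrt(sum(c * c for c in vec))
    if length == 0.0:
        raise ValueError("parent and hydrogen coincide; direction undefined")
    scale = target_length / length
    # New hydrogen position: parent + unit direction * target_length
    return tuple(p + c * scale for p, c in zip(parent, vec))
```

For example, rescale_bond(n_xyz, h_xyz, 1.00) would turn a 0.86 Ang riding N-H
into one at the internuclear distance, leaving the N position untouched.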

Cheers

-- Ian

On Wed, Sep 15, 2010 at 9:13 PM, Pavel Afonine pafon...@lbl.gov wrote:
  Dear Ed,

 On 9/15/10 12:54 PM, Ed Pozharski wrote:

 On Wed, 2010-09-15 at 10:50 -0700, Pavel Afonine wrote:

 I wouldn't dare call a model manipulation that typically changes the
 R-factor by 0.5 ... ~2% "nothing".   Although you may be right - who cares?

 It's not a manipulation because no parameters were manipulated in the
 model.

 I can't agree with this, sorry. A change to a model content (especially the
 one that changes Fcalc) is a model manipulation.

 Pavel.

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Ed Pozharski

On Wed, 2010-09-15 at 16:26 -0400, Phil Jeffrey wrote:
 So the riding hydrogen model is imperfect.  At least with
 phenix.refine 
 you can measure it, unlike the default behavior of REFMAC.  (But you
 can 
 tell it to write hydrogens out, I believe).
 

My impression is that default behavior of phenix.refine is the same - I
had to change parameters to include hydrogens in the output.

Without breaking any new ground, there is really no conflict here.  Is
it a good idea to make a complete model description (including riding
hydrogens, input files, cif-files, special case restraints etc)
available for structures deposited in the PDB?  Absolutely.  But not in
this form, when the model implies that we know the protonation states of
all the atoms and has unreasonable geometry.  For the example that I
provided, the rmsd_bonds for that particular group is 0.14A, certainly
unacceptable.  Maybe one can use a different record for these atoms, say
RIDING instead of ATOM.  Thus the complete model can be recovered and at
the same time the nature of these items is explicitly stated.  In this
way riding hydrogens are clearly distinguished from those that are
actually refined at ultrahigh resolution.
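
The RIDING-record idea could look something like the sketch below.  To be
clear, RIDING is not an existing PDB record type; this only illustrates the
proposal.  It assumes that every hydrogen in the file is a riding hydrogen and
that the element symbol is present in columns 77-78 of each ATOM/HETATM line.
The function name is made up for the example.

```python
def tag_riding_hydrogens(pdb_lines, record="RIDING"):
    """Relabel hydrogen ATOM/HETATM lines with a hypothetical record name."""
    out = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM  ", "HETATM"))
        # Element symbol lives in columns 77-78 (0-based slice 76:78)
        element = line[76:78].strip().upper() if len(line) >= 78 else ""
        if is_atom and element == "H":
            # Record names occupy exactly the first six columns
            line = record.ljust(6)[:6] + line[6:]
        out.append(line)
    return out
```

Dropping the tagged lines would then give the hydrogen-free model, while
keeping them preserves everything needed to recover the complete one.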

Cheers,

Ed.

-- 
I'd jump in myself, if I weren't so good at whistling.
   Julian, King of Lemurs

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Pavel Afonine


 Dear Ed,

On 9/15/10 2:47 PM, Ed Pozharski wrote:

On Wed, 2010-09-15 at 16:26 -0400, Phil Jeffrey wrote:

So the riding hydrogen model is imperfect.  At least with
phenix.refine
you can measure it, unlike the default behavior of REFMAC.  (But you
can
tell it to write hydrogens out, I believe).


My impression is that default behavior of phenix.refine is the same - I
had to change parameters to include hydrogens in the output.


No, if your input file contains H atoms, the output file will contain 
them too (in phenix.refine). You don't have to change any parameters for 
this.


Pavel.

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Ed Pozharski

Sure.  But if I start with a model that has no hydrogens, they will be
generated but not passed to the output, right?  Just like refmac.


-- 
I'd jump in myself, if I weren't so good at whistling.
   Julian, King of Lemurs

Re: [ccp4bb] Deposition of riding H: R-factor is overrated

2010-09-15 Thread Pavel Afonine


 Dear Ed,

No, if you start with a model that has no hydrogens, they will not be
generated internally.


Pavel.


Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread George M. Sheldrick

Pavel: In my original email I very carefully gave credit for 
IMPLEMENTING the TLS concept. Of course the ideas and some
programs had been around long before, but it was the
IMPLEMENTATION IN REFMAC that resulted in TLS becoming
widely used. I had actually considered putting it into SHELXL
but had not done so for two reasons (a) I was too lazy and
(b) I missed an essential trick that REFMAC introduced, namely 
the combination of TLS with an additive isotropic B-value for
each atom.

Dale: You are quite correct that AFIX 137 breaks my argument 
about not depositing (SHELX) hydrogen atoms because they can
be recalculated with no loss of experimental information.
However to be fair, if you generate the first .ins file using
SHELXPRO (the recommended procedure) you will get AFIX 33
that doesn't have this problem. For Pavel and others unfamiliar
with SHELXL, AFIX 33 is a pure riding model with a staggered
methyl group but AFIX 137 assumes local threefold symmetry,
finds the initial torsion angle by a three-fold averaged fit
to the difference density and then refines the torsion angle
in the following cycles. Since this torsion angle is not 
given explicitly in the output files, if AFIX 137 hydrogens 
are not deposited, they cannot be regenerated except by a full
refinement against the experimental data. 
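
The difference between the two methyl models can be sketched in a few lines. The code below is not SHELXL's implementation; it is a generic internal-to-Cartesian construction, with a hypothetical function `methyl_hydrogens`, an illustrative C-H distance and angle, and made-up coordinates. It shows only that the three methyl H positions follow from the heavy atoms plus one torsion angle: with a fixed staggered torsion they can always be regenerated, whereas a refined torsion has to be known.

```python
import math

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _unit(u):
    norm = math.sqrt(sum(a * a for a in u))
    return tuple(a / norm for a in u)

def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """Place atom d with |c-d| = bond, angle b-c-d = angle_deg,
    and torsion a-b-c-d = torsion_deg."""
    angle, torsion = math.radians(angle_deg), math.radians(torsion_deg)
    bc = _unit(_sub(c, b))
    n = _unit(_cross(_sub(b, a), bc))   # normal to the a-b-c plane
    m = _cross(n, bc)                   # completes the local orthonormal frame
    local = (-bond * math.cos(angle),
             bond * math.sin(angle) * math.cos(torsion),
             bond * math.sin(angle) * math.sin(torsion))
    return tuple(c[i] + local[0] * bc[i] + local[1] * m[i] + local[2] * n[i]
                 for i in range(3))

def methyl_hydrogens(y, x, c, torsion_deg, bond=0.96, angle_deg=109.5):
    """Three H atoms on methyl carbon c (bonded to x, which is bonded to y),
    related by local threefold symmetry. bond and angle_deg are illustrative."""
    return [place_atom(y, x, c, bond, angle_deg, torsion_deg + k * 120.0)
            for k in range(3)]

# Made-up heavy-atom coordinates (Å)
y, x, c = (-0.5, 1.4, 0.0), (0.0, 0.0, 0.0), (1.5, 0.0, 0.0)
staggered = methyl_hydrogens(y, x, c, 180.0)   # fixed torsion: always reproducible
refined = methyl_hydrogens(y, x, c, 165.0)     # hypothetical refined torsion
print(math.dist(staggered[0], refined[0]))     # shift caused by the torsion alone
```

If the refined torsion is not stored anywhere, the second set of positions cannot be rebuilt from the heavy atoms alone, which is the point made above about AFIX 137.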

George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582


On Wed, 15 Sep 2010, George M. Sheldrick wrote:

 Dear Pavel,
 
 May I suggest that you take a look at the SHELX manual:
 http://shelx.uni-ac.gwdg.de/SHELX/shelx.pdf
 before sending your SHELX questions to CCP4bb? You might 
 even find some good ideas for implementing in phenix_refine! 
 
 For example,
 if you look up 'non-crystallographic symmetry' in the index 
 you will discover that SHELXL applies NCS in the form of
 restraints, not constraints, which has the advantage that it
 can be applied locally and in combination with all other
 restraints and constraints involving the same atoms. However 
 you will not find TLS in the index, because the credit for
 implementing this very useful concept should be given to 
 Martin Winn, Garib and Ethan, long after the current version 
 of SHELXL (and its manual) were released in 1997. And because 
 SHELXL only reads one instruction file (*.ins) and one 
 reflection file (*.hkl) but no other data files or libraries, 
 and FORTRAN will always be FORTRAN, the deposition of these 
 two files would be sufficient to define the refinement for 
 posterity.
 
 Best wishes, George
 
 Prof. George M. Sheldrick FRS
 Dept. Structural Chemistry,
 University of Goettingen,
 Tammannstr. 4,
 D37077 Goettingen, Germany
 Tel. +49-551-39-3021 or -3068
 Fax. +49-551-39-22582
 
 
 On Wed, 15 Sep 2010, Pavel Afonine wrote:
 
   Dear Felix,
  
   Shortly Shelxl knows crystallography best.
  
  I have no doubts about this. My questions were, though:
  
- how does the .ins file encode the information about NCS groups used in
refinement (atom selections for NCS groups, restraint weights for
different groups, etc.)?
   
- how does the .ins file encode the information about TLS (again, atom
selections for TLS groups, TLS matrices, etc.)? Relatedly, does it have a
concept of TLS and other components contributing to the total atomic
displacement parameter (ADP)?
   
- If I recall correctly, to fix (= not refine) a certain parameter (say
occupancy or B-factor) in SHELXL you need to add 10 to it. Is that
true? IMHO, this might lead to confusion if such a file gets deposited
to the PDB.
  
  Best,
  Pavel.

Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Sanishvili, Ruslan

Hi All,

I have not read all the messages in the thread, so my apologies if somebody
has already pointed out what I have to say.

There is a lot of talk about how this or that software treats the riding
hydrogens. What to do with the fact that, however these hydrogens are
treated in calculations, they are in most cases still assumed? Meents
et al. (http://scripts.iucr.org/cgi-bin/paper?xh0004) showed that proteins
are stripped of hydrogens during X-ray data collection. So, IMHO, it is a
good argument against depositing the H coordinates in the PDB.
Cheers,
N. 


Ruslan Sanishvili (Nukri), Ph.D.

GM/CA-CAT
Biosciences Division, ANL
9700 S. Cass Ave.
Argonne, IL 60439

Tel: (630)252-0665
Fax: (630)252-0667
rsanishv...@anl.gov


[ccp4bb] Deposition of riding H

2010-09-14 Thread Dr. Mark Mayer


Here's one for the community, which I'll post to both Phenix and CCP4 BBs.

Where does the crystallographic community stand on deposition of
coordinates with riding hydrogens?
Explicit H atoms are required for calculating all-atom clash scores with
MolProbity, and their use frequently gives better geometry (especially
at low resolution). Phenix uses explicit riding H for refinement, and
outputs these in the refined PDB. Refmac also uses riding H but does
not output H coordinates.

While depositing a series of structures refined at 1.4 - 2.75 Å with
Phenix, I got the following email from the RCSB, who asked that I
resupply coordinates without H for two of the structures. Since we
can't see H even at 1.4 Å, I don't understand why an arbitrary cutoff
of 1.5 Å was chosen, and also why explicit H atoms used in refinement
and geometry validation should be stripped from the file.


FROM RCSB

We encourage depositors not to use hydrogens in the final PDB file for
the low resolution structures (> 1.5 A). Please provide an updated PDB
file. We request you to use processed PDB file as a starting point for
making any corrections to the coordinates and/or re-refinement.
--
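
For anyone who does need to resupply coordinates without H, here is a minimal sketch of one way to do it. It assumes a standard fixed-column PDB file in which the element symbol occupies columns 77-78 of ATOM, HETATM and ANISOU records; files with a blank element column would need a different test. The function names and the demo records are hypothetical.

```python
def strip_hydrogens(pdb_lines):
    """Return the lines with hydrogen (and deuterium) atom records removed.

    Assumes the element symbol is present in columns 77-78.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM", "ANISOU")):
            element = line[76:78].strip().upper()
            if element in ("H", "D"):
                continue
        kept.append(line)
    return kept

def atom_line(serial, name, element):
    """Build a fixed-column ATOM record for the demo (dummy coordinates)."""
    return (f"ATOM  {serial:5d} {name:4s} ALA A   1    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{20.0:6.2f}"
            f"          {element:>2s}")

demo = [atom_line(1, " N  ", "N"),
        atom_line(2, " H  ", "H"),
        atom_line(3, " CA ", "C"),
        "END"]
for line in strip_hydrogens(demo):
    print(line)
```

Note that this sketch does not renumber atom serials or touch CONECT records, and it says nothing about whether the statistics reported for the model still apply once the hydrogens are gone.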

Mark

Re: [ccp4bb] Deposition of riding H

2010-09-14 Thread Ethan Merritt

On Tuesday 14 September 2010 10:34:00 am Dr. Mark Mayer wrote:
 Here's one for the community, which I'll post to both Phenix and CCP4 BBs.
 
 Where does the crystallographic community stand 
 on deposition of coordinates with riding 
 hydrogens?

I do not favor depositing riding hydrogen coordinates for
the same reason that I do not like the recent PDB preference 
for depositing ANISOU records for structures that have been
refined with TLS.

In both cases the enumeration of these many thousands of 
parameter values gives the strong, but false, impression that
they have been individually modeled.  They have not.

There are really only a dozen or so parameters in the riding
hydrogen model.  All those coordinates follow directly from
application of this same small set of values.

Similarly, there are really only 20 parameters per TLS group in
your model, no matter how many atoms you applied it to.
There is IMHO no justification for presenting the resulting model
in a form that makes it appear that 6 additional parameters per
atom have been modeled, when in fact that number is either 0 or 1.
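
The arithmetic behind this complaint can be sketched as follows. The counts of 20 parameters per TLS group, 6 values per ANISOU record, and 0 or 1 residual B value per atom come from the text above; the atom and group numbers are hypothetical.

```python
def adp_parameter_counts(n_atoms, n_tls_groups, residual_b_per_atom=True):
    """Compare the number of ADP values listed as ANISOU records with the
    number of ADP parameters actually in a TLS model."""
    apparent = 6 * n_atoms                      # six U_ij values per ANISOU record
    actual = 20 * n_tls_groups                  # 20 parameters per TLS group
    if residual_b_per_atom:
        actual += n_atoms                       # one isotropic B per atom
    return apparent, actual

# Hypothetical structure: 2500 atoms described by 4 TLS groups
apparent, actual = adp_parameter_counts(2500, 4)
print(apparent, actual)   # 15000 2580
```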

Ethan




-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

Re: [ccp4bb] Deposition of riding H

2010-09-14 Thread George M. Sheldrick


Even though SHELXL refinements often involve resolutions of 1.5A or 
better, I discourage SHELXL users from depositing their hydrogen 
coordinates. There are three reasons:

1. The C-H, N-H and O-H distances required to give the best fit to 
the electron density are significantly shorter than those required 
for molecular modeling and tests on non-bonded interactions (or
located by neutron diffraction). It is ESSENTIAL to recalculate 
the hydrogens at longer distances before using MolProbity and other 
validation software. 

2. There is considerable confusion concerning the names to be assigned
to the hydrogens. This is not made easier by the application of a
chirality test to -CH2- groups!

3. O-H hydrogens are particularly difficult to 'see' and the geometrical
calculation of their positions is often ambiguous. The same applies
to the protonation states of histidines and carboxylic acids. In 
addition such hydrogen positions are often disordered.

For refinement I recommend including C-H and N-H but not O-H hydrogens.
For very high resolution structures this reduces Rfree by 0.5-1.0% and
clearly improves the model. At all resolutions the antibumping 
restraints involving hydrogens are useful. 
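
As an illustration of point 1, the sketch below moves a hydrogen along its existing bond direction to a longer target distance. This simple rescaling is an assumption made for illustration; validation tools typically rebuild hydrogens from the heavy-atom geometry instead. The function name and the two distances used in the example are illustrative, not values prescribed by any program.

```python
import math

def rescale_bond(heavy, hydrogen, target_length):
    """Return a new hydrogen position on the heavy->hydrogen line at
    target_length (Å) from the heavy atom."""
    vector = [h - a for a, h in zip(heavy, hydrogen)]
    length = math.sqrt(sum(c * c for c in vector))
    if length == 0.0:
        raise ValueError("heavy atom and hydrogen coincide")
    scale = target_length / length
    return tuple(a + scale * c for a, c in zip(heavy, vector))

# Illustrative case: a C-H bond of 0.96 Å stretched to 1.09 Å
carbon = (0.0, 0.0, 0.0)
hydrogen = (0.96, 0.0, 0.0)
print(rescale_bond(carbon, hydrogen, 1.09))
```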

George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582



Re: [ccp4bb] Deposition of riding H

2010-09-14 Thread Pavel Afonine


 Hi Ethan,


I do not favor depositing riding hydrogen coordinates for
the same reason that I do not like the recent PDB preference
for depositing ANISOU records for structures that have been
refined with TLS.

In both cases the enumeration of these many thousands of
parameter values gives the strong, but false, impression that
they have been individually modeled.  They have not.


Following this logic, one could say that the individual x,y,z coordinates 
listed in ATOM records for a structure refined at very low resolution 
using rigid-body refinement only (or torsion-angle simulated annealing 
only) may also give the false impression that these coordinates were 
refined individually.


Pavel.

Re: [ccp4bb] Deposition of riding H

2010-09-14 Thread Ethan Merritt

On Tuesday 14 September 2010 12:44:37 pm Pavel Afonine wrote:
   Hi Ethan,
 
  I do not favor depositing riding hydrogen coordinates for
  the same reason that I do not like the recent PDB preference
  for depositing ANISOU records for structures that have been
  refined with TLS.
 
  In both cases the enumeration of these many thousands of
  parameter values gives the strong, but false, impression that
  they have been individually modeled.  They have not.
 
 following this logic one could say that the individual x,y,z coordinates 
 listed in ATOM records for a structure refined at very low resolution 
 using rigid-body refinement only (or torsion angle Simulated Annealing 
 only) also may make a false impression that these coordinates were 
 refined individually.

I agree with this, at least for the case of true rigid-body.
But you would still need to describe somehow the coordinates of all the
atoms in your rigid model.  If it came straight out of the PDB, then
in principle it would suffice to give the PDB+CHAIN code and the
rotation/translation matrix.  But if any adjustments were made, which
is I think typical if only to correct for sequence differences, then as a 
practical matter you still need to provide the true starting coordinates.
And at that point you might as well provide the ending coordinates instead,
since it's the same amount of information.
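
To make the "PDB+CHAIN code plus rotation/translation" idea concrete, here is a minimal sketch of regenerating the placed coordinates from a starting model. The rotation matrix, translation vector and starting coordinates are made up for illustration, and the function name is hypothetical.

```python
import math

def apply_rigid_body(coords, rotation, translation):
    """Apply x' = R x + t to every (x, y, z) tuple in coords."""
    return [tuple(sum(rotation[i][j] * xyz[j] for j in range(3)) + translation[i]
                  for i in range(3))
            for xyz in coords]

# Hypothetical operator: 90 degree rotation about z, then a 10 Å shift along x
angle = math.radians(90.0)
rotation = [[math.cos(angle), -math.sin(angle), 0.0],
            [math.sin(angle),  math.cos(angle), 0.0],
            [0.0,              0.0,             1.0]]
translation = (10.0, 0.0, 0.0)

start = [(1.0, 0.0, 0.0), (0.0, 2.0, 5.0)]   # stand-in for the source chain
for xyz in apply_rigid_body(start, rotation, translation):
    print(tuple(round(v, 3) for v in xyz))
```

The operator carries only six independent parameters (three rotational, three translational), however many atoms it is applied to.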

Ethan

-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742

Re: [ccp4bb] Deposition of riding H

2010-09-14 Thread Pavel Afonine


 Hi Ethan,

yes, you are absolutely right, you would need to define somehow where 
your model is... But you could, at least hypothetically, use non-atomic 
models for this! Like cylinders (*), spheres and similar shapes. This is 
what the density looks like at those super-low resolutions.


(*)
BOROVIKOV, B. A., VAINSTEIN, B. K., GELFAND, I. M. & KALININ, D. I. 
(1979). Kristallografiya, 24, 227-238.


KALININ, D. I. (1980). Kristallografiya, 25, 535-544.

V. Yu. Lunin, A. G. Urzhumtsev, E. A. Vernoslova, Yu. N. Chirgadze, N. 
A. Neveskaya and N. P. Fomenkova. Acta Cryst. (1985). A41, 166-171.


Anyway, it looks like we are about to diverge from the original subject, 
so I'll stop rambling :-)


All the best!
Pavel.


Re: [ccp4bb] Deposition of riding H

2010-09-14 Thread Ed Pozharski

Mark,

On Tue, 2010-09-14 at 13:34 -0400, Dr. Mark Mayer wrote:
 Where does the crystallographic community stand 
 on deposition of coordinates with riding 
 hydrogens?

Surely the community is divided on this.  Arguments could be made both
ways.  Personally, I think that riding hydrogens can be recalculated if
necessary using the same algorithms/parameters employed during refinement.
It is true that different programs may use different parameter sets, and
reproducing exactly the same set of riding hydrogens may be difficult
without knowing exactly which version was used and being able to unearth
that old version of the software.  This may preclude one from getting
exactly the same riding hydrogen positions (how large that difference
would be I honestly don't know).  But really, who cares?  What is the
benefit of knowing exactly where this or that riding hydrogen was?
Maybe there is some benefit of such a comparison in method development,
but I would think it's rather limited.

I wholeheartedly agree with Ethan (even though that is not strictly what
he said :) that some minor benefit here is completely negated by the
danger of the perception that models somehow tell us where hydrogens are.
It is bad enough that, in my estimate, roughly 10% of atomic coordinates
in the PDB are unwarranted as they come from disordered residues with
exact spatial positions unsupported by electron density.  Let's not add
more things that PDB users may over-interpret.

Cheers,

Ed.
