Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-30 Thread Jan Dohnalek
Dear all,
I had this experience: going pedantically through the individual points that
the RSRZ and other validation statistics in the form were reporting, in the
vast majority of cases nothing was wrong at all. So the validation seems to be
somewhat overdoing its job. That is not bad in itself, but we are losing quite
a bit of contrast between the really serious issues and all the others.

Jan Dohnalek




On Wed, Nov 30, 2016 at 2:33 AM, dusan turk <dusan.t...@ijs.si> wrote:

> Guys,
>
> I have two issues to add here:
>
> 1. RSRZ validation does not tolerate chain IDs longer than 1 character, so
> it defeats one of the essential reasons why the mmCIF format was introduced
> (to enable deposition of large structures in a single file).
>
> 2. I have noticed in the validation report of my own structure (4PIA) that
> the RSRZ does not ALWAYS work right. For example, "PHE 63" is well resolved,
> with a hole in the ring, yet the validation declares it a density
> outlier.  Besides, there are several other residues in this structure that
> "sit" well in the density but are considered outliers, whereas several
> whose side chains have no density are not listed.
>
> Has anyone else had a similar experience?
>
> Taken together, all these remarks suggest that something needs to be done
> with the RSRZ software or the density validation procedure to resolve these issues.
>
> 
>
> best,
> dusan
>
>
> On 30/11/16 01:00, CCP4BB automatic digest system wrote:
>
>> Date:Mon, 28 Nov 2016 20:35:44 -0800
>> From:Pavel Afonine<pafon...@gmail.com>
>> Subject: Re: Calculation of RSRZ Score in PDB Validation Reports
>>
>> I find Lothar's comments regarding H and RSRZ excellent! I would think of
>> it as a pretty much bug report. I hope developers at that end listen. This
>> goes very well in line with Phoebe's comment earlier today.
>>
>> Pavel
>>
>> On Mon, Nov 28, 2016 at 2:51 PM, Dale Tronrud<de...@daletronrud.com> wrote:
>>
>>> On 11/28/2016 12:52 PM, esse...@helix.nih.gov wrote:
>>>>> I found that one can get RSRZ to go way down by loosening the geometry
>>>>> restraints.  The result is a crappy structure and I don't recommend doing
>>>>> that, but it does get all the atoms crammed into some sort of density.
>>>>
>>>> Your observation is quite interesting. I can add this: when we were working
>>>> with low to medium resolution structures, deleting the hydrogen atoms from
>>>> the model after refinement moved the very bad RSRZ statistic to about the
>>>> average in the given resolution range! Note, no re-refinement was done just
>>>> a simple deletion of the riding H-atoms. I find this to be odd given the
>>>> fact that, say the phenix developers favor the inclusion of H-atoms on
>>>> riding positions even in cases of low resolution structures. (I assume the
>>>> refmac5 and BUSTER-TNT developers have also a favorable opinion about
>>>> including H-atoms in the final model - and during refinement).
>>>>
>>>> In my mind, it may be tempting to delete H-atoms to improve this statistic but
>>>> when you use them in refinement they should be included regardless of the
>>>> outcome of the RSRZ analysis.
>>>
>>> Of course, if you trick a validation statistic like this you haven't
>>> accomplished anything.  All you are saying is that one should rank RSRZ
>>> scores with and without hydrogen atoms separately.  Perhaps you should
>>> suggest that to the PDB validation people.
>>>
>>> Dale Tronrud
>>>
>>>>> RSRZ, in my most humble of opinions, seems like one of those statistics that
>>>>> is far more useful in theory than reality.   Particularly for
>>>>> medium-resolution structures, the fit of each entire side chain to the density
>>>>> is likely to be imperfect because the density is imperfect, especially toward
>>>>> the tips of those side chains.
>>>>>
>>>>> Then again, it can be a good flag for bits of the structure worth a second

Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-29 Thread Soisson, Stephen M
And again, has anyone seen the code to even know what's going on "under the 
hood"?

Thanks,
Steve

> On Nov 29, 2016, at 8:35 PM, dusan turk <dusan.t...@ijs.si> wrote:
> 
> Guys,
> 
> I have two issues to add here:
>
> 1. RSRZ validation does not tolerate chain IDs longer than 1 character,
> so it defeats one of the essential reasons why the mmCIF format was
> introduced (to enable deposition of large structures in a single file).
>
> 2. I have noticed in the validation report of my own structure (4PIA) that
> the RSRZ does not ALWAYS work right. For example, "PHE 63" is well
> resolved, with a hole in the ring, yet the validation declares it a
> density outlier.  Besides, there are several other residues in this
> structure that "sit" well in the density but are considered outliers,
> whereas several whose side chains have no density are not listed.
>
> Has anyone else had a similar experience?
>
> Taken together, all these remarks suggest that something needs to be done
> with the RSRZ software or the density validation procedure to resolve these issues.
> 
> 
> 
> best,
> dusan
> 
> 
>> On 30/11/16 01:00, CCP4BB automatic digest system wrote:
>> Date:Mon, 28 Nov 2016 20:35:44 -0800
>> From:Pavel Afonine<pafon...@gmail.com>
>> Subject: Re: Calculation of RSRZ Score in PDB Validation Reports
>> 
>> I find Lothar's comments regarding H and RSRZ excellent! I would think of
>> it as a pretty much bug report. I hope developers at that end listen. This
>> goes very well in line with Phoebe's comment earlier today.
>> 
>> Pavel
>> 
>>> On Mon, Nov 28, 2016 at 2:51 PM, Dale Tronrud<de...@daletronrud.com>  wrote:
>>> 
>>> On 11/28/2016 12:52 PM,esse...@helix.nih.gov  wrote:
>>>>> I found that one can get RSRZ to go way down by loosening the geometry
>>>>> restraints.  The result is a crappy structure and I don't recommend
>>> doing
>>>>> that, but it does get all the atoms crammed into some sort of density.
>>>>   Your observation is quite interesting. I can add this: when we were
>>> working
>>>> with low to medium resolution structures, deleting the hydrogen atoms
>>> from
>>>> the model after refinement moved the very bad RSRZ statistic to about the
>>>> average in the given resolution range! Note, no re-refinement was done
>>> just
>>>> a simple deletion of the riding H-atoms. I find this to be odd given the
>>>> fact that, say the phenix developers favor the inclusion of H-atoms on
>>>> riding positions even in cases of low resolution structures. (I assume
>>> the
>>>> refmac5 and BUSTER-TNT developers have also a favorable opinion about
>>>> including H-atoms in the final model - and during refinement).
>>>> 
>>>> In my mind, it may be tempting to delete H-atoms to improve this
>>> statistic but
>>>> when you use them in refinement they should be included regardless of the
>>>> outcome of the RSRZ analysis.
>>> Of course, if you trick a validation statistic like this you haven't
>>> accomplished anything.  All you are saying is that one should rank RSRZ
>>> scores with and without hydrogen atoms separately.  Perhaps you should
>>> suggest that to the PDB validation people.
>>> 
>>> Dale Tronrud
>>>>> RSRZ, in my most humble of opinions, seems like one of those statistics
>>> that
>>>>> is far more useful in theory than reality.   Particularly for
>>>>> medium-resolution structures, the fit of each entire side chain to the
>>> density
>>>>> is likely to be imperfect because the density is imperfect, especially
>>> toward
>>>>> the tips of those side chains.
>>>>> 
>>>>> Then again, it can be a good flag for bits of the structure worth a
>>> second
>>>>> look in rebuilding.
>>>>   The latter is certainly true. It may mean that the developers of RSRZ
>>>> analysis need to tune it a bit to make it fully useful.
>>>> 
>>>> L.
>>>> 
>>>>> 
>>>>> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Matthew
>>>>> Bratkowski [mab...@cornell.edu]
>>>>> Sent: Tuesday, November 22, 2016 10:12 AM
>>>>> To:CCP4BB@JISCMAIL.AC.UK
>>>>> Subject: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
>>>

Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-29 Thread dusan turk

Guys,

I have two issues to add here:

1. RSRZ validation does not tolerate chain IDs longer than 1 character,
so it defeats one of the essential reasons why the mmCIF format was
introduced (to enable deposition of large structures in a single file).

2. I have noticed in the validation report of my own structure (4PIA) that
the RSRZ does not ALWAYS work right. For example, "PHE 63" is well
resolved, with a hole in the ring, yet the validation declares it a
density outlier.  Besides, there are several other residues in this
structure that "sit" well in the density but are considered outliers,
whereas several whose side chains have no density are not listed.

Has anyone else had a similar experience?

Taken together, all these remarks suggest that something needs to be done
with the RSRZ software or the density validation procedure to resolve these issues.
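
For the first issue, a quick pre-check of a model is to list the chain IDs that are longer than one character. This is only a minimal sketch in Python: it assumes the chain IDs have already been pulled out of the mmCIF file (for example from the `_atom_site.auth_asym_id` column), and the function name is made up for illustration, not part of any validation software.

```python
def long_chain_ids(chain_ids):
    """Return the sorted, de-duplicated chain IDs longer than one character."""
    return sorted({cid for cid in chain_ids if len(cid) > 1})


# Hypothetical chain IDs, one per atom record, from a large assembly.
chains = ["A", "B", "AA", "AB", "A", "AA"]
print(long_chain_ids(chains))
```

An empty list would mean every chain ID fits in a single character.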




best,
dusan


On 30/11/16 01:00, CCP4BB automatic digest system wrote:

> Date: Mon, 28 Nov 2016 20:35:44 -0800
> From: Pavel Afonine<pafon...@gmail.com>
> Subject: Re: Calculation of RSRZ Score in PDB Validation Reports
>
> I find Lothar's comments regarding H and RSRZ excellent! I would think of
> it as a pretty much bug report. I hope developers at that end listen. This
> goes very well in line with Phoebe's comment earlier today.
>
> Pavel
>
> On Mon, Nov 28, 2016 at 2:51 PM, Dale Tronrud<de...@daletronrud.com> wrote:
>
>> On 11/28/2016 12:52 PM, esse...@helix.nih.gov wrote:
>>>> I found that one can get RSRZ to go way down by loosening the geometry
>>>> restraints.  The result is a crappy structure and I don't recommend doing
>>>> that, but it does get all the atoms crammed into some sort of density.
>>>
>>> Your observation is quite interesting. I can add this: when we were working
>>> with low to medium resolution structures, deleting the hydrogen atoms from
>>> the model after refinement moved the very bad RSRZ statistic to about the
>>> average in the given resolution range! Note, no re-refinement was done just
>>> a simple deletion of the riding H-atoms. I find this to be odd given the
>>> fact that, say the phenix developers favor the inclusion of H-atoms on
>>> riding positions even in cases of low resolution structures. (I assume the
>>> refmac5 and BUSTER-TNT developers have also a favorable opinion about
>>> including H-atoms in the final model - and during refinement).
>>>
>>> In my mind, it may be tempting to delete H-atoms to improve this statistic but
>>> when you use them in refinement they should be included regardless of the
>>> outcome of the RSRZ analysis.
>>
>> Of course, if you trick a validation statistic like this you haven't
>> accomplished anything.  All you are saying is that one should rank RSRZ
>> scores with and without hydrogen atoms separately.  Perhaps you should
>> suggest that to the PDB validation people.
>>
>> Dale Tronrud
>>
>>>> RSRZ, in my most humble of opinions, seems like one of those statistics that
>>>> is far more useful in theory than reality.   Particularly for
>>>> medium-resolution structures, the fit of each entire side chain to the density
>>>> is likely to be imperfect because the density is imperfect, especially toward
>>>> the tips of those side chains.
>>>>
>>>> Then again, it can be a good flag for bits of the structure worth a second
>>>> look in rebuilding.
>>>
>>> The latter is certainly true. It may mean that the developers of RSRZ
>>> analysis need to tune it a bit to make it fully useful.
>>>
>>> L.
>>>
>>>> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Matthew
>>>> Bratkowski [mab...@cornell.edu]
>>>> Sent: Tuesday, November 22, 2016 10:12 AM
>>>> To: CCP4BB@JISCMAIL.AC.UK
>>>> Subject: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
>>>>
>>>> Hello all,
>>>>
>>>> I was wondering if anyone knew how the RSRZ score was calculated in the
>>>> protein data bank validation reports and how useful of a metric this actually
>>>> is for structure validation?  I am trying to improve this score on a structure
>>>> that I am working on, but I'm not really sure where to begin.  From my
>>>> understanding, the score is based on the number of RSRZ outliers with a
>>>> score > 2.  In my case, I have several residues with scores between 2 and 4, but at
>>>> least by eye, fit to the electron density does not look that bad.  Hence, I
>>>> can't justify deleting them to try to improve the score.  If the score is just
>>>> based on percent of outlier residues, then for instance wouldn't a structure
>>>> with say 20 residues modeled with no corresponding electron density have the
>>>> same score as a structure with 20 residues with RSRZ values of say 2.5?
>>>>
>>>> I was also wondering how the resolution of the structure relates to the score?
>>>> Glancing through several pdb validation reports, I noticed some structure
>>>> with low resolution (3.5 A or lower) with relatively high scores, while others
>>>> with high resolution (2 A or higher) getting low scores.  It is reasonable to
>>>> assume that a structure of lower than 3.5 A would be missing several side
>>>> chains and may also have some ambiguous main chain

Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-29 Thread Katherine Sippel
I started a ccp4 thread a few years ago about RSRZ score calculations
favoring trimmed side chains because they produce better scores. Based on
what I could find at that time, it looked like the density of your
structure was compared to the density of that residue type in submitted
PDBs of similar resolutions. However, I could be seriously mistaken.
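
If that reading is right, the per-residue score would be an ordinary Z-score of the residue's real-space R value (RSR) against a reference set for the same residue type at similar resolution. A rough sketch in Python follows. The reference values are made up, the function name is hypothetical, and the use of the sample standard deviation is an assumption; the actual wwPDB procedure may differ.

```python
from statistics import mean, stdev


def rsrz(rsr, reference_rsr):
    """Z-score of one residue's RSR against reference RSR values for the
    same residue type at similar resolution (higher means a worse fit)."""
    return (rsr - mean(reference_rsr)) / stdev(reference_rsr)


# Made-up reference RSR values for one residue type in one resolution bin.
reference = [0.10, 0.12, 0.14, 0.16, 0.18]
z = rsrz(0.22, reference)
print(round(z, 2), z > 2)  # a residue counts as an outlier when RSRZ > 2
```

Under this scheme a trimmed side chain could indeed score better, since fewer poorly fitting atoms lower the RSR while the reference distribution stays the same.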

Cheers,
Katherine

On Mon, Nov 28, 2016 at 11:46 PM, Ethan Merritt <merr...@u.washington.edu>
wrote:

> On Monday, 28 November 2016 08:35:44 PM Pavel Afonine wrote:
> > I find Lothar's comments regarding H and RSRZ excellent! I would think of
> > it as a pretty much bug report. I hope developers at that end listen.
> This
> > goes very well in line with Phoebe's comment earlier today.
>
> I guess I'm a bit surprised that adding or subtracting hydrogens from the
> model
> without re-refining or at least re-calculating Fc would affect RSRZ at all.
> I had thought that RSRZ was obtained by comparing density in an Fc map
> (or probably mFo-DFc) with the corresponding density in an Fo map.
> I thought that the coordinates were used only to determine the per-residue
> region of the map to be compared.
>
> Going back to the 2004 Kleywegt paper that the PDB cites for calculation of
> RSRZ I see that it's a bit ambiguous exactly what maps are being compared.
> So maybe I'm wrong and the current coordinates are used directly to get
> local "Fc density" by expanding 3D Gaussians without reference to a
> previously
> calculated map from refined phases.
>
> Can anyone clarify exactly what maps are being compared during wwPDB
> validation?
>
> Ethan
>
> >
> > Pavel
> >
> > On Mon, Nov 28, 2016 at 2:51 PM, Dale Tronrud <de...@daletronrud.com>
> wrote:
> >
> > > On 11/28/2016 12:52 PM, esse...@helix.nih.gov wrote:
> > > >> I found that one can get RSRZ to go way down by loosening the
> geometry
> > > >> restraints.  The result is a crappy structure and I don't recommend
> > > doing
> > > >> that, but it does get all the atoms crammed into some sort of
> density.
> > > >
> > > >   Your observation is quite interesting. I can add this: when we were
> > > working
> > > > with low to medium resolution structures, deleting the hydrogen atoms
> > > from
> > > > the model after refinement moved the very bad RSRZ statistic to
> about the
> > > > average in the given resolution range! Note, no re-refinement was
> done
> > > just
> > > > a simple deletion of the riding H-atoms. I find this to be odd given
> the
> > > > fact that, say the phenix developers favor the inclusion of H-atoms
> on
> > > > riding positions even in cases of low resolution structures. (I
> assume
> > > the
> > > > refmac5 and BUSTER-TNT developers have also a favorable opinion about
> > > > including H-atoms in the final model - and during refinement).
> > > >
> > > > In my mind, it may be tempting to delete H-atoms to improve this
> > > statistic but
> > > > when you use them in refinement they should be included regardless
> of the
> > > > outcome of the RSRZ analysis.
> > >
> > > Of course, if you trick a validation statistic like this you haven't
> > > accomplished anything.  All you are saying is that one should rank RSRZ
> > > scores with and without hydrogen atoms separately.  Perhaps you should
> > > suggest that to the PDB validation people.
> > >
> > > Dale Tronrud
> > > >
> > > >>
> > > >> RSRZ, in my most humble of opinions, seems like one of those
> statistics
> > > that
> > > >> is far more useful in theory than reality.   Particularly for
> > > >> medium-resolution structures, the fit of each entire side chain to
> the
> > > density
> > > >> is likely to be imperfect because the density is imperfect,
> especially
> > > toward
> > > >> the tips of those side chains.
> > > >>
> > > >> Then again, it can be a good flag for bits of the structure worth a
> > > second
> > > >> look in rebuilding.
> > > >
> > > >   The latter is certainly true. It may mean that the developers of
> RSRZ
> > > > analysis need to tune it a bit to make it fully useful.
> > > >
> > > > L.
> > > >
> > > >>
> > > >> 
> > > >> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on beha

Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-29 Thread Soisson, Stephen M
Is the code publicly available for the RSRZ calculation?

Thanks,

Steve

-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ethan 
Merritt
Sent: Tuesday, November 29, 2016 12:46 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

On Monday, 28 November 2016 08:35:44 PM Pavel Afonine wrote:
> I find Lothar's comments regarding H and RSRZ excellent! I would think of
> it as a pretty much bug report. I hope developers at that end listen. This
> goes very well in line with Phoebe's comment earlier today.

I guess I'm a bit surprised that adding or subtracting hydrogens from the model
without re-refining or at least re-calculating Fc would affect RSRZ at all.
I had thought that RSRZ was obtained by comparing density in an Fc map
(or probably mFo-DFc) with the corresponding density in an Fo map.
I thought that the coordinates were used only to determine the per-residue
region of the map to be compared.

Going back to the 2004 Kleywegt paper that the PDB cites for calculation of
RSRZ I see that it's a bit ambiguous exactly what maps are being compared.
So maybe I'm wrong and the current coordinates are used directly to get
local "Fc density" by expanding 3D Gaussians without reference to a previously
calculated map from refined phases.  

Can anyone clarify exactly what maps are being compared during wwPDB
validation?
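
For reference, the real-space R value itself is usually defined as RSR = sum|rho_obs - rho_calc| / sum|rho_obs + rho_calc|, summed over the map grid points assigned to a residue. The Python sketch below implements only that formula. It assumes the two densities have already been sampled at the same grid points, and it deliberately leaves open the question asked above, namely which maps supply `rho_obs` and `rho_calc` in the wwPDB pipeline.

```python
def real_space_r(rho_obs, rho_calc):
    """RSR = sum|rho_obs - rho_calc| / sum|rho_obs + rho_calc| over the
    grid points assigned to one residue."""
    if len(rho_obs) != len(rho_calc):
        raise ValueError("density samples must cover the same grid points")
    num = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    den = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    return num / den


# Made-up density values at four grid points around one residue.
obs = [1.0, 0.8, 0.5, 0.2]
calc = [0.9, 0.9, 0.3, 0.4]
print(real_space_r(obs, calc))
```

If `rho_calc` is built directly from the current coordinates, adding or removing atoms changes it without any re-refinement, which would be consistent with the hydrogen observation reported earlier in the thread.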

Ethan

> 
> Pavel
> 
> On Mon, Nov 28, 2016 at 2:51 PM, Dale Tronrud <de...@daletronrud.com> wrote:
> 
> > On 11/28/2016 12:52 PM, esse...@helix.nih.gov wrote:
> > >> I found that one can get RSRZ to go way down by loosening the geometry
> > >> restraints.  The result is a crappy structure and I don't recommend
> > doing
> > >> that, but it does get all the atoms crammed into some sort of density.
> > >
> > >   Your observation is quite interesting. I can add this: when we were
> > working
> > > with low to medium resolution structures, deleting the hydrogen atoms
> > from
> > > the model after refinement moved the very bad RSRZ statistic to about the
> > > average in the given resolution range! Note, no re-refinement was done
> > just
> > > a simple deletion of the riding H-atoms. I find this to be odd given the
> > > fact that, say the phenix developers favor the inclusion of H-atoms on
> > > riding positions even in cases of low resolution structures. (I assume
> > the
> > > refmac5 and BUSTER-TNT developers have also a favorable opinion about
> > > including H-atoms in the final model - and during refinement).
> > >
> > > In my mind, it may be tempting to delete H-atoms to improve this
> > statistic but
> > > when you use them in refinement they should be included regardless of the
> > > outcome of the RSRZ analysis.
> >
> > Of course, if you trick a validation statistic like this you haven't
> > accomplished anything.  All you are saying is that one should rank RSRZ
> > scores with and without hydrogen atoms separately.  Perhaps you should
> > suggest that to the PDB validation people.
> >
> > Dale Tronrud
> > >
> > >>
> > >> RSRZ, in my most humble of opinions, seems like one of those statistics
> > that
> > >> is far more useful in theory than reality.   Particularly for
> > >> medium-resolution structures, the fit of each entire side chain to the
> > density
> > >> is likely to be imperfect because the density is imperfect, especially
> > toward
> > >> the tips of those side chains.
> > >>
> > >> Then again, it can be a good flag for bits of the structure worth a
> > second
> > >> look in rebuilding.
> > >
> > >   The latter is certainly true. It may mean that the developers of RSRZ
> > > analysis need to tune it a bit to make it fully useful.
> > >
> > > L.
> > >
> > >>
> > >> 
> > >> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Matthew
> > >> Bratkowski [mab...@cornell.edu]
> > >> Sent: Tuesday, November 22, 2016 10:12 AM
> > >> To: CCP4BB@JISCMAIL.AC.UK
> > >> Subject: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
> > >>
> > >> Hello all,
> > >>
> > >> I was wondering if anyone knew how the RSRZ score was calculated in the
> > >> protein data bank validation reports and how useful of a metric this
> > actually
> > >> is for structure validation?  I am trying to improve this score 

Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-28 Thread Pavel Afonine
I find Lothar's comments regarding H and RSRZ excellent! I would think of
it as a pretty much bug report. I hope developers at that end listen. This
goes very well in line with Phoebe's comment earlier today.

Pavel

On Mon, Nov 28, 2016 at 2:51 PM, Dale Tronrud <de...@daletronrud.com> wrote:

> On 11/28/2016 12:52 PM, esse...@helix.nih.gov wrote:
> >> I found that one can get RSRZ to go way down by loosening the geometry
> >> restraints.  The result is a crappy structure and I don't recommend
> doing
> >> that, but it does get all the atoms crammed into some sort of density.
> >
> >   Your observation is quite interesting. I can add this: when we were
> working
> > with low to medium resolution structures, deleting the hydrogen atoms
> from
> > the model after refinement moved the very bad RSRZ statistic to about the
> > average in the given resolution range! Note, no re-refinement was done
> just
> > a simple deletion of the riding H-atoms. I find this to be odd given the
> > fact that, say the phenix developers favor the inclusion of H-atoms on
> > riding positions even in cases of low resolution structures. (I assume
> the
> > refmac5 and BUSTER-TNT developers have also a favorable opinion about
> > including H-atoms in the final model - and during refinement).
> >
> > In my mind, it may be tempting to delete H-atoms to improve this
> statistic but
> > when you use them in refinement they should be included regardless of the
> > outcome of the RSRZ analysis.
>
> Of course, if you trick a validation statistic like this you haven't
> accomplished anything.  All you are saying is that one should rank RSRZ
> scores with and without hydrogen atoms separately.  Perhaps you should
> suggest that to the PDB validation people.
>
> Dale Tronrud
> >
> >>
> >> RSRZ, in my most humble of opinions, seems like one of those statistics
> that
> >> is far more useful in theory than reality.   Particularly for
> >> medium-resolution structures, the fit of each entire side chain to the
> density
> >> is likely to be imperfect because the density is imperfect, especially
> toward
> >> the tips of those side chains.
> >>
> >> Then again, it can be a good flag for bits of the structure worth a
> second
> >> look in rebuilding.
> >
> >   The latter is certainly true. It may mean that the developers of RSRZ
> > analysis need to tune it a bit to make it fully useful.
> >
> > L.
> >
> >>
> >> 
> >> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Matthew
> >> Bratkowski [mab...@cornell.edu]
> >> Sent: Tuesday, November 22, 2016 10:12 AM
> >> To: CCP4BB@JISCMAIL.AC.UK
> >> Subject: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
> >>
> >> Hello all,
> >>
> >> I was wondering if anyone knew how the RSRZ score was calculated in the
> >> protein data bank validation reports and how useful of a metric this
> actually
> >> is for structure validation?  I am trying to improve this score on a
> structure
> >> that I am working on, but I'm not really sure where to begin.  From my
> >> understanding, the score is based on the number of RSRZ outliers with a
> >> score > 2.  In my case, I have several residues with scores between 2 and 4, but at
> >> least by eye, fit to the electron density does not look that bad.
> Hence, I
> >> can't justify deleting them to try to improve the score.  If the score
> is just
> >> based on percent of outlier residues, then for instance wouldn't a
> structure
> >> with say 20 residues modeled with no corresponding electron density
> have the
> >> same score as a structure with 20 residues with RSRZ values of say 2.5?
> >>
> >>
> >> I was also wondering how the resolution of the structure relates to the
> score?
> >>  Glancing through several pdb validation reports, I noticed some
> structure
> >> with low resolution (3.5 A or lower) with relatively high scores, while
> others
> >> with high resolution (2 A or higher) getting low scores.  It is
> reasonable to
> >> assume that a structure of lower than 3.5 A would be missing several
> side
> >> chains and may also have some ambiguous main chain electron density,
> which
> >> should in theory increase the RSRZ score.  While of course every
> structure is
> >> different and the quality of it due to the rigor of the person building
> the
> >> model, I was wondering if there were any general trends related to
> resolution
> >> and RSRZ score.
> >>
> >> Thanks,
> >> Matt
> >>
> >
>
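
Matt's question about the summary score can be made concrete. If the reported figure is simply the percentage of residues with RSRZ > 2, as he understands it, then how far each outlier lies above the cutoff has no effect. A small Python sketch with made-up numbers and a hypothetical function name:

```python
def percent_rsrz_outliers(rsrz_values, cutoff=2.0):
    """Percentage of residues whose RSRZ exceeds the cutoff."""
    outliers = sum(1 for z in rsrz_values if z > cutoff)
    return 100.0 * outliers / len(rsrz_values)


# Two hypothetical 200-residue structures with 20 outliers each.
mild = [0.5] * 180 + [2.5] * 20     # outliers just above the cutoff
severe = [0.5] * 180 + [8.0] * 20   # outliers with essentially no density
print(percent_rsrz_outliers(mild), percent_rsrz_outliers(severe))
```

Both structures come out with the same percentage, so a count-based score alone cannot separate the two cases he describes.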


Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-28 Thread Dale Tronrud
On 11/28/2016 12:52 PM, esse...@helix.nih.gov wrote:
>> I found that one can get RSRZ to go way down by loosening the geometry
>> restraints.  The result is a crappy structure and I don't recommend doing
>> that, but it does get all the atoms crammed into some sort of density.
> 
>   Your observation is quite interesting. I can add this: when we were working
> with low to medium resolution structures, deleting the hydrogen atoms from
> the model after refinement moved the very bad RSRZ statistic to about the
> average in the given resolution range! Note, no re-refinement was done just
> a simple deletion of the riding H-atoms. I find this to be odd given the
> fact that, say the phenix developers favor the inclusion of H-atoms on
> riding positions even in cases of low resolution structures. (I assume the
> refmac5 and BUSTER-TNT developers have also a favorable opinion about
> including H-atoms in the final model - and during refinement).
> 
> In my mind, it may be tempting to delete H-atoms to improve this statistic but
> when you use them in refinement they should be included regardless of the
> outcome of the RSRZ analysis.

   Of course, if you trick a validation statistic like this you haven't
accomplished anything.  All you are saying is that one should rank RSRZ
scores with and without hydrogen atoms separately.  Perhaps you should
suggest that to the PDB validation people.
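
As a sketch of what separate ranking could look like, the Python below ranks an entry only against reference entries with the same hydrogen status. The reference numbers are made up and the function names are hypothetical; this is not how the PDB currently computes its percentiles.

```python
def percentile_rank(value, population):
    """Percentage of the population with a worse (higher) value."""
    worse = sum(1 for v in population if v > value)
    return 100.0 * worse / len(population)


# Made-up "percent RSRZ outliers" for reference entries, split by whether
# the deposited model contains hydrogen atoms.
reference = {
    True: [4.0, 6.0, 8.0, 10.0, 12.0],
    False: [1.0, 2.0, 3.0, 4.0, 5.0],
}


def rank_entry(percent_outliers, has_hydrogens):
    """Rank an entry only against reference entries with the same H status."""
    return percentile_rank(percent_outliers, reference[has_hydrogens])


print(rank_entry(5.0, True), rank_entry(5.0, False))
```

With these invented numbers the same 5% outlier rate ranks well among models with hydrogens and poorly among models without, which is the distortion a pooled ranking would hide.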

Dale Tronrud
> 
>>
>> RSRZ, in my most humble of opinions, seems like one of those statistics that
>> is far more useful in theory than reality.   Particularly for
>> medium-resolution structures, the fit of each entire side chain to the 
>> density
>> is likely to be imperfect because the density is imperfect, especially toward
>> the tips of those side chains.
>>
>> Then again, it can be a good flag for bits of the structure worth a second
>> look in rebuilding.
> 
>   The latter is certainly true. It may mean that the developers of RSRZ
> analysis need to tune it a bit to make it fully useful.
> 
> L.
> 
>>
>> 
>> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Matthew
>> Bratkowski [mab...@cornell.edu]
>> Sent: Tuesday, November 22, 2016 10:12 AM
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
>>
>> Hello all,
>>
>> I was wondering if anyone knew how the RSRZ score was calculated in the
>> protein data bank validation reports and how useful of a metric this actually
>> is for structure validation?  I am trying to improve this score on a 
>> structure
>> that I am working on, but I'm not really sure where to begin.  From my
>> understanding, the score is based on the number of RSRZ outliers with a
>> score > 2.  In my case, I have several residues with scores between 2 and 4, but at
>> least by eye, fit to the electron density does not look that bad.  Hence, I
>> can't justify deleting them to try to improve the score.  If the score is 
>> just
>> based on percent of outlier residues, then for instance wouldn't a structure
>> with say 20 residues modeled with no corresponding electron density have the
>> same score as a structure with 20 residues with RSRZ values of say 2.5?
>>
>>
>> I was also wondering how the resolution of the structure relates to the 
>> score?
>>  Glancing through several pdb validation reports, I noticed some structure
>> with low resolution (3.5 A or lower) with relatively high scores, while 
>> others
>> with high resolution (2 A or higher) getting low scores.  It is reasonable to
>> assume that a structure of lower than 3.5 A would be missing several side
>> chains and may also have some ambiguous main chain electron density, which
>> should in theory increase the RSRZ score.  While of course every structure is
>> different and the quality of it due to the rigor of the person building the
>> model, I was wondering if there were any general trends related to resolution
>> and RSRZ score.
>>
>> Thanks,
>> Matt
>>
> 


Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-28 Thread esserlo
> I found that one can get RSRZ to go way down by loosening the geometry
> restraints.  The result is a crappy structure and I don't recommend doing
> that, but it does get all the atoms crammed into some sort of density.

  Your observation is quite interesting. I can add this: when we were working
with low to medium resolution structures, deleting the hydrogen atoms from
the model after refinement moved the very bad RSRZ statistic to about the
average in the given resolution range! Note, no re-refinement was done just
a simple deletion of the riding H-atoms. I find this to be odd given the
fact that, say the phenix developers favor the inclusion of H-atoms on
riding positions even in cases of low resolution structures. (I assume the
refmac5 and BUSTER-TNT developers have also a favorable opinion about
including H-atoms in the final model - and during refinement).

In my mind, it may be tempting to delete H-atoms to improve this statistic but
when you use them in refinement they should be included regardless of the
outcome of the RSRZ analysis.
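
For anyone who wants to reproduce the comparison (as a diagnostic only, not as a way to prepare a deposition), the deletion step can be sketched in a few lines of Python. This assumes a fixed-column PDB file with the element symbol in columns 77-78; records without an element column would pass through unchanged. The helper `make_atom` exists only to build example records.

```python
def strip_hydrogens(pdb_lines):
    """Drop ATOM/HETATM records whose element (columns 77-78) is H or D."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            element = line[76:78].strip().upper()
            if element in ("H", "D"):
                continue
        kept.append(line)
    return kept


def make_atom(serial, name, element):
    """Build a fixed-column PDB ATOM record for the example."""
    return (f"ATOM  {serial:5d} {name:<4s} ALA A{1:4d}    "
            f"{11.104:8.3f}{6.134:8.3f}{-6.504:8.3f}{1.0:6.2f}{20.0:6.2f}"
            f"{'':10s}{element:>2s}")


lines = [
    make_atom(1, " N  ", "N"),
    make_atom(2, " H  ", "H"),
    make_atom(3, " CA ", "C"),
    make_atom(4, " HA ", "H"),
    "END",
]
heavy = strip_hydrogens(lines)
print(len(lines), len(heavy))
```

Running validation on both versions of the same refined model would show how much of the RSRZ difference comes from the hydrogens alone.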

>
> RSRZ, in my most humble of opinions, seems like one of those statistics that
> is far more useful in theory than reality.   Particularly for
> medium-resolution structures, the fit of each entire side chain to the density
> is likely to be imperfect because the density is imperfect, especially toward
> the tips of those side chains.
>
> Then again, it can be a good flag for bits of the structure worth a second
> look in rebuilding.

  The latter is certainly true. It may mean that the developers of RSRZ
analysis need to tune it a bit to make it fully useful.

L.



Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-28 Thread Pavel Afonine
On Mon, Nov 28, 2016 at 10:17 AM, Phoebe A. Rice wrote:

> RSRZ, in my most humble of opinions, seems like one of those statistics
> that is far more useful in theory than reality.
>

Can't agree more!
Pavel


Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-28 Thread Phoebe A. Rice
I found that one can get RSRZ to go way down by loosening the geometry 
restraints.  The result is a crappy structure and I don't recommend doing that, 
but it does get all the atoms crammed into some sort of density.

RSRZ, in my most humble of opinions, seems like one of those statistics that is 
far more useful in theory than reality.   Particularly for medium-resolution 
structures, the fit of each entire side chain to the density is likely to be 
imperfect because the density is imperfect, especially toward the tips of those 
side chains.

Then again, it can be a good flag for bits of the structure worth a second look 
in rebuilding.


From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Matthew 
Bratkowski [mab...@cornell.edu]
Sent: Tuesday, November 22, 2016 10:12 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

Hello all,

I was wondering if anyone knew how the RSRZ score was calculated in the protein 
data bank validation reports and how useful of a metric this actually is for 
structure validation?  I am trying to improve this score on a structure that I 
am working on, but I'm not really sure where to begin.  From my understanding, 
the score is based on the number of RSRZ outliers with a score >2.  In my case, 
I have several residues with scores between 2 and 4, but at least by eye, fit 
to the electron density does not look that bad.  Hence, I can't justify 
deleting them to try to improve the score.  If the score is just based on 
percent of outlier residues, then for instance wouldn't a structure with say 20 
residues modeled with no corresponding electron density have the same score as 
a structure with 20 residues with RSRZ values of say 2.5?
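
To make the question above concrete, here is a minimal sketch. It assumes, as the paragraph describes, that the headline number is simply the percentage of residues with RSRZ > 2. The function name and the toy RSRZ values are made up for illustration; this is not the validation pipeline's own code.

```python
def percent_rsrz_outliers(rsrz_values, threshold=2.0):
    """Return the percentage of residues whose RSRZ exceeds the threshold.

    Assumption: a residue counts as an outlier when RSRZ > threshold,
    with threshold = 2 as stated in the question.
    """
    if not rsrz_values:
        raise ValueError("need at least one residue")
    n_outliers = sum(1 for z in rsrz_values if z > threshold)
    return 100.0 * n_outliers / len(rsrz_values)


# Two hypothetical 200-residue structures, each with 20 flagged residues.
well_fit = [0.3] * 180
structure_a = well_fit + [2.5] * 20  # marginal outliers
structure_b = well_fit + [8.0] * 20  # residues with essentially no density

print(percent_rsrz_outliers(structure_a))
print(percent_rsrz_outliers(structure_b))
```

Under that assumption both structures get the same percentage, because a pure outlier count ignores how far each residue lies above the cutoff.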


I was also wondering how the resolution of the structure relates to the score?
Glancing through several pdb validation reports, I noticed some structures with
low resolution (3.5 A or lower) with relatively high scores, while others with
high resolution (2 A or higher) get low scores.  It is reasonable to assume
that a structure of lower than 3.5 A would be missing several side chains and
may also have some ambiguous main chain electron density, which should in
theory increase the RSRZ score.  While of course every structure is different
and the quality of it is due to the rigor of the person building the model, I was
wondering if there were any general trends related to resolution and RSRZ score.

Thanks,
Matt