Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
Dear all,

I had this experience: going pedantically through the individual points that the RSRZ and other validation statistics in the form were reporting, in the vast majority of cases nothing was wrong at all. So it seems to be somewhat overdoing its job. Not that this is bad on its own, but we are losing quite a bit of contrast between the really serious issues and all the others.

Jan Dohnalek

On Wed, Nov 30, 2016 at 2:33 AM, dusan turk <dusan.t...@ijs.si> wrote:
> 2. I have noticed in validation report of my own structure (4PIA) that the
> RSRZ does not ALWAYS work right. For example, the "PHE 63" is well resolved
> with a hole in the ring, yet the validation declares it as a density
> outlier. Besides there are several other residues in this structure that
> "sit" well in the density, but are considered outliers, whereas several,
> for which side chains the density is missing, are not listed.
>
> Has anyone else had a similar experience?
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
And again, has anyone seen the code to even know what's going on "under the hood"?

Thanks,
Steve

> On Nov 29, 2016, at 8:35 PM, dusan turk <dusan.t...@ijs.si> wrote:
>
> Taken all remarks together they suggest that something needs to be done
> with RSRZ software or density validation procedure to resolve these issues.
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
Guys,

I have two issues to add here:

1. RSRZ validation does not tolerate chain IDs longer than 1 character, so it kills one of the very essential reasons why the mmCIF format was introduced (to enable deposition of large structures in a single file).

2. I have noticed in the validation report of my own structure (4PIA) that the RSRZ does not ALWAYS work right. For example, "PHE 63" is well resolved with a hole in the ring, yet the validation declares it a density outlier. Besides, there are several other residues in this structure that "sit" well in the density but are considered outliers, whereas several for which the side-chain density is missing are not listed.

Has anyone else had a similar experience?

Taken together, all these remarks suggest that something needs to be done with the RSRZ software or the density validation procedure to resolve these issues.

best,
dusan

On 30/11/16 01:00, CCP4BB automatic digest system wrote:
> Date: Mon, 28 Nov 2016 20:35:44 -0800
> From: Pavel Afonine <pafon...@gmail.com>
> Subject: Re: Calculation of RSRZ Score in PDB Validation Reports
>
> I find Lothar's comments regarding H and RSRZ excellent! I would think of
> it as a pretty much bug report. I hope developers at that end listen. This
> goes very well in line with Phoebe's comment earlier today.
>
> Pavel
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
I started a ccp4 thread a few years ago about RSRZ score calculations favoring trimmed side chains because they produce better scores. Based on what I could find at that time, it looked like the density of your structure was compared to the density of that residue type in submitted PDBs of similar resolutions. However, I could be seriously mistaken.

Cheers,
Katherine

On Mon, Nov 28, 2016 at 11:46 PM, Ethan Merritt <merr...@u.washington.edu> wrote:
> On Monday, 28 November 2016 08:35:44 PM Pavel Afonine wrote:
> > I find Lothar's comments regarding H and RSRZ excellent! I would think of
> > it as a pretty much bug report. I hope developers at that end listen. This
> > goes very well in line with Phoebe's comment earlier today.
>
> I guess I'm a bit surprised that adding or subtracting hydrogens from the
> model without re-refining or at least re-calculating Fc would affect RSRZ
> at all. I had thought that RSRZ was obtained by comparing density in an Fc
> map (or probably mFo-DFc) with the corresponding density in an Fo map.
> I thought that the coordinates were used only to determine the per-residue
> region of the map to be compared.
>
> Going back to the 2004 Kleywegt paper that the PDB cites for calculation of
> RSRZ I see that it's a bit ambiguous exactly what maps are being compared.
> So maybe I'm wrong and the current coordinates are used directly to get
> local "Fc density" by expanding 3D Gaussians without reference to a
> previously calculated map from refined phases.
>
> Can anyone clarify exactly what maps are being compared during wwPDB
> validation?
>
> Ethan
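As a rough illustration of the comparison Katherine describes, here is a minimal Python sketch that turns one residue's real-space R value (RSR) into a Z-score against a reference population of the same residue type in a similar resolution range. The function name `rsrz`, the choice of population standard deviation, and all reference values are assumptions made for illustration. This is not the wwPDB code, which nobody in the thread reports having seen.

```python
from statistics import mean, pstdev


def rsrz(rsr, reference_rsrs):
    """Z-score of one residue's RSR against a reference population.

    reference_rsrs is assumed to hold RSR values for the same residue
    type from structures in a similar resolution range.
    """
    mu = mean(reference_rsrs)
    sigma = pstdev(reference_rsrs)  # population std; an arbitrary choice here
    if sigma == 0:
        raise ValueError("reference population has zero spread")
    return (rsr - mu) / sigma


# Invented reference RSR values for one residue type and resolution bin
reference = [0.10, 0.12, 0.14, 0.16, 0.18]

z = rsrz(0.20, reference)
is_outlier = z > 2.0  # the ">2" cutoff mentioned in the thread
```

Under this reading, the same RSR can be an outlier for one residue type or resolution bin and unremarkable for another, because only the distance from the reference mean in units of the reference spread matters.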
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
Is the code publicly available for the RSRZ calculation?

Thanks,
Steve

-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ethan Merritt
Sent: Tuesday, November 29, 2016 12:46 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

Going back to the 2004 Kleywegt paper that the PDB cites for calculation of RSRZ I see that it's a bit ambiguous exactly what maps are being compared. So maybe I'm wrong and the current coordinates are used directly to get local "Fc density" by expanding 3D Gaussians without reference to a previously calculated map from refined phases.

Can anyone clarify exactly what maps are being compared during wwPDB validation?

Ethan
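For orientation, the real-space R value underlying RSRZ is commonly written as RSR = sum|rho_obs - rho_calc| / sum|rho_obs + rho_calc|, summed over the map grid points around a residue. The sketch below computes that ratio for two plain lists of density values. Which maps supply the "observed" and "calculated" densities in wwPDB validation is exactly the open question in Ethan's message, so the inputs here are assumptions, the function name is hypothetical, and the numbers are invented.

```python
def real_space_r(rho_obs, rho_calc):
    """Real-space R value over matching grid points.

    rho_obs and rho_calc are assumed to be density values sampled at
    the same grid points around one residue.
    """
    if len(rho_obs) != len(rho_calc):
        raise ValueError("density lists must have the same length")
    numerator = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    denominator = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    if denominator == 0:
        raise ValueError("densities sum to zero at every grid point")
    return numerator / denominator


# Invented density samples at four grid points
rho_obs = [1.0, 0.8, 0.5, 0.2]
rho_calc = [0.9, 0.9, 0.4, 0.4]

rsr = real_space_r(rho_obs, rho_calc)
```

A value of 0 means the two densities agree at every sampled point, and larger values mean poorer agreement. If the "calculated" density were built directly from the deposited coordinates, adding or deleting atoms would change `rho_calc` and hence the ratio, which is the scenario Ethan raises.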
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
I find Lothar's comments regarding H and RSRZ excellent! I would think of it as a pretty much bug report. I hope developers at that end listen. This goes very well in line with Phoebe's comment earlier today.

Pavel

On Mon, Nov 28, 2016 at 2:51 PM, Dale Tronrud <de...@daletronrud.com> wrote:
> On 11/28/2016 12:52 PM, esse...@helix.nih.gov wrote:
> > I can add this: when we were working with low to medium resolution
> > structures, deleting the hydrogen atoms from the model after refinement
> > moved the very bad RSRZ statistic to about the average in the given
> > resolution range! Note, no re-refinement was done just a simple deletion
> > of the riding H-atoms.
>
> Of course, if you trick a validation statistic like this you haven't
> accomplished anything. All you are saying is that one should rank RSRZ
> scores with and without hydrogen atoms separately. Perhaps you should
> suggest that to the PDB validation people.
>
> Dale Tronrud
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
On 11/28/2016 12:52 PM, esse...@helix.nih.gov wrote:
>> I found that one can get RSRZ to go way down by loosening the geometry
>> restraints. The result is a crappy structure and I don't recommend doing
>> that, but it does get all the atoms crammed into some sort of density.
>
> Your observation is quite interesting. I can add this: when we were working
> with low to medium resolution structures, deleting the hydrogen atoms from
> the model after refinement moved the very bad RSRZ statistic to about the
> average in the given resolution range! Note, no re-refinement was done just
> a simple deletion of the riding H-atoms. I find this to be odd given the
> fact that, say the phenix developers favor the inclusion of H-atoms on
> riding positions even in cases of low resolution structures. (I assume the
> refmac5 and BUSTER-TNT developers have also a favorable opinion about
> including H-atoms in the final model - and during refinement).
>
> In my mind, it may be tempting to delete H-atoms to improve this statistic but
> when you use them in refinement they should be included regardless of the
> outcome of the RSRZ analysis.

Of course, if you trick a validation statistic like this you haven't accomplished anything. All you are saying is that one should rank RSRZ scores with and without hydrogen atoms separately. Perhaps you should suggest that to the PDB validation people.

Dale Tronrud
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
> I found that one can get RSRZ to go way down by loosening the geometry
> restraints. The result is a crappy structure and I don't recommend doing
> that, but it does get all the atoms crammed into some sort of density.

Your observation is quite interesting. I can add this: when we were working with low to medium resolution structures, deleting the hydrogen atoms from the model after refinement moved the very bad RSRZ statistic to about the average in the given resolution range! Note, no re-refinement was done just a simple deletion of the riding H-atoms. I find this to be odd given the fact that, say the phenix developers favor the inclusion of H-atoms on riding positions even in cases of low resolution structures. (I assume the refmac5 and BUSTER-TNT developers have also a favorable opinion about including H-atoms in the final model - and during refinement).

In my mind, it may be tempting to delete H-atoms to improve this statistic but when you use them in refinement they should be included regardless of the outcome of the RSRZ analysis.

> RSRZ, in my most humble of opinions, seems like one of those statistics that
> is far more useful in theory than reality. Particularly for
> medium-resolution structures, the fit of each entire side chain to the density
> is likely to be imperfect because the density is imperfect, especially toward
> the tips of those side chains.
>
> Then again, it can be a good flag for bits of the structure worth a second
> look in rebuilding.

The latter is certainly true. It may mean that the developers of RSRZ analysis need to tune it a bit to make it fully useful.

L.

> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Matthew
> Bratkowski [mab...@cornell.edu]
> Sent: Tuesday, November 22, 2016 10:12 AM
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
>
> Hello all,
>
> I was wondering if anyone knew how the RSRZ score was calculated in the
> protein data bank validation reports and how useful of a metric this actually
> is for structure validation? I am trying to improve this score on a structure
> that I am working on, but I'm not really sure where to begin. From my
> understanding, the score is based on the number of RSRZ outliers with a
> score >2. In my case, I have several residues with scores between 2 and 4,
> but at least by eye, fit to the electron density does not look that bad.
> Hence, I can't justify deleting them to try to improve the score. If the
> score is just based on percent of outlier residues, then for instance
> wouldn't a structure with say 20 residues modeled with no corresponding
> electron density have the same score as a structure with 20 residues with
> RSRZ values of say 2.5?
>
> I was also wondering how the resolution of the structure relates to the score?
> Glancing through several pdb validation reports, I noticed some structure
> with low resolution (3.5 A or lower) with relatively high scores, while others
> with high resolution (2 A or higher) getting low scores. It is reasonable to
> assume that a structure of lower than 3.5 A would be missing several side
> chains and may also have some ambiguous main chain electron density, which
> should in theory increase the RSRZ score. While of course every structure is
> different and the quality of it due to the rigor of the person building the
> model, I was wondering if there were any general trends related to resolution
> and RSRZ score.
>
> Thanks,
> Matt
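Matt's question about the summary score can be made concrete with a small Python sketch. It assumes, as he does, that the headline number is simply the percentage of residues with RSRZ > 2. Under that assumption, two models with the same count of outliers score identically no matter how far above the cutoff those outliers sit. The function name and all values below are invented for illustration.

```python
def percent_rsrz_outliers(rsrz_values, threshold=2.0):
    """Percentage of residues whose RSRZ exceeds the threshold."""
    if not rsrz_values:
        raise ValueError("no residues given")
    n_outliers = sum(1 for z in rsrz_values if z > threshold)
    return 100.0 * n_outliers / len(rsrz_values)


# 180 well-fitting residues shared by both hypothetical models
well_fit = [0.3] * 180

# Model A: 20 residues just over the cutoff
model_a = well_fit + [2.5] * 20
# Model B: 20 residues far over the cutoff (e.g. built into no density)
model_b = well_fit + [8.0] * 20

score_a = percent_rsrz_outliers(model_a)
score_b = percent_rsrz_outliers(model_b)
```

Both models come out with the same percentage, so a count-based summary alone cannot separate marginal outliers from residues with no supporting density. The per-residue values would have to be inspected for that.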
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
On Mon, Nov 28, 2016 at 10:17 AM, Phoebe A. Rice wrote:

> RSRZ, in my most humble of opinions, seems like one of those statistics
> that is far more useful in theory than reality.

Can't agree more!

Pavel
Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports
I found that one can get RSRZ to go way down by loosening the geometry
restraints. The result is a crappy structure and I don't recommend doing
that, but it does get all the atoms crammed into some sort of density.

RSRZ, in my most humble of opinions, seems like one of those statistics that
is far more useful in theory than reality. Particularly for medium-resolution
structures, the fit of each entire side chain to the density is likely to be
imperfect because the density is imperfect, especially toward the tips of
those side chains.

Then again, it can be a good flag for bits of the structure worth a second
look in rebuilding.

From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Matthew
Bratkowski [mab...@cornell.edu]
Sent: Tuesday, November 22, 2016 10:12 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

Hello all,

I was wondering if anyone knew how the RSRZ score is calculated in the
Protein Data Bank validation reports and how useful a metric it actually is
for structure validation? I am trying to improve this score on a structure
that I am working on, but I'm not really sure where to begin. From my
understanding, the score is based on the number of RSRZ outliers with a score
>2. In my case, I have several residues with scores between 2 and 4, but at
least by eye, the fit to the electron density does not look that bad. Hence,
I can't justify deleting them to try to improve the score. If the score is
just based on the percentage of outlier residues, then, for instance, wouldn't
a structure with, say, 20 residues modeled with no corresponding electron
density have the same score as a structure with 20 residues with RSRZ values
of, say, 2.5?

I was also wondering how the resolution of the structure relates to the score?
Glancing through several PDB validation reports, I noticed some structures
with low resolution (3.5 A or lower) with relatively high scores, while others
with high resolution (2 A or higher) were getting low scores. It is reasonable
to assume that a structure of lower than 3.5 A would be missing several side
chains and may also have some ambiguous main chain electron density, which
should in theory increase the RSRZ score. While of course every structure is
different and its quality depends on the rigor of the person building the
model, I was wondering if there were any general trends related to resolution
and RSRZ score.

Thanks,
Matt
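
The arithmetic behind Matt's 20-residue question can be sketched in a few lines of Python. This is only an illustration, not the code the PDB validation pipeline runs. The ">2" outlier cutoff comes from Matt's message; the Z-score normalisation (per-residue RSR compared with a reference mean and standard deviation for the same residue type and resolution bin) is my understanding of how RSRZ is defined and should be treated as an assumption. The function names and the 200-residue example are made up.

```python
def rsrz(rsr, ref_mean, ref_std):
    """Z-score of a residue's RSR against reference statistics.

    Assumption: ref_mean and ref_std describe RSR for the same residue
    type in the same resolution bin. The caller supplies them; no real
    reference values are given here.
    """
    return (rsr - ref_mean) / ref_std


def percent_rsrz_outliers(rsrz_values, cutoff=2.0):
    """Percentage of residues whose RSRZ exceeds the cutoff (>2 per the thread)."""
    if not rsrz_values:
        raise ValueError("need at least one RSRZ value")
    n_outliers = sum(1 for z in rsrz_values if z > cutoff)
    return 100.0 * n_outliers / len(rsrz_values)


# Hypothetical 200-residue structure: 180 well-fitting residues plus 20 outliers.
well_fit = [0.0] * 180
mild = well_fit + [2.5] * 20    # outliers that look fine by eye
severe = well_fit + [8.0] * 20  # residues with essentially no density

print(percent_rsrz_outliers(mild))    # 10.0
print(percent_rsrz_outliers(severe))  # 10.0
```

If the reported score really is just the outlier percentage, both hypothetical structures come out the same, which is exactly the loss of contrast Matt is asking about; the per-residue RSRZ values themselves still tell the two cases apart.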