Re: [ccp4bb] R too low?

2013-07-05 Thread Eleanor Dodson
You have obviously solved this problem, but one thing that can change
apparent R-factors is the number of reflections accepted.
If one program gives you, say, 5% more very weak reflections, then those will
inevitably have high R-factors, and this can increase the apparent R-factor
without changing the map appearance much.
Eleanor
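
The effect described in this message can be illustrated with made-up numbers. The following is a minimal Python sketch, not output from any real data set: the amplitude ranges, the error levels, and the 5% figure for the extra weak reflections are all assumptions chosen for illustration.

```python
import random

def r_factor(f_obs, f_calc):
    """Conventional linear R = sum|Fo - Fc| / sum(Fo)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

rng = random.Random(0)

# Hypothetical well-measured reflections: Fc agrees with Fo to about 10%.
strong_obs = [rng.uniform(50.0, 500.0) for _ in range(2000)]
strong_calc = [o * (1.0 + rng.gauss(0.0, 0.10)) for o in strong_obs]

# Hypothetical extra 5% of very weak reflections that the model fits poorly.
weak_obs = [rng.uniform(5.0, 50.0) for _ in range(100)]
weak_calc = [abs(o * (1.0 + rng.gauss(0.0, 1.0))) for o in weak_obs]

r_strong = r_factor(strong_obs, strong_calc)
r_weak = r_factor(weak_obs, weak_calc)
r_all = r_factor(strong_obs + weak_obs, strong_calc + weak_calc)

print(f"R, strong reflections only: {r_strong:.4f}")
print(f"R, weak reflections only:   {r_weak:.4f}")
print(f"R, all reflections:         {r_all:.4f}")
```

Because R is a ratio of sums, the combined value always lies between the two partial values, so accepting the weak reflections raises the overall R. How much it rises depends on how many extra reflections there are and how weak they are.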




Re: [ccp4bb] R too low?

2013-06-27 Thread Roberts, Sue A - (suer)
Hello Everyone,

Thanks for all the help.  The key to finding the problem was following up on 
Tim Gruene's suggestion to compare the data sets directly.  It appears that an 
error occurred during conversion from I to F - until I find the log file for 
the conversion, I can't guess what was done.

Longer version:

When I compared the good and bad data sets, R was about 0.15, instead of 
the 0.07 I was expecting.

Yesterday, I reintegrated the images using the same program that generated the 
bad data (CrystalClear; sorry to be opaque, but I didn't want to inspire a 
lot of discussion about various integration programs when I was pretty sure the 
program wasn't at fault) and ended up with a data set that agreed with the 
good data (XDS).  (Yeah, I should've done this before sending a message to 
ccp4bb.)  The R for scaling the new CrystalClear data set against the XDS data set 
was 0.07, and refinement behaved as expected and agreed with that of XDS. 
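
The data set comparison described above (an R between two sets of amplitudes after scaling) can be sketched roughly as follows. This is only an illustration in Python with synthetic data: the function name scale_and_compare, the dictionary layout keyed by (h, k, l), and the single overall scale factor are assumptions, and real scaling programs do considerably more (resolution-dependent scaling, weighting by sigma).

```python
import random

def scale_and_compare(set_a, set_b):
    """Scale set_b onto set_a using the reflections common to both.

    Each set maps an (h, k, l) tuple to an amplitude F.
    Returns (scale, R, n_common), where scale is one least-squares
    factor and R = sum|Fa - scale*Fb| / sum(Fa).
    """
    common = sorted(set(set_a) & set(set_b))
    fa = [set_a[hkl] for hkl in common]
    fb = [set_b[hkl] for hkl in common]
    scale = sum(a * b for a, b in zip(fa, fb)) / sum(b * b for b in fb)
    r = sum(abs(a - scale * b) for a, b in zip(fa, fb)) / sum(fa)
    return scale, r, len(common)

rng = random.Random(2)
indices = [(h, k, l) for h in range(1, 11)
           for k in range(1, 11) for l in range(1, 11)]

# Hypothetical reference amplitudes.
data_a = {hkl: rng.uniform(20.0, 400.0) for hkl in indices}
# Hypothetical second set: different overall scale, about 5% random disagreement.
data_b = {hkl: 2.5 * f * (1.0 + rng.gauss(0.0, 0.05))
          for hkl, f in data_a.items()}

scale, r_merge, n_common = scale_and_compare(data_a, data_b)
print(f"{n_common} common reflections, scale {scale:.3f}, R = {r_merge:.3f}")
```

An R well above what two integrations of the same images should give is the kind of signal that pointed to the import step in this thread.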

I have been unable to find the log file for the conversion from integrated I to 
mtz F (it's on some computer somewhere, I'm sure), but I did find the original 
ScalAveraged.ref file for the bad data and reimported that using the import 
scaled data task in ccp4i.   That data set is also good.  So, I conclude that 
something was done wrong during import to ccp4.  Tim suggested that perhaps the 
data was converted twice to amplitudes; perhaps that's it.  Anyway, now I know 
where the problem arose.
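
One way to see what a double conversion would do to the amplitudes is a small simulation. The sketch below is an assumption-laden illustration in Python, not a reconstruction of what actually happened: it uses ideal acentric Wilson statistics with no measurement error, and a plain square root in place of the more careful statistical treatment that real conversion programs typically apply.

```python
import math
import random

def moment_ratio(values):
    """<x^2> / <x>^2, a scale-independent measure of the spread."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    return mean_sq / (mean * mean)

rng = random.Random(1)

# Hypothetical acentric intensities (exponential distribution, mean 1).
intensities = [rng.expovariate(1.0) for _ in range(20000)]

f_once = [math.sqrt(i) for i in intensities]  # correct I -> F
f_twice = [math.sqrt(f) for f in f_once]      # F mistaken for I and converted again

ratio_once = moment_ratio(f_once)    # theory for ideal data: 4/pi, about 1.27
ratio_twice = moment_ratio(f_twice)  # theory for ideal data: about 1.08

print(f"<F^2>/<F>^2, converted once:  {ratio_once:.3f}")
print(f"<F^2>/<F>^2, converted twice: {ratio_twice:.3f}")
```

A second square root compresses the dynamic range of the amplitudes, so strong and weak reflections look more alike than they should. A flattened distribution of this kind would be one hint to look for when comparing the suspect mtz file with a freshly imported one.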

Several people suggested checking statistics using phenix polygon and other 
analysis tools in phenix.  I agree that those are nice tools (and we had done 
that); however, they only tell you how your statistics differ from the 
median and often don't give any hints as to how any problems might have arisen.

Again, thanks for all the help.

Sue

 

Dr. Sue A. Roberts
Dept. of Chemistry and Biochemistry
University of Arizona
1041 E. Lowell St.,  Tucson, AZ 85721
Phone: 520 621 

[ccp4bb] R too low?

2013-06-26 Thread Roberts, Sue A - (suer)
Hello Everyone

I have two data sets, from the same crystal form (space group P32) of the same 
protein, collected at 100 K at SSRL, about 2.2 A resolution, that are refining to R 
= 0.14, Rf = 0.26 (refmac/TLS).  This is a molecular replacement solution, from 
a model with about 40% homology (after MR, density was apparent for some missing 
or misbuilt residues, so I don't think the structure is stuck in the wrong 
place).  The Fo-Fc map is essentially featureless.  The 2Fo-Fc map doesn't look 
as good as it should - for instance, there are very few water molecules to be 
found.  The data reduction statistics look OK, and the resolution cutoff is pretty 
conservative.  There is one molecule in the asymmetric unit, so no NCS.  There 
is no twinning either.

It seemed to me that the R is too low, not Rf too high.  More normally, R ends 
up about .18 - .20 for a data set at this resolution.

I reprocessed the images with a different data processing program and redid the 
MR. The data reduction statistics look similar, the resolution is the same, but 
now the structure refines to R = 0.20, Rf = 0.24 (same free R set of 
reflections chosen, still refmac/TLS.)  The maps look more normal. Further 
rebuilding took us to R = 0.18, Rf = 0.22.

So, the question I have (and that I've been asked by the student and PI) is:  
What was the problem with the original data set?  What should I be looking for 
in the data reduction log files, for instance, or in the refinement log?  The 
large R - free R spread is characteristic of overfitting, but the geometry is 
not too loose (rmsd bonds = 0.14), and there are plenty of reflections (both 
working and free).
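
For reference, the R / free R spread for the three refinements quoted in this message works out as follows. This is just the arithmetic on the reported values, written as a short Python snippet; the labels are descriptive names added here.

```python
# (label, R, Rfree) as reported in this message.
results = [
    ("original data", 0.14, 0.26),
    ("reprocessed data", 0.20, 0.24),
    ("reprocessed, after rebuilding", 0.18, 0.22),
]

gaps = {label: round(r_free - r_work, 2) for label, r_work, r_free in results}

for label, gap in gaps.items():
    print(f"{label}: Rfree - R = {gap:.2f}")
```

The original data give a spread of 0.12, three times the 0.04 seen with the reprocessed data.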

Can anyone point me toward a reason R would be low?

Thanks

Sue


Dr. Sue A. Roberts
Dept. of Chemistry and Biochemistry
University of Arizona
1041 E. Lowell St.,  Tucson, AZ 85721
Phone: 520 621 8171 or 520 621 4168
s...@email.arizona.edu
http://www.cbc.arizona.edu/xray or 
http://www.cbc.arizona.edu/facilities/x-ray_diffraction

 


Re: [ccp4bb] R too low?

2013-06-26 Thread Tim Gruene
Dear Sue,

If you made your rmsd (bonds) 20-30 times smaller, I would agree they
were not too loose. 0.14 A is pretty high. So two suggestions:
a) check the MolProbity report for your PDB file to see whether its
geometry is sane
b) check the CC plot of one data set against the other to see whether
the problem is due to differences between the two data sets or due to
the PDB file (xprep can do this plot conveniently).

Did you check if you converted the data twice to amplitudes, or maybe
not at all?
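
The comparison in suggestion b) amounts to a correlation coefficient between the two data sets in resolution shells. The message recommends xprep for this; the following Python sketch only illustrates the idea and is not xprep's method. The function names, the equal-count shells, and the synthetic data are all assumptions.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def cc_by_shell(rows, n_shells=5):
    """rows: (d_spacing, f_a, f_b) tuples for reflections present in both sets.

    Returns a list of (d_min_of_shell, cc), ordered from low to high
    resolution, using shells with roughly equal numbers of reflections.
    """
    rows = sorted(rows, key=lambda row: row[0], reverse=True)  # large d first
    size = len(rows) // n_shells
    shells = []
    for i in range(n_shells):
        end = (i + 1) * size if i < n_shells - 1 else len(rows)
        chunk = rows[i * size:end]
        cc = pearson([r[1] for r in chunk], [r[2] for r in chunk])
        shells.append((chunk[-1][0], cc))
    return shells

rng = random.Random(3)
rows = []
for _ in range(1000):
    d = rng.uniform(2.2, 20.0)                 # hypothetical d-spacing in A
    f_a = rng.uniform(20.0, 400.0)             # hypothetical amplitude, set A
    f_b = f_a * (1.0 + rng.gauss(0.0, 0.05))   # set B: about 5% disagreement
    rows.append((d, f_a, f_b))

shells = cc_by_shell(rows)
for d_min, cc in shells:
    print(f"shell to {d_min:5.2f} A: CC = {cc:.3f}")
```

Two consistent data sets should give a CC close to 1 in every shell; a systematic drop would point at the data rather than at the model.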

Best,
Tim

Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A



Re: [ccp4bb] R too low?

2013-06-26 Thread Robbie Joosten
Hi Sue,

Can you give the rmsZ for the bonds and angles (from the Refmac output)? I never 
could figure these rmsd values out...
I'm guessing that the restraints are too loose, or at least not optimal. 
Perhaps they went overboard with the TLS as well (sometimes fewer TLS groups 
give much better R and R-free values). I'm not sure anything in particular is 
wrong with the data processing. They should optimize the restraint weights in 
refinement first. In this case tighter B-factor restraint weights might do the 
trick. 
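
The difference between rmsd and rmsZ for bonds can be shown with a few made-up numbers. In the Python sketch below, the ideal bond lengths and the uniform sigma of 0.02 A are illustrative assumptions, not values taken from Refmac or from this structure.

```python
import math

def rmsd(observed, ideal):
    """Root-mean-square deviation from the ideal values, in A."""
    n = len(observed)
    return math.sqrt(sum((o - i) ** 2 for o, i in zip(observed, ideal)) / n)

def rms_z(observed, ideal, sigma):
    """Root-mean-square Z-score: each deviation divided by its target sigma."""
    n = len(observed)
    return math.sqrt(
        sum(((o - i) / s) ** 2 for o, i, s in zip(observed, ideal, sigma)) / n
    )

# Hypothetical restraint targets (A) and their standard deviations.
ideal = [1.53, 1.33, 1.23]
sigma = [0.02, 0.02, 0.02]

# Case 1: every bond is off by one sigma.
observed = [1.55, 1.31, 1.25]
rmsd_val = rmsd(observed, ideal)
rmsz_val = rms_z(observed, ideal, sigma)

# Case 2: every bond is off by 0.14 A, the rmsd quoted in the thread.
loose = [i + 0.14 for i in ideal]
rmsz_loose = rms_z(loose, ideal, sigma)

print(f"case 1: rmsd = {rmsd_val:.3f} A, rmsZ = {rmsz_val:.2f}")
print(f"case 2: rmsd = {rmsd(loose, ideal):.3f} A, rmsZ = {rmsz_loose:.2f}")
```

Under these assumed sigmas, an rmsd of 0.14 A would correspond to an rmsZ of about 7, which is why the Z-score form is easier to judge at a glance.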

Gratuitous plug: throw the model and data into PDB_REDO (which uses Refmac too) 
and see if it gives better refinement results. 

Cheers,
Robbie


