Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-24 Thread Keller, Jacob
(Off-list)

Dear Kay and Harry,

You mentioned difficult datasets being of interest for the ACA conference. I am 
not coming, but do have some interesting datasets, viz., many datasets of a 
particular crystal form of a calmodulin/peptide complex which have defied my 
[not exhaustive] attempts to solve them. I have a closely-related complex 
structure which was fairly easy to solve, so I am not so invested in working 
super hard at the difficult one, but all the same, would really like to see 
what it looks like. The crystals contain calcium and sulfur atoms (some datasets 
were collected at 1.6 Ang, some at 1.0 Ang), and there is also a plethora of 
possible MR models in the PDB, plus my closely-related complex, in which 
calmodulin looks dissimilar from the other CaM structures in the PDB, based on 
DALI. I think these particular datasets may be tetartohedrally twinned, but from 
the literature it seems this can be overcome by tweaking the right MR model, 
maybe using some anomalous signal, etc.

Interested? I would give you all the images, I guess, which would be ~10 sets 
with varying numbers of frames. The degree of radiation damage appears to vary.

Jacob



-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Kay 
Diederichs
Sent: Sunday, February 23, 2014 2:55 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] High Rwork/Rfree vs. Resolution

Projects and problems like this are clearly a justification for asking to 
deposit not only the results from data processing, but also the raw data 
frames. These would allow developers to improve the models underlying their 
algorithms, and to find those corner cases where the algorithms break. That 
would help everyone.

Maybe you could make this dataset (and sequence) available for the forthcoming 
IUCr conference, as an example for a difficult dataset? (send email to Ed 
Collins or me) You would profit from the fact that experienced 
crystallographers do their best to make the most of your data.

best,

Kay

On Fri, 21 Feb 2014 20:13:33 -0600, Chris Fage wrote:

>Thanks for the assistance, everyone.
>
>For those who suggested XDS: I forgot to mention that I have tried 
>Mosflm, which is also better at spot fitting than HKL2000. How does 
>XDS compare to Mosflm in this regard?
>
>I am not refining the high R-factor structure with NCS options. Also, 
>my unit cell dimensions are 41.74 A, 69.27 A, and 83.56 A, so there 
>isn't one particularly long axis.
>
>I'm guessing the low completeness of the 1.65 angstrom dataset has to 
>do with obstacles the processing software encountered on a sizable 
>wedge of frames (there were swaths of red in HKL2000). I'm not sure 
>why this dataset in particular was less complete than the others.
>
>Thanks,
>Chris
>
>
>On Fri, Feb 21, 2014 at 6:41 PM, Chris Fage wrote:
>
>> Dear CCP4BB Users,
>>
>> I recently collected a number of datasets from plate-shaped crystals 
>> that diffracted to 1.9-2.0 angstroms and yielded very nice electron 
>> density maps. There is no major density unaccounted for by the model; 
>> however, I am unable to decrease Rwork and Rfree beyond ~0.25 and 
>> ~0.30, respectively. Probably due to the more 2-dimensional nature of 
>> my crystals, there is a range of phi angles in which the reflections 
>> are smeared, and I am wondering if the problem lies therein.
>>
>> I would be grateful if anyone could provide advice for improving my 
>> refinement statistics, as I was under the impression that the 
>> R-factors should be ~5% lower for the given resolution.
>>
>> A few more pieces of information:
>> -Space group = P21, with 2 monomers per asymmetric unit;
>> -Chi square = 1.0-1.5;
>> -Rmerge = 0.10-0.15;
>> -Data were processed in HKL2000 and refined in Refmac5 and/or
>> phenix.refine;
>> -PHENIX Xtriage does not detect twinning, but hints at possible weak
>> translational pseudosymmetry;
>> -I was previously able to grow one atypically thick crystal which
>> diffracted to 1.65 angstroms with Rwork/Rfree at 0.18/0.22.
>> Unfortunately, the completeness of the dataset was only ~90%.
>>
>> Regards,
>> Chris
>>
>


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-24 Thread Boaz Shaanan



Jacob is right: there definitely seem to be problems with the data. Perhaps you and your supervisor should consider privately contacting the developers of data processing programs who have participated in this thread, like Kay and Harry (and perhaps others too), to try to get the best out of your data. There is a limit to what refinement programs can do when there is a real problem in the data that has not been taken care of properly.


My 2p thoughts.


Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710

From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Chris Fage [cdf...@gmail.com]
Sent: Monday, February 24, 2014 12:52 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] High Rwork/Rfree vs. Resolution

Thanks again for the advice, everyone.

As suggested, I tried NCS and TLS in phenix.refine, although my R-factors did not budge.

I am now giving PDB_REDO and simulated annealing in PHENIX a shot. I am also looking into setting up XDS.

Forgive my ignorance, but I am not sure how to check whether the bulk solvent model is reasonable.

For these crystals, HKL2000 does invariably report high mosaicity along one axis (it is in the "red").

Yes, the structure was solved by MR. For the 1.65-angstrom map, the model is very complete, with density missing only for the N-terminal 6xHis tag and first three residues, as well as 5-10 other residues on flexible loops (the protein is ~300 residues, including the tag). Most side chains are well resolved. The quality of the 1.90-angstrom map is lower, with more gaps, more noise, and less side-chain coverage. In each map, there is no remaining density that legitimately needs to be filled.

I have attached representative frames and relevant details from the HKL2000 scale logs. (Note that the 1.65-A set was originally scaled to 1.53 A.)

As for making the datasets available before publication, I would have to check with my supervisor. The idea might not fly with him, as the structure is expected to be of relatively high impact.

Best,
Chris






On Sat, Feb 22, 2014 at 3:00 AM, Francis Reyes <francis.re...@colorado.edu> wrote:

> I'm guessing the low completeness of the 1.65 angstrom dataset has to do with obstacles the processing software encountered on a sizable wedge of frames (there were swaths of red in HKL2000). I'm not sure why this dataset in particular was less complete than the others.

This is bad. Large swaths of red circles during integration are bad. I believe (check the Denzo manual) this means overlaps, and overlaps get thrown out. Thus you are getting lower completeness. Was your oscillation range too large? Crystal very mosaic?

However, this could be because of a poor crystal orientation matrix from HKL2000, which in some cases can be alleviated by Mosflm and XDS. (HKL2000 is much more manual; there are a lot of buttons, which means you can shoot yourself in the foot if you are not careful.)

I would be particularly interested in a resolution bin breakdown of the integration and merging statistics (I/sig and Rmerge). You might as well post the refinement statistics (R and Rfree) by resolution bin as well.

You have a smallish unit cell that diffracts to high resolution, and getting reasonable completeness in the low-resolution bins is paramount. Post the completeness of the 20-10 A bin.

Is this molecular replacement? How complete is the model? Aside from the completeness of the model, how far is it from the target?

You mentioned that some regions of your crystal had smeary spots. This is also bad, particularly if the errors are not random (i.e., anisotropic along one axis). This will confuse ML refinement. Let's see a single frame of your data.
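
The per-bin breakdown of I/sigma and Rmerge requested above can be sketched roughly as follows. This is only an illustration of the standard definition Rmerge = sum|I_i - <I>| / sum(I_i), not what HKL2000, Mosflm or XDS actually run; the function name `merging_stats_by_bin`, the input layout and the example observations are all assumptions made up for this sketch, and the reflections are assumed to be already reduced to the asymmetric unit.

```python
from collections import defaultdict


def merging_stats_by_bin(observations, bin_edges):
    """Return Rmerge and mean I/sigma per resolution bin.

    observations: iterable of (hkl, d_spacing, intensity, sigma), with hkl
        already reduced to the asymmetric unit (an assumption of this sketch).
    bin_edges: d-spacings in descending order, e.g. [20.0, 10.0, 5.0].
    """
    n_bins = len(bin_edges) - 1
    grouped = defaultdict(list)  # (bin index, hkl) -> list of (I, sigma)
    for hkl, d, intensity, sigma in observations:
        for k in range(n_bins):
            if bin_edges[k] >= d > bin_edges[k + 1]:
                grouped[(k, hkl)].append((intensity, sigma))
                break

    numerator = [0.0] * n_bins    # sum of |I_i - <I>| over repeated measurements
    denominator = [0.0] * n_bins  # sum of I_i over repeated measurements
    ratios = [[] for _ in range(n_bins)]  # I/sigma of individual observations
    for (k, _hkl), obs in grouped.items():
        mean_i = sum(i for i, _ in obs) / len(obs)
        for intensity, sigma in obs:
            ratios[k].append(intensity / sigma)
            if len(obs) > 1:  # singly measured reflections say nothing about Rmerge
                numerator[k] += abs(intensity - mean_i)
                denominator[k] += intensity

    stats = {}
    for k in range(n_bins):
        if not ratios[k]:
            continue  # no observations fell into this bin
        stats[(bin_edges[k], bin_edges[k + 1])] = {
            "rmerge": numerator[k] / denominator[k] if denominator[k] else None,
            "mean_i_over_sigma": sum(ratios[k]) / len(ratios[k]),
        }
    return stats


# Made-up observations, for illustration only.
example = [
    ((1, 0, 0), 15.0, 100.0, 5.0),
    ((1, 0, 0), 15.0, 110.0, 5.0),
    ((2, 1, 0), 7.0, 50.0, 5.0),
    ((2, 1, 0), 7.0, 40.0, 5.0),
]
for shell, values in merging_stats_by_bin(example, [20.0, 10.0, 5.0]).items():
    print(shell, values)
```

Note that the I/sigma reported here is averaged over individual observations; processing programs usually also report it for merged intensities, which this sketch does not attempt.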

Cheers,
F


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-23 Thread Pavel Afonine
Chris,

On Sun, Feb 23, 2014 at 2:52 PM, Chris Fage wrote:

> As suggested, I tried NCS and TLS in phenix.refine, although my R-factors
> did not budge.
> (...)
> Forgive my ignorance, but I am not sure how to check whether the bulk
> solvent model is reasonable.
>


I figure you used phenix.refine for refinement in which case bulk-solvent
modeling and overall anisotropic scaling should be optimal within the
framework of the implemented model used to describe it (for more info see
http://journals.iucr.org/d/issues/2013/04/00/dz5273/dz5273.pdf).

You can check it by looking in the refinement log file (this is the table I
always look at first, before doing anything else). There you will see
something like this:

  Resolution    Compl Nwork Nfree R_work   <Fobs> <Fmodel> kiso  kani kmask
43.996-14.23    96.15    88    12 0.1175   94.939   93.450 1.00 0.148 0.288
14.224-11.37    98.06    94     7 0.1211   84.547   83.471 1.00 0.151 0.295
11.321-9.092    98.90   156    23 0.0967   85.623   85.288 1.00 0.148 0.300
 9.082-7.268    97.97   305    33 0.1400   61.020   59.892 1.00 0.143 0.300
 7.264-5.810    98.96   601    63 0.1495   54.915   53.404 1.00 0.140 0.275
 5.806-4.642    98.67  1134   127 0.1271   71.906   71.017 1.00 0.149 0.274
 4.642-3.711    97.88  2198   203 0.1310   68.399   67.585 1.00 0.156 0.273
 3.710-2.966    97.16  4129   462 0.1611   46.881   46.262 1.00 0.157 0.034
 2.966-2.371    94.27  7737   874 0.1878   25.826   25.084 1.00 0.154 0.000
 2.371-2.200    90.76  3826   418 0.1758   19.757   19.072 1.00 0.158 0.000

Pay attention to:

- completeness (second column). Ideally it should be greater than 90-95% in
all resolution bins. Lower completeness (especially at low resolution)
may result in corrupted maps.
- kmask (last column) value in the lowest resolution bin: it should be
around 0.2-0.5 (Acta Cryst. (2002). D58, 1387-1392). This characterizes
the sanity of the bulk-solvent model.
- if a resolution bin has an outstandingly high R-factor (5th column), this
may indicate a problem.
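
As a rough illustration of the three checks listed above, the sketch below parses rows in that ten-column layout and flags suspicious bins. It is not part of phenix.refine: the names `parse_bins` and `check_bins`, the 90% completeness cutoff, and the `r_excess` threshold used to decide what counts as an "outstandingly high" R-factor are assumptions chosen for this example. The sample rows are the five highest-resolution bins of the table above.

```python
from dataclasses import dataclass

# Five rows copied from the table above (columns: resolution, completeness,
# Nwork, Nfree, R_work, two mean-amplitude columns, kiso, kani, kmask).
SAMPLE_ROWS = """\
 5.806-4.642 98.67  1134   127 0.1271 71.906   71.017 1.00 0.149 0.274
 4.642-3.711 97.88  2198   203 0.1310 68.399   67.585 1.00 0.156 0.273
 3.710-2.966 97.16  4129   462 0.1611 46.881   46.262 1.00 0.157 0.034
 2.966-2.371 94.27  7737   874 0.1878 25.826   25.084 1.00 0.154 0.000
 2.371-2.200 90.76  3826   418 0.1758 19.757   19.072 1.00 0.158 0.000
"""


@dataclass
class ResolutionBin:
    d_max: float         # low-resolution edge of the bin (Angstrom)
    d_min: float         # high-resolution edge of the bin (Angstrom)
    completeness: float  # percent
    r_work: float
    k_mask: float


def parse_bins(text):
    """Parse whitespace-separated rows; rows without ten fields are skipped."""
    bins = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue
        d_max, d_min = (float(x) for x in fields[0].split("-"))
        bins.append(ResolutionBin(d_max, d_min, float(fields[1]),
                                  float(fields[4]), float(fields[9])))
    return bins


def check_bins(bins, min_completeness=90.0, k_mask_range=(0.2, 0.5),
               r_excess=0.10):
    """Return a list of warning strings (empty if nothing looks off)."""
    warnings = []
    for b in bins:
        if b.completeness < min_completeness:
            warnings.append(f"{b.d_max}-{b.d_min} A: completeness {b.completeness}%")
    lowest = max(bins, key=lambda b: b.d_max)  # lowest-resolution bin
    if not k_mask_range[0] <= lowest.k_mask <= k_mask_range[1]:
        warnings.append(f"kmask in lowest-resolution bin is {lowest.k_mask}")
    mean_r = sum(b.r_work for b in bins) / len(bins)
    for b in bins:
        if b.r_work - mean_r > r_excess:  # assumed meaning of "outstandingly high"
            warnings.append(f"{b.d_max}-{b.d_min} A: R_work {b.r_work}")
    return warnings


for message in check_bins(parse_bins(SAMPLE_ROWS)):
    print("WARNING:", message)
```

Because only part of the table is used here, the "lowest-resolution bin" in this example is the 5.806-4.642 A shell; on a full log the kmask check would apply to the first row instead.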

Pavel


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-23 Thread Keller, Jacob
It seems to me some of the images may have multiple lattices and/or 
pseudomerohedral twinning. Are all the spots predicted during integration? What 
do the various twinning tests indicate?

JPK







Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-23 Thread Ethan Merritt
On Sunday, 23 February 2014 09:16:41 PM Andreas Förster wrote:
> On 22/02/2014 10:15, Mark van Raaij wrote:

> > But I would really want to make a general comment - not ALL structures
> > can be better than the average!
 
> Except structures from the Lake Wobegon Center for Structural Biology, 
> of course.

Ah, but it _is_ possible for each new structure deposition to be
better than the average quality of all previously deposited structures.

And in fact the continual improvement of detectors, programs, and
refinement protocols pushes things in exactly this direction.

I have noted only half in jest that this phenomenon is important
to the wide acceptance of validation tools like MolProbity.
By reporting quality relative to all previous structures in the PDB,
the program authors have cleverly arranged for the program to
report to most users "Green light! Your new model is better than
most structures in the PDB!".  Everyone likes to be patted on the
back and told they have done a good job, so they like the program
and continue to use it. This makes a "red light" score, when it
does happen, stand out more and therefore makes it more likely that
users will take it seriously.


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-23 Thread Andreas Förster

On 22/02/2014 10:15, Mark van Raaij wrote:

> As the excellent tips that you got indicate, lower R-factors can be
> obtained by getting better data (better crystals, better data
> collection, better data processing) or better fitting, i.e. refinement.
> In this respect, I am impressed by the automatic data processing
> protocols now being implemented. Also, the automatic local NCS
> refinement in REFMAC seems very good for our recent structures.
> But I would really want to make a general comment - not ALL structures
> can be better than the average!

Except structures from the Lake Wobegon Center for Structural Biology, 
of course.





--
  Andreas Förster
 Crystallization and X-ray Facility Manager
   Centre for Structural Biology
  Imperial College London


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-23 Thread Kay Diederichs
oops, thanks for correcting me!

Kay

On Sun, 23 Feb 2014 11:28:31 +, Harry Powell  
wrote:

>Hi
>
>Kay means, of course, the ACA meeting in Albuquerque, not the IUCr in Montreal!
>


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-23 Thread Harry Powell
Hi

Kay means, of course, the ACA meeting in Albuquerque, not the IUCr in Montreal!

Authors of the major processing packages will be competing for your attention...

Harry
--
** note change of address **
Dr Harry Powell, MRC Laboratory of Molecular Biology, Francis Crick Avenue, 
Cambridge Biomedical Campus, Cambridge, CB2 0QH
Chairman of European Crystallographic Association SIG9 (Crystallographic 
Computing)



Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-22 Thread Kay Diederichs
Projects and problems like this are clearly a justification for asking to 
deposit not only the results from data processing, but also the raw data 
frames. These would allow developers to improve the models underlying their 
algorithms, and to find those corner cases where the algorithms break. That 
would help everyone.

Maybe you could make this dataset (and sequence) available for the forthcoming 
IUCr conference, as an example for a difficult dataset? (send email to Ed 
Collins or me) You would profit from the fact that experienced 
crystallographers do their best to make the most of your data.

best,

Kay



Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-22 Thread Kay Diederichs
I agree with the general points that Mark makes. For the basic problem of 
finding out whether the model quality (and R-values) is limited by the data 
quality, you can compare CCwork with CC*: if CCwork is significantly lower than 
CC*, then either your model could in principle be improved (you have not yet 
found the right parameterization), or the data suffer from correlated 
systematic errors. So it would be good to try processing the data with XDS.
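
For readers who want to try the CCwork versus CC* comparison by hand, here is a minimal sketch. It uses the published relation CC* = sqrt(2 CC1/2 / (1 + CC1/2)) (Karplus & Diederichs, 2012) and a plain Pearson correlation between observed and calculated working-set intensities. The function names and the `margin` used to decide what counts as "significantly lower" are assumptions of this example, not values given in the thread or used by any program.

```python
import math


def cc_star(cc_half):
    """Estimate CC* from CC1/2: CC* = sqrt(2*CC1/2 / (1 + CC1/2))."""
    if not 0.0 <= cc_half <= 1.0:
        raise ValueError("this sketch expects 0 <= CC1/2 <= 1")
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


def pearson_cc(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_cc_work_to_cc_star(cc_half, i_obs_work, i_calc_work, margin=0.05):
    """Return (CC*, CCwork, gap_is_large) for one resolution shell.

    margin is an arbitrary illustrative threshold, not an established cutoff.
    """
    star = cc_star(cc_half)
    work = pearson_cc(i_obs_work, i_calc_work)
    return star, work, (star - work) > margin


print(round(cc_star(0.5), 3))
```

In practice the comparison is made shell by shell, since both CC1/2 and CCwork fall off towards high resolution.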

best,

Kay




Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-22 Thread Mark van Raaij
As the excellent tips that you got indicate, lower R-factors can be obtained by 
getting better data (better crystals, better data collection, better data 
processing) or better fitting, i.e. refinement. In this respect, I am impressed 
by the automatic data processing protocols now being implemented. Also, the 
automatic local NCS refinement in REFMAC seems very good for our recent 
structures.
But I would really want to make a general comment - not ALL structures can be 
better than the average! There will always be structures with 5% higher R/Rfree 
than the average in the same resolution range. Sometimes this will be due to 
suboptimal refinement, but sometimes it may simply not be possible to get 
better crystals and better data. Better not necessarily in terms of resolution, 
but in terms of disorder like that you describe for your plate-shaped crystals.
What I mean is that one should make all efforts to get better crystals and data 
and refine structures as well as possible, but sometimes it may not be possible 
to beat the average of the PDB and one should not get too hung up on that. 
These structures should also be deposited and published.
On the other hand, these "rules" that R-factor should be a certain value at a 
certain resolution, may lead to suboptimal refinement. For example the thought 
"my R-factor is already better than the average" could be counterproductive and 
lead people to stop refinement prematurely.
Sometimes a structure will have Rs better than the average for the resolution, 
but still better refinement could lower it further and this should then be 
done. I can think of an MR solution using a very homologous model that was 
refined at higher resolution, structures with high NCS, or simply certain 
rock-solid proteins...
Another popular one is (was?) that Rfree should always be below 30%, while 
several important structures justifiably have Rfrees quite a bit higher (others 
perhaps have not been refined enough).
So while comparing R/Rfree to the average of existing structures is useful, it 
may not necessarily be a sign that a structure is "bad" if your Rs are 5% 
higher, nor should your Rs being at or below the average be an excuse for 
stopping refinement too early.
Fear that one's Rs are not low enough may even lead to certain forms of 
cheating, for example not keeping the Rfree reflections truly free.
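
To make the Rwork/Rfree distinction above concrete, here is a minimal sketch of the standard R-factor, R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|), computed separately over the working and free reflections. The function names and the toy amplitudes are illustrative assumptions, and the sketch assumes Fcalc has already been scaled to Fobs; it is not how any particular refinement program implements it.

```python
def r_factor(f_obs, f_calc):
    """Return sum(||Fobs| - |Fcalc||) / sum(|Fobs|) for paired amplitudes."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("no observed amplitude to normalise against")
    return numerator / denominator


def r_work_r_free(f_obs, f_calc, free_flags):
    """Split reflections by free flag and return (Rwork, Rfree).

    free_flags[i] is True for reflections excluded from refinement.
    Assumes Fcalc is already on the same scale as Fobs.
    """
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in test], [fc for _, fc in test])
    return r_work, r_free


# Toy data (hypothetical amplitudes, last two reflections flagged free).
f_obs = [100, 80, 60, 40, 20, 50]
f_calc = [90, 85, 50, 45, 25, 40]
flags = [False, False, False, False, True, True]
r_work, r_free = r_work_r_free(f_obs, f_calc, flags)
print(f"Rwork = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Rfree is only an unbiased check as long as the flagged reflections never enter refinement, which is why the flags are applied before anything else in the sketch.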

On 22 Feb 2014, at 01:41, Chris Fage wrote:

> Dear CCP4BB Users,
> 
> I recently collected a number of datasets from plate-shaped crystals
> that diffracted to 1.9-2.0 angstroms and yielded very nice electron
> density maps. There is no major density unaccounted for by the model;
> however, I am unable to decrease Rwork and Rfree beyond ~0.25 and
> ~0.30, respectively. Probably due to the more 2-dimensional nature of
> my crystals, there is a range of phi angles in which the reflections
> are smeared, and I am wondering if the problem lies therein.
> 
> I would be grateful if anyone could provide advice for improving my
> refinement statistics, as I was under the impression that the
> R-factors should be ~5% lower for the given resolution.
> 
> A few more pieces of information:
> -Space group = P21, with 2 monomers per asymmetric unit;
> -Chi square = 1.0-1.5;
> -Rmerge = 0.10-0.15;
> -Data were processed in HKL2000 and refined in Refmac5 and/or phenix.refine;
> -PHENIX Xtriage does not detect twinning, but hints at possible weak
> translational pseudosymmetry;
> -I was previously able to grow one atypically thick crystal which
> diffracted to 1.65 angstroms with Rwork/Rfree at 0.18/0.22.
> Unfortunately, the completeness of the dataset was only ~90%.
> 
> Regards,
> Chris

Mark J van Raaij
Lab 20B
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij



Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-22 Thread Francis Reyes
> 
> I'm guessing the low completeness of the 1.65 angstrom dataset has to do with 
> obstacles the processing software encountered on a sizable wedge of frames 
> (there were swaths of red in HKL2000). I'm not sure why this dataset in 
> particular was less complete than the others.


This is bad. Large swaths of red circles during integration are bad. I believe 
(check the Denzo manual) this means overlaps, and overlaps get thrown out. Thus 
you are getting lower completeness. Was your oscillation range too large? 
Crystal very mosaic?

However, this could be because of a poor crystal orientation matrix from HKL2000, 
which in some cases can be alleviated by Mosflm and XDS. (HKL2000 is much more 
manual, there's a lot of buttons, which means you can shoot yourself in the 
foot if you are not careful).

I would be particularly interested in a resolution bin breakdown in the 
integration and merging statistics (I/sigma and Rmerge). You might as well post 
the refinement statistics (R and Rfree) by resolution bin as well.

You have a smallish unit cell that shoots to high resolution, and getting 
reasonable completeness in the low resolution bins is paramount. Post the 
completeness of the 20-10 A bin.
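
For reference, a per-bin completeness table of the kind requested here can be sketched as follows. This is only an illustration: the input format (one entry per expected unique reflection, with its d-spacing and a measured/not-measured flag), the function name and the toy values are all assumptions, and real numbers should come from the scaling or merging log.

```python
def completeness_by_bin(reflections, bin_edges):
    """Tabulate completeness per resolution bin.

    reflections: iterable of (d_spacing_in_A, was_measured), one entry for
                 every unique reflection expected (hypothetical input format).
    bin_edges:   d-spacing limits in descending order, e.g. [20, 10, 5, 2].
    Returns a list of (d_max, d_min, observed, expected, fraction).
    """
    reflections = list(reflections)
    rows = []
    for d_max, d_min in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = [measured for d, measured in reflections if d_min < d <= d_max]
        expected = len(in_bin)
        observed = sum(1 for measured in in_bin if measured)
        fraction = observed / expected if expected else float("nan")
        rows.append((d_max, d_min, observed, expected, fraction))
    return rows


# Toy example: seven expected reflections, two of them missing.
toy = [(15.0, True), (12.0, False), (11.0, True),
       (8.0, True), (6.0, True), (3.0, True), (2.5, False)]
for d_max, d_min, obs, exp, frac in completeness_by_bin(toy, [20, 10, 5, 2]):
    print(f"{d_max:5.1f} - {d_min:4.1f} A: {obs}/{exp} = {frac:.2f}")
```

The low resolution shell contains few reflections, so a handful of missing ones (overloads, beamstop shadow) shows up as a large drop in the first row.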

Is this molecular replacement? How complete is the model? Aside from the 
completeness of the model, how far is it from the target?

You mentioned that some regions of your crystal had smeary spots. This is also 
bad, particularly if the errors are not random (i.e. anisotropic along one 
axis). This will confuse ML refinement. Let's see a single frame of your data.

Cheers,
F


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-21 Thread Anastassis Perrakis
I can't help but suggest to also try PDB_REDO for tuning refinement. 

http://xtal.nki.nl/PDB_REDO/index.jsp

One of the things you get is exactly what Pavel explains below: how your 
structure compares with others at similar resolution, and also with the 
PDB_REDO data bank structures. 

I will also agree that the 90% complete data set might be best - just refine 
both models, compare, and choose the best one according to validation criteria 
(you get a few for free in PDB_REDO and of course in e.g. MolProbity).

Tassos


> On 22 Feb 2014, at 02:20, Pavel Afonine  wrote:
> 
> Chris,
> 
> what you get is not unheard of, but clearly you are not in the majority: at 
> around 1.95 A resolution, the distribution of R-factors in the PDB is:
> 
> Histogram of Rwork for models in PDB at resolution 1.85-2.05 A:
>  0.093 - 0.118  : 3
>  0.118 - 0.143  : 75
>  0.143 - 0.168  : 821
>  0.168 - 0.193  : 2617
>  0.193 - 0.218  : 2950
>  0.218 - 0.242  : 1147
>  0.242 - 0.267  : 201  <<< your case
>  0.267 - 0.292  : 21
>  0.292 - 0.317  : 2
>  0.317 - 0.342  : 1
> Histogram of Rfree for models in PDB at resolution 1.85-2.05 A:
>  0.138 - 0.160  : 12
>  0.160 - 0.183  : 106
>  0.183 - 0.205  : 742
>  0.205 - 0.227  : 1971
>  0.227 - 0.249  : 2566
>  0.249 - 0.272  : 1676
>  0.272 - 0.294  : 616
>  0.294 - 0.316  : 119   <<< your case
>  0.316 - 0.339  : 24
>  0.339 - 0.361  : 6
> Histogram of Rfree-Rwork for all models in PDB at resolution 1.85-2.05 A:
>  0.001 - 0.011  : 67
>  0.011 - 0.021  : 428
>  0.021 - 0.031  : 1324
>  0.031 - 0.041  : 2220
>  0.041 - 0.050  : 1975
>  0.050 - 0.060  : 1059  <<< your case
>  0.060 - 0.070  : 459
>  0.070 - 0.080  : 201
>  0.080 - 0.090  : 75
>  0.090 - 0.100  : 30
> 
> Pavel
> 
> P.S.: Command to get the statistics as above is:
> phenix.r_factor_statistics 1.95
> 
> 
>> On Fri, Feb 21, 2014 at 4:41 PM, Chris Fage  wrote:
>> Dear CCP4BB Users,
>> 
>> I recently collected a number of datasets from plate-shaped crystals
>> that diffracted to 1.9-2.0 angstroms and yielded very nice electron
>> density maps. There is no major density unaccounted for by the model;
>> however, I am unable to decrease Rwork and Rfree beyond ~0.25 and
>> ~0.30, respectively. Probably due to the more 2-dimensional nature of
>> my crystals, there is a range of phi angles in which the reflections
>> are smeared, and I am wondering if the problem lies therein.
>> 
>> I would be grateful if anyone could provide advice for improving my
>> refinement statistics, as I was under the impression that the
>> R-factors should be ~5% lower for the given resolution.
>> 
>> A few more pieces of information:
>> -Space group = P21, with 2 monomers per asymmetric unit;
>> -Chi square = 1.0-1.5;
>> -Rmerge = 0.10-0.15;
>> -Data were processed in HKL2000 and refined in Refmac5 and/or phenix.refine;
>> -PHENIX Xtriage does not detect twinning, but hints at possible weak
>> translational pseudosymmetry;
>> -I was previously able to grow one atypically thick crystal which
>> diffracted to 1.65 angstroms with Rwork/Rfree at 0.18/0.22.
>> Unfortunately, the completeness of the dataset was only ~90%.
>> 
>> Regards,
>> Chris
> 


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-21 Thread Axel Brunger
Chris,

First, I would try NCS restraints even at ~ 2 A.

Second, are there any outliers in your diffraction data set that might skew the R values?

Third, have you checked that your refined bulk solvent model is reasonable? 

Axel




On Feb 21, 2014, at 6:13 PM, Chris Fage  wrote:

> Thanks for the assistance, everyone.
> 
> For those who suggested XDS: I forgot to mention that I have tried Mosflm, 
> which is also better at spot fitting than HKL2000. How does XDS compare to 
> Mosflm in this regard?
> 
> I am not refining the high R-factor structure with NCS options. Also, my unit 
> cell dimensions are 41.74 A, 69.27 A, and 83.56 A, so there isn't one 
> particularly long axis. 
> 
> I'm guessing the low completeness of the 1.65 angstrom dataset has to do with 
> obstacles the processing software encountered on a sizable wedge of frames 
> (there were swaths of red in HKL2000). I'm not sure why this dataset in 
> particular was less complete than the others.
> 
> Thanks,
> Chris
> 
> 
> On Fri, Feb 21, 2014 at 6:41 PM, Chris Fage  wrote:
> Dear CCP4BB Users,
> 
> I recently collected a number of datasets from plate-shaped crystals
> that diffracted to 1.9-2.0 angstroms and yielded very nice electron
> density maps. There is no major density unaccounted for by the model;
> however, I am unable to decrease Rwork and Rfree beyond ~0.25 and
> ~0.30, respectively. Probably due to the more 2-dimensional nature of
> my crystals, there is a range of phi angles in which the reflections
> are smeared, and I am wondering if the problem lies therein.
> 
> I would be grateful if anyone could provide advice for improving my
> refinement statistics, as I was under the impression that the
> R-factors should be ~5% lower for the given resolution.
> 
> A few more pieces of information:
> -Space group = P21, with 2 monomers per asymmetric unit;
> -Chi square = 1.0-1.5;
> -Rmerge = 0.10-0.15;
> -Data were processed in HKL2000 and refined in Refmac5 and/or phenix.refine;
> -PHENIX Xtriage does not detect twinning, but hints at possible weak
> translational pseudosymmetry;
> -I was previously able to grow one atypically thick crystal which
> diffracted to 1.65 angstroms with Rwork/Rfree at 0.18/0.22.
> Unfortunately, the completeness of the dataset was only ~90%.
> 
> Regards,
> Chris
> 



Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-21 Thread Jens Kaiser
Hi Chris,
  I personally would go with your "thick" dataset. 90% completeness is
not stellar, but in my opinion not detrimental, either. 
  I had one project that persistently yielded crystals that diffracted
to rather high resolution (2.3 A), but in one direction no lunes were
discernible and - consistent with that -  the other direction's
diffraction consisted of lines that had little beads on them - i.e.
extremely smeary spots. XDS was the only program to integrate this data
at an Rmerge better than 25% (it actually got below 10%).
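
For readers unfamiliar with the statistic quoted here, Rmerge is conventionally sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i over repeated measurements of each unique reflection. A minimal sketch follows; the input layout and function name are assumptions for illustration, and it is not the implementation used by XDS or any other package.

```python
from collections import defaultdict


def r_merge(observations):
    """Compute Rmerge from repeated intensity measurements.

    observations: iterable of (hkl, intensity), where hkl is the
                  symmetry-reduced unique index (hypothetical input format).
    Reflections measured only once are skipped, since they carry no
    information about agreement between measurements.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            continue
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)

    if denominator == 0:
        raise ValueError("no multiply-measured reflections with non-zero intensity")
    return numerator / denominator


# Toy data: two reflections measured twice, one measured once.
toy = [((1, 0, 0), 100.0), ((1, 0, 0), 120.0),
       ((0, 1, 0), 50.0), ((0, 1, 0), 50.0),
       ((0, 0, 1), 10.0)]
print(f"Rmerge = {r_merge(toy):.4f}")
```

Smeared spots inflate the spread of the I_i around their mean, which is how poorly integrated intensities end up as a high Rmerge.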
  I was able to phase this data experimentally (Fe-MAD), use NCS and end
up with amazing maps. Nevertheless, refinement was a bitch: It never
went significantly below 30% for Rfree and messed up the geometry of the
model, even though the electron density was clearly showing where the
model should be. My explanation for this was that this was a rare case
where the phases were actually determined better than the Fs. If you look
back, in the days before refinement, reflection intensities were not
measured, they were classified as weak, medium and strong - and that was
enough to generate meaningful electron densities. 
  In cases like that, where the accurate determination of integrated
intensities is a problem, there should be a mechanism to submit
experimental electron density instead of refined models, as the latter
will make way less sense.
  So again - you got lucky with your "thick" dataset -- use it and don't
sweat the 90% completeness!

HTH,

Jens

On Fri, 2014-02-21 at 18:41 -0600, Chris Fage wrote:
> Dear CCP4BB Users,
> 
> I recently collected a number of datasets from plate-shaped crystals
> that diffracted to 1.9-2.0 angstroms and yielded very nice electron
> density maps. There is no major density unaccounted for by the model;
> however, I am unable to decrease Rwork and Rfree beyond ~0.25 and
> ~0.30, respectively. Probably due to the more 2-dimensional nature of
> my crystals, there is a range of phi angles in which the reflections
> are smeared, and I am wondering if the problem lies therein.
> 
> I would be grateful if anyone could provide advice for improving my
> refinement statistics, as I was under the impression that the
> R-factors should be ~5% lower for the given resolution.
> 
> A few more pieces of information:
> -Space group = P21, with 2 monomers per asymmetric unit;
> -Chi square = 1.0-1.5;
> -Rmerge = 0.10-0.15;
> -Data were processed in HKL2000 and refined in Refmac5 and/or phenix.refine;
> -PHENIX Xtriage does not detect twinning, but hints at possible weak
> translational pseudosymmetry;
> -I was previously able to grow one atypically thick crystal which
> diffracted to 1.65 angstroms with Rwork/Rfree at 0.18/0.22.
> Unfortunately, the completeness of the dataset was only ~90%.
> 
> Regards,
> Chris


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-21 Thread Chris Fage
Thanks for the assistance, everyone.

For those who suggested XDS: I forgot to mention that I have tried Mosflm,
which is also better at spot fitting than HKL2000. How does XDS compare
to Mosflm in this regard?

I am not refining the high R-factor structure with NCS options. Also, my
unit cell dimensions are 41.74 A, 69.27 A, and 83.56 A, so there isn't one
particularly long axis.

I'm guessing the low completeness of the 1.65 angstrom dataset has to do
with obstacles the processing software encountered on a sizable wedge of
frames (there were swaths of red in HKL2000). I'm not sure why this
dataset in particular was less complete than the others.

Thanks,
Chris


On Fri, Feb 21, 2014 at 6:41 PM, Chris Fage  wrote:

> Dear CCP4BB Users,
>
> I recently collected a number of datasets from plate-shaped crystals
> that diffracted to 1.9-2.0 angstroms and yielded very nice electron
> density maps. There is no major density unaccounted for by the model;
> however, I am unable to decrease Rwork and Rfree beyond ~0.25 and
> ~0.30, respectively. Probably due to the more 2-dimensional nature of
> my crystals, there is a range of phi angles in which the reflections
> are smeared, and I am wondering if the problem lies therein.
>
> I would be grateful if anyone could provide advice for improving my
> refinement statistics, as I was under the impression that the
> R-factors should be ~5% lower for the given resolution.
>
> A few more pieces of information:
> -Space group = P21, with 2 monomers per asymmetric unit;
> -Chi square = 1.0-1.5;
> -Rmerge = 0.10-0.15;
> -Data were processed in HKL2000 and refined in Refmac5 and/or
> phenix.refine;
> -PHENIX Xtriage does not detect twinning, but hints at possible weak
> translational pseudosymmetry;
> -I was previously able to grow one atypically thick crystal which
> diffracted to 1.65 angstroms with Rwork/Rfree at 0.18/0.22.
> Unfortunately, the completeness of the dataset was only ~90%.
>
> Regards,
> Chris
>


Re: [ccp4bb] High Rwork/Rfree vs. Resolution

2014-02-21 Thread Pavel Afonine
Chris,

what you get is not unheard of, but clearly you are not in the majority: at
around 1.95 A resolution, the distribution of R-factors in the PDB is:

Histogram of Rwork for models in PDB at resolution 1.85-2.05 A:
 0.093 - 0.118  : 3
 0.118 - 0.143  : 75
 0.143 - 0.168  : 821
 0.168 - 0.193  : 2617
 0.193 - 0.218  : 2950
 0.218 - 0.242  : 1147
 0.242 - 0.267  : 201  <<< your case
 0.267 - 0.292  : 21
 0.292 - 0.317  : 2
 0.317 - 0.342  : 1
Histogram of Rfree for models in PDB at resolution 1.85-2.05 A:
 0.138 - 0.160  : 12
 0.160 - 0.183  : 106
 0.183 - 0.205  : 742
 0.205 - 0.227  : 1971
 0.227 - 0.249  : 2566
 0.249 - 0.272  : 1676
 0.272 - 0.294  : 616
 0.294 - 0.316  : 119   <<< your case
 0.316 - 0.339  : 24
 0.339 - 0.361  : 6
Histogram of Rfree-Rwork for all models in PDB at resolution 1.85-2.05 A:
 0.001 - 0.011  : 67
 0.011 - 0.021  : 428
 0.021 - 0.031  : 1324
 0.031 - 0.041  : 2220
 0.041 - 0.050  : 1975
 0.050 - 0.060  : 1059  <<< your case
 0.060 - 0.070  : 459
 0.070 - 0.080  : 201
 0.080 - 0.090  : 75
 0.090 - 0.100  : 30

Pavel

P.S.: Command to get the statistics as above is:
phenix.r_factor_statistics 1.95
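
The histograms above can be turned into a rough "how unusual is this" figure by summing the counts in the marked bin and every bin above it. The sketch below does that with the Rwork and Rfree counts copied from the listing; the helper name `tail_fraction` is hypothetical and is not part of PHENIX.

```python
# (low, high, count) per bin, copied from the histograms above.
RWORK_BINS = [
    (0.093, 0.118, 3), (0.118, 0.143, 75), (0.143, 0.168, 821),
    (0.168, 0.193, 2617), (0.193, 0.218, 2950), (0.218, 0.242, 1147),
    (0.242, 0.267, 201), (0.267, 0.292, 21), (0.292, 0.317, 2),
    (0.317, 0.342, 1),
]
RFREE_BINS = [
    (0.138, 0.160, 12), (0.160, 0.183, 106), (0.183, 0.205, 742),
    (0.205, 0.227, 1971), (0.227, 0.249, 2566), (0.249, 0.272, 1676),
    (0.272, 0.294, 616), (0.294, 0.316, 119), (0.316, 0.339, 24),
    (0.339, 0.361, 6),
]


def tail_fraction(bins, value):
    """Fraction of entries in the bin containing `value` and all higher bins."""
    total = sum(count for _, _, count in bins)
    tail = sum(count for _, high, count in bins if high > value)
    return tail / total


print(f"Rwork in or above the 0.25 bin: {tail_fraction(RWORK_BINS, 0.25):.1%}")
print(f"Rfree in or above the 0.30 bin: {tail_fraction(RFREE_BINS, 0.30):.1%}")
```

With these counts, roughly 3% of the entries in this resolution range have an Rwork in or above the marked bin, and roughly 2% have an Rfree in or above the marked bin.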


On Fri, Feb 21, 2014 at 4:41 PM, Chris Fage  wrote:

> Dear CCP4BB Users,
>
> I recently collected a number of datasets from plate-shaped crystals
> that diffracted to 1.9-2.0 angstroms and yielded very nice electron
> density maps. There is no major density unaccounted for by the model;
> however, I am unable to decrease Rwork and Rfree beyond ~0.25 and
> ~0.30, respectively. Probably due to the more 2-dimensional nature of
> my crystals, there is a range of phi angles in which the reflections
> are smeared, and I am wondering if the problem lies therein.
>
> I would be grateful if anyone could provide advice for improving my
> refinement statistics, as I was under the impression that the
> R-factors should be ~5% lower for the given resolution.
>
> A few more pieces of information:
> -Space group = P21, with 2 monomers per asymmetric unit;
> -Chi square = 1.0-1.5;
> -Rmerge = 0.10-0.15;
> -Data were processed in HKL2000 and refined in Refmac5 and/or
> phenix.refine;
> -PHENIX Xtriage does not detect twinning, but hints at possible weak
> translational pseudosymmetry;
> -I was previously able to grow one atypically thick crystal which
> diffracted to 1.65 angstroms with Rwork/Rfree at 0.18/0.22.
> Unfortunately, the completeness of the dataset was only ~90%.
>
> Regards,
> Chris
>