Re: [ccp4bb] Far to good r-factors

2010-06-01 Thread Ian Tickle
On Mon, May 31, 2010 at 9:15 PM, Dale Tronrud det...@uoxray.uoregon.edu wrote:
>    One of the great mysteries of refinement is that a model created using
>  high resolution data will fit a low resolution data set much better than
>  a model created only using the low resolution data.  It appears that there
>  are many types of errors that degrade the fit to low resolution data that
>  can only be identified and fixed by using the information from high
>  resolution data.

Is it such a mystery?  Isn't it just a case of overfitting to the
experimental errors in the low res data if you tried to use the same
parameterization and restraint weighting as for the high res refinement?
Consequently you are forced to use fewer parameters and/or higher
restraint weighting at low res which obviously is not going to give as
good a fit.

Cheers

-- Ian


Re: [ccp4bb] Far to good r-factors

2010-06-01 Thread Dale Tronrud
   This would be a possible explanation, and certainly is a problem with
low resolution refinements, but the free R indicates that overfitting
is not the problem here.  (I'm assuming that the proper choice of test
set has been made in this case.)  In my experience, for very isomorphous
pairs of structures, when a high resolution model is used as the starting
point for a low resolution refinement, even the R values before refinement
will be very good and that means fitting the noise can't be the cause.

   Our methods today are simply not as good at fitting low resolution data
in the absence of high resolution data as they are in its presence.

Dale Tronrud



Re: [ccp4bb] Far to good r-factors

2010-05-31 Thread Gregory Bowman

Paul,

Does your lower resolution structure have the same unit cell as the
model used for MR? If your two crystals are the same except for the
presence of the ligand, then you need to make sure to keep the same
Rfree set for both. Otherwise, some reflections that were previously in
the Rwork set will now be in the Rfree set, but they are not really
free because the model was previously refined using these reflections.
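
A minimal sketch of what keeping the same Rfree set means in practice, assuming both data sets are indexed consistently and reduced to the same asymmetric unit. The function name `transfer_free_flags` and the dictionary layout are hypothetical and only illustrate the idea; in real work the flags would be copied with the tools of your crystallographic suite.

```python
import random
from typing import Dict, Iterable, Tuple

Miller = Tuple[int, int, int]


def transfer_free_flags(reference_flags: Dict[Miller, bool],
                        new_indices: Iterable[Miller],
                        free_fraction: float = 0.05,
                        seed: int = 0) -> Dict[Miller, bool]:
    """Copy free-R flags from a reference data set onto a new one.

    Reflections present in the reference keep their flag, so nothing that
    was used for refinement before ends up in the new test set.
    Reflections absent from the reference (an assumption: they were never
    refined against) get a fresh random flag.
    """
    rng = random.Random(seed)
    flags: Dict[Miller, bool] = {}
    for hkl in new_indices:
        if hkl in reference_flags:
            flags[hkl] = reference_flags[hkl]
        else:
            flags[hkl] = rng.random() < free_fraction
    return flags
```

Here the 1.9 A data set would supply `reference_flags`, and the indices of the 3.0 A data set would be passed as `new_indices`.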


Greg


--
Department of Biophysics
Johns Hopkins University
302 Jenkins Hall
3400 N. Charles St.
Baltimore, MD 21218
Phone: (410) 516-7850 (office)
Phone: (410) 516-3476 (lab)
Fax: (410) 516-4118
gdbow...@jhu.edu





Re: [ccp4bb] Far to good r-factors

2010-05-31 Thread Pavel Afonine
This reminded me of another thing... Did you create the free-R flags
considering twinning? This is very important.
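
A rough sketch of the idea, assuming a hemihedral twin whose twin operator is its own inverse and maps each index onto another index in the same list. The function `twin_aware_free_flags` and the example operator are hypothetical; the point is only that twin-related reflections always land in the same set, so the test set is not coupled to the working set through the twin law.

```python
import random
from typing import Callable, Dict, Iterable, Tuple

Miller = Tuple[int, int, int]


def twin_aware_free_flags(indices: Iterable[Miller],
                          twin_op: Callable[[Miller], Miller],
                          free_fraction: float = 0.05,
                          seed: int = 0) -> Dict[Miller, bool]:
    """Assign free-R flags so that twin mates share the same flag.

    Assumes twin_op(twin_op(hkl)) == hkl and that twin_op returns an
    index already mapped to the asymmetric unit used in `indices`.
    """
    index_list = list(indices)
    index_set = set(index_list)
    rng = random.Random(seed)
    flags: Dict[Miller, bool] = {}
    for hkl in index_list:
        if hkl in flags:
            continue  # already flagged as the twin mate of an earlier index
        is_free = rng.random() < free_fraction
        flags[hkl] = is_free
        mate = twin_op(hkl)
        if mate in index_set:
            flags[mate] = is_free
    return flags


def example_twin_op(hkl: Miller) -> Miller:
    """Hypothetical twin law (h, k, l) -> (k, h, -l), for illustration only."""
    h, k, l = hkl
    return (k, h, -l)
```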


Pavel.







[ccp4bb] Far to good r-factors

2010-05-30 Thread Paul Lindblom
Hi everybody,

once more I need your help. I solved the structure of an enzyme at a
resolution of 1.9 A. I then tried to get a complex by soaking some ligand
into my crystals. I could solve the structure (and see poor density for my
ligand or something else) at 3.0 A by molecular replacement using my 1.9 A
structure as a starting model. The problem now is that I got an R-work of
16.34% and an R-free of 20.23% for the new 3.0 A structure, without adding
any waters or solvent/ligand molecules. These R-factors are even better than
the ones I got for the 1.9 A structure, so I think something is wrong with
the whole thing. I observed twinning in both data sets and used the twin
refinement option in refmac, but the results stay more or less the same.

Any suggestions what to do? Thanks a lot,

Paul


Re: [ccp4bb] Far to good r-factors

2010-05-30 Thread Vellieux Frederic

Hi Paul,

I've seen that type of behaviour before for low resolution structures.
On such structures, one of two things happens:

either I have a very hard time getting good geometry, good R-factors and
satisfactory electron density all at the same time,

or things go very smoothly and all the statistics (model geometry,
R-factors) plus the electron density are fine.


Too bad I have no way of predicting when things will be going well.

Two examples where things went very smoothly:

glyceraldehyde-phosphate dehydrogenase from Trypanosoma brucei brucei
(PDB id: 2X0N, re-refined fairly recently with Phenix);
malate dehydrogenase from Archaeoglobus fulgidus (PDB id: 2X0I, also
re-refined with Phenix).


The only thing you have to check is that the relative weighting of the 
X-ray term vs. the geometry term is appropriate, so that you do not 
lower the R-factors while the geometry is getting worse.


HTH,

Fred.



Re: [ccp4bb] Far to good r-factors

2010-05-30 Thread Pavel Afonine

Hi Paul,

another hypothesis...

If you take an ultra-high resolution structure from the PDB (resolution
better than 1.0 A), cut the data at 3 A and do some refinement, you will
get unusually low R-factors.

This may suggest that your crystal can diffract to higher resolution and
that 3 A is not the limit.


However, you mentioned twinning, and I guess it would be wise to double
check how the R-factor is computed in this case, and how the total model
structure factor is defined. I recall some of Garib's comments about
differences in R-factor values related to twinning...
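
To make the question concrete, here is a sketch of one common convention for a hemihedral twin with twin fraction alpha: the model intensity is the alpha-weighted sum of the calculated intensities of the two twin-related reflections, and the R-factor compares its square root with the observed amplitude. This is an assumption for illustration, not a description of what refmac or any other program does internally; the function names are hypothetical.

```python
import math
from typing import Callable, Dict, Tuple

Miller = Tuple[int, int, int]


def twinned_amplitudes(f_calc: Dict[Miller, float],
                       twin_op: Callable[[Miller], Miller],
                       alpha: float) -> Dict[Miller, float]:
    """Model amplitudes for a hemihedral twin with twin fraction alpha.

    I_twin(h) = (1 - alpha) * |Fc(h)|^2 + alpha * |Fc(h')|^2,
    where h' = twin_op(h). Reflections whose twin mate is missing
    from f_calc are skipped.
    """
    out: Dict[Miller, float] = {}
    for hkl, f in f_calc.items():
        mate = twin_op(hkl)
        if mate not in f_calc:
            continue
        i_twin = (1.0 - alpha) * f ** 2 + alpha * f_calc[mate] ** 2
        out[hkl] = math.sqrt(i_twin)
    return out


def r_factor(f_obs: Dict[Miller, float],
             f_model: Dict[Miller, float]) -> float:
    """R = sum|Fobs - Fmodel| / sum(Fobs) over the common reflections."""
    common = f_obs.keys() & f_model.keys()
    numerator = sum(abs(f_obs[h] - f_model[h]) for h in common)
    denominator = sum(f_obs[h] for h in common)
    return numerator / denominator
```

Comparing `r_factor` computed against the untwinned `f_calc` and against `twinned_amplitudes(...)` shows how much the reported value depends on which definition of the model amplitude is used.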


In addition, to make sure that what you observe does not depend on the
refinement program or on how the R-factor and Fmodel are defined, I would
try to re-compute the R-factors or do some quick refinement in another
package. For example, does this command:


phenix.model_vs_data model.pdb data.mtz

give you the same R-factors?
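
For reference, the two numbers being compared can be sketched as follows. This is a deliberately simplified version, assuming a single overall least-squares scale factor and no bulk-solvent or anisotropic correction, so it will not reproduce the values of a real refinement program; the function name `r_work_and_free` is hypothetical.

```python
from typing import Sequence, Tuple


def r_work_and_free(f_obs: Sequence[float],
                    f_calc: Sequence[float],
                    is_free: Sequence[bool]) -> Tuple[float, float]:
    """Return (R-work, R-free) as fractions; multiply by 100 for percent.

    Uses one overall scale k = sum(Fo * Fc) / sum(Fc^2), fitted on all
    reflections (a simplifying assumption).
    """
    k = (sum(o * c for o, c in zip(f_obs, f_calc))
         / sum(c * c for c in f_calc))

    def r_for(select_free: bool) -> float:
        numerator = 0.0
        denominator = 0.0
        for o, c, free in zip(f_obs, f_calc, is_free):
            if free == select_free:
                numerator += abs(o - k * c)
                denominator += o
        return numerator / denominator

    return r_for(False), r_for(True)
```

Both sets must contain at least one reflection, otherwise the division fails.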

Good luck!
Pavel.

