Re: [ccp4bb] Does ncs bias R-free? And if so, can it be avoided by special selection of the free set?

2019-06-01 Thread Jonathan Cooper
 I have done some more tests with different programs for choosing the R-free 
set in shells or at random and the results are at the same link:
https://www.ucl.ac.uk/~rmhajc0/rfreetests.pdf
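
For readers who want to try the comparison themselves, the two ways of picking the free set mentioned above (at random, or in thin resolution shells) can be sketched roughly as follows. This is only an illustration, not the code of any of the programs used in the tests: the function names, the 5% fraction, the number of shells and the equal-count binning are all assumptions made for the example.

```python
import random

def random_free_flags(n_refl, fraction=0.05, seed=0):
    """Flag about `fraction` of the reflections as free, chosen at random."""
    rng = random.Random(seed)
    n_free = round(n_refl * fraction)
    free = set(rng.sample(range(n_refl), n_free))
    return [i in free for i in range(n_refl)]

def shell_free_flags(inv_d2, fraction=0.05, n_shells=200):
    """Flag whole thin resolution shells as free.

    inv_d2 holds 1/d^2 for each reflection. Reflections are sorted by
    resolution, cut into n_shells equal-count shells (an assumption of
    this sketch), and every step-th shell is flagged as free.
    """
    order = sorted(range(len(inv_d2)), key=lambda i: inv_d2[i])
    step = round(1 / fraction)
    flags = [False] * len(inv_d2)
    for rank, idx in enumerate(order):
        shell = rank * n_shells // len(order)
        if shell % step == step - 1:
            flags[idx] = True
    return flags

# Hypothetical data: 10000 reflections with made-up 1/d^2 values.
rng = random.Random(1)
inv_d2 = [rng.uniform(0.0, 0.15) for _ in range(10000)]
random_flags = random_free_flags(len(inv_d2))
shell_flags = shell_free_flags(inv_d2)
```

Both schemes flag the same number of reflections here; they differ only in whether the free reflections are scattered through reciprocal space or grouped by resolution.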

There still seems to be no significant difference between the normal R-free and 
the R-free in shells, with up to 20-fold NCS present. I can't comment on 
twinning, but with NCS it would seem that the normal CCP4 way of picking the 
R-free set is as good as anything else!

On Sunday, 26 May 2019, 14:02:50 BST, dusan turk wrote:
 
 Dear colleagues,


> Does ncs bias R-free? And if so, can it be avoided by special selection of
> the free set?

It occurs to me that we tend to forget that the objective of structure 
determination is not the model with the lowest model bias, but the model which 
is closest to the true structure. The structure without model bias is the 
structure without a model - which is not really helpful.

An angle on the NCS issue is provided by the work of Silva & Rossmann (1985, 
Acta Cryst B41, 147-157), who discarded most of the data, almost in proportion 
to the level of NCS redundancy (using 1/7th for WORK set and 6/7 for TEST set 
in the case of 10-fold NCS). They did it in the 1980s in order to make 
refinement of their large structure computationally feasible: “Despite the 
reduction in the number of variables imposed by the non-crystallographic 
constraints, the problem remained a formidable one if all 298615 
crystallographically independent reflections were to be used in the 
refinement. However, the reduction of size of the asymmetric unit in real 
space should be equivalent to a corresponding reduction in reciprocal space. 
Hence, one-tenth of refinement of the independent data might suffice for 
refinement.” In conclusion they stated that “This is the first time that the 
structure of a complete virus has been refined by a reciprocal-space method.” 
To conclude, to select an independent data set to refine against, one should 
take an n-th fraction of reflections from the data set containing the n-fold 
NCS.
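
As a small illustration of that last sentence, taking an n-th fraction of the reflections for n-fold NCS could look like the sketch below. The quoted passage does not say how the subset was actually chosen, so the random selection, the function name and the seed are assumptions of this example only.

```python
import random

def ncs_work_subset(n_refl, ncs_fold, seed=0):
    """Return sorted indices of a 1/ncs_fold fraction of the reflections.

    Random selection is an assumption of this sketch; the source does not
    describe how the reduced data set was picked.
    """
    rng = random.Random(seed)
    n_work = n_refl // ncs_fold
    return sorted(rng.sample(range(n_refl), n_work))

# Numbers from the quoted passage: 298615 reflections, 10-fold NCS.
work_indices = ncs_work_subset(298615, 10)
```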

Now on the bias of the concept of R-free itself. As we know, each term in the 
Fourier series is orthogonal to all other terms, hence the projection of any 
two terms on each other is zero. We also know that the diffraction pattern of 
a crystal structure is composed of Iobs which reflect Fobs. Fobs are a Fourier 
series of terms. From the measured set of Iobs we can directly calculate 
|Fobs|, but not their phase. To calculate the phase in refinement we use 
Fmodel structure factors, of which the most significant part are Fcalc 
calculated from the atomic model. However, as the model is changed during 
model building and refinement (atomic positions, B-factors and occupancies), 
all Fmodel structure factors change in size and in phase angle.
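
The orthogonality statement can be checked numerically. The sketch below is a one-dimensional toy case, assuming a unit cell sampled on a regular grid; the function name and grid size are made up for the illustration.

```python
import cmath

def fourier_inner_product(h1, h2, n_grid=64):
    """Mean over one 1D unit cell of exp(2*pi*i*h1*x) * conj(exp(2*pi*i*h2*x))."""
    total = 0j
    for j in range(n_grid):
        x = j / n_grid
        term1 = cmath.exp(2j * cmath.pi * h1 * x)
        term2_conj = cmath.exp(-2j * cmath.pi * h2 * x)
        total += term1 * term2_conj
    return total / n_grid

different_terms = fourier_inner_product(3, 5)  # two distinct terms
same_term = fourier_inner_product(4, 4)        # a term with itself
```

The projection of two distinct terms on each other vanishes to within rounding error, while a term projected on itself gives 1.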

During refinement using a cross-validated maximum likelihood target function, 
the atomic model is fitted against the selected subset of |Fobs|, called the 
WORK set, using a corresponding subset of Fmodel. The remaining part of the 
Fmodel structure factors, called the TEST set, is used to calculate the 
weighted terms used in refinement and is based on phase error estimates. This 
Fmodel fraction equally depends on attributes of all atoms of the model. As a 
consequence, the TEST fraction of Fmodel structure factors is model dependent. 
Now comes the catch: if the TEST fraction of structure factors (Fobs) were 
truly independent of the model, then it should remain so during the refinement 
as well. As a consequence and simultaneous proof of this independence, the 
R-free should not be affected by refinement. As we know, this holds only for 
incorrect structure solutions. Their atoms are refined in directions that do 
not lead towards the true structure. As soon as a structure solution is 
correct, its improvements will lower R-free because the model is related to 
the true crystal structure. This is in my opinion the only true value of the 
R-free gap criterion. The problems are that use of the WORK subset makes 
refinement aim off the true target and that the use of the TEST fraction for 
estimating phase error correctness is an approximation not justified by the 
claim of independence of the TEST set. I do not want to undermine the 
historical importance of the TEST set use for refinement and structure 
validation; however, we need to, and can, do better.
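
For reference, the quantities being discussed can be written down in a few lines. This is a minimal sketch of the conventional R-factor computed separately over the WORK and TEST reflections; it assumes the calculated amplitudes are already on the same scale as the observed ones, and the numbers are invented for the example.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

def r_work_and_free(f_obs, f_calc, free_flags):
    """Split amplitudes by the free flag and return (R-work, R-free)."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in test], [fc for _, fc in test])
    return r_work, r_free

# Invented amplitudes; the last reflection belongs to the TEST set.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 60.0, 50.0]
free_flags = [False, False, False, True]
r_work, r_free = r_work_and_free(f_obs, f_calc, free_flags)
r_free_gap = r_free - r_work
```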

As shown by Silva & Rossmann in 1985, the concept of independence of a TEST 
subset fraction of Fobs structure factors is not true for structures composed 
of equal copies of molecules present in the asymmetric unit of a crystal 
(crystals with NCS). The same reasoning can be applied to twinned data sets. 
However, detwinning is model dependent, hence the claim of independence of the 
TEST and WORK subsets of Fobs structure factors actually fails due to the 
dependency of the Fmodel WORK and TEST subsets.

The significant part of model bias originates from the use of chemical 
restraints in refinement that affect positions of 

Re: [ccp4bb] tNCS incompatible with cell dimensions

2019-06-01 Thread Jonathan Cooper
Does the SAXS model contain more than one subunit? If so, I would be tempted to 
go back to the model and try each one separately. This may not apply, but if 
there are monomers in the SAXS model that are related by space group symmetry 
in the crystal, I think the MR would never work. Good luck with it!

Bests, Jon. Cooper

 
On Sat, 1 Jun 2019 at 9:45, Jrh Gmail wrote:

Dear Kevin
You could try reindexing into P1, then run Phaser and with its solution as 
input to Zanuda determine the space group.
Best wishes,
John

Emeritus Professor of Chemistry John R Helliwell DSc_Physics 



On 31 May 2019, at 21:09, Kevin Jude  wrote:


Hello community, I wonder if I could solicit advice about a problematic 
dataset. I plan to solve the structure by molecular replacement and expect that 
the protein is relatively compact, ie not elongated. SAXS data supports this 
expectation.

The crystals diffract to 2.6 Å resolution and appear to be in P 21 21 2 with a 
= 49, b = 67, c = 94, which should fit <=2 molecules in the ASU with 40% 
solvent. The native Patterson shows a large peak (12 sigma) suggesting a tNCS 
vector of {0.5, 0.5, 0}.

If you're sharper than me, you may have already spotted the problem - c is the 
long axis of the unit cell, but tNCS constrains the proteins to a plane 
parallel to the a,b plane. Indeed, molecular replacement attempts using Phaser 
will not give a solution in any orthorhombic space group unless I turn off 
packing, and then I get large overlaps in the a,b plane and huge gaps along c.

Since I believe that my model is good (or at least the correct shape, based on 
SAXS), I wonder if I'm misinterpreting my crystallographic data. Any insights 
into how to approach this problem would be much appreciated.
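
The cell-content estimate quoted above (up to 2 molecules in the ASU at 40% solvent) can be reproduced with a Matthews-type calculation. The sketch below is an illustration only: the molecular weight is not given in the message, so it is back-calculated from the stated solvent content, and the constant 1.23 is the usual approximation relating the Matthews coefficient to the solvent fraction of a protein crystal. P 21 21 2 has 4 symmetry operators.

```python
def matthews(cell_volume, n_symops, n_mol_asu, mol_weight):
    """Return (Matthews coefficient in A^3/Da, approximate solvent fraction)."""
    vm = cell_volume / (n_symops * n_mol_asu * mol_weight)
    solvent = 1.0 - 1.23 / vm  # conventional approximation for proteins
    return vm, solvent

def implied_mol_weight(cell_volume, n_symops, n_mol_asu, solvent):
    """Molecular weight (Da) per molecule implied by a given solvent fraction."""
    vm = 1.23 / (1.0 - solvent)
    return cell_volume / (n_symops * n_mol_asu * vm)

# Orthorhombic cell from the message: a = 49, b = 67, c = 94 (Angstrom).
volume = 49 * 67 * 94
# Hypothetical: weight per molecule consistent with 2 copies at 40% solvent.
mw = implied_mol_weight(volume, n_symops=4, n_mol_asu=2, solvent=0.40)
vm, solvent = matthews(volume, n_symops=4, n_mol_asu=2, mol_weight=mw)
```

Substituting the real molecular weight for the back-calculated `mw` gives the actual solvent content for 1 or 2 copies.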
--
Kevin Jude, PhD
Structural Biology Research Specialist, Garcia Lab
Howard Hughes Medical Institute
Stanford University School of Medicine
Beckman B177, 279 Campus Drive, Stanford CA 94305
Phone: (650) 723-6431


To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB=1





Re: [ccp4bb] tNCS incompatible with cell dimensions

2019-06-01 Thread Jrh Gmail
Dear Kevin
You could try reindexing into P1, then run Phaser and with its solution as 
input to Zanuda determine the space group. 
Best wishes,
John 

Emeritus Professor of Chemistry John R Helliwell DSc_Physics 









Re: [ccp4bb] tNCS incompatible with cell dimensions

2019-06-01 Thread Kevin Jude
Thanks Diana - Indexing on strong reflections (STRONG_PIXEL=50 in xds) does
identify C222 as a possibility with the same dimensions as the P222 cell.
This doesn't solve my problem, though, since the centering operation just
replaces the tNCS and doesn't relieve the crowding.
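
To make the link between the tNCS vector and the C-centering explicit, here is a rough sketch of the intensity modulation expected for two identical, identically oriented copies related by a pure translation. It is an idealised case with an assumed function name; with real tNCS the copies differ slightly, so the affected reflections are weak rather than exactly absent.

```python
import cmath

def tncs_intensity_factor(h, k, l, t=(0.5, 0.5, 0.0)):
    """|1 + exp(2*pi*i * (h*tx + k*ty + l*tz))|^2 for two copies related by t."""
    phase = 2 * cmath.pi * (h * t[0] + k * t[1] + l * t[2])
    return abs(1 + cmath.exp(1j * phase)) ** 2

even_class = tncs_intensity_factor(1, 1, 3)  # h + k even
odd_class = tncs_intensity_factor(1, 0, 2)   # h + k odd
```

With t = {0.5, 0.5, 0} the h + k odd reflections vanish in this idealised case, which is exactly the C-centering condition, so the centered cell describes the same packing as the tNCS.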

Best wishes
Kevin
--
Kevin Jude, PhD
Structural Biology Research Specialist, Garcia Lab
Howard Hughes Medical Institute
Stanford University School of Medicine
Beckman B177, 279 Campus Drive, Stanford CA 94305
Phone: (650) 723-6431


On Fri, May 31, 2019 at 1:35 PM Diana Tomchick <
diana.tomch...@utsouthwestern.edu> wrote:

> Your native Patterson indicates pseudo C-centering. Are you sure you don’t
> have space group C222(1)?
>
> If your space group is correct, it’s still pseudo C-centered. You should
> see that in the intensity-weighted reciprocal lattice.
>
> You could try re-indexing on just the most intense spots to give you a
> data set indexed in a C-centered lattice. Use that data to solve via MR,
> then convert to the data indexed in the actual space group.
>
> Diana
>
> **
> Diana R. Tomchick
> Professor
> Departments of Biophysics and Biochemistry
> UT Southwestern Medical Center
> 5323 Harry Hines Blvd.
> Rm. ND10.214A
> Dallas, TX 75390-8816
> diana.tomch...@utsouthwestern.edu
> (214) 645-6383 (phone)
> (214) 645-6353 (fax)
>


