I’ve been working on a troublesome protein structure.  The native protein forms 
crystals that diffract to 2.75A and belong to P212121 (55.179 64.316 233.748 
90.000 90.000 90.000) with 4 molecules in the ASU.  I have 3 versions of the 
same protein where selenomethionine mutations are incorporated at different 
positions.  Interestingly, these mutations cause the protein to form crystals 
belonging to C2221 (56.130 64.665 240.854 90.000 90.000 90.000).  Looking back 
at the native datasets, Xtriage indicates the largest Patterson peak is (0.5, 
0.486, 0), height is 11.8% of the origin peak, and p_value(height) is 0.08549, 
which is just outside of the threshold for being identified as containing 
pseudotranslation.  Datasets from a couple of the selenomethionine-incorporated 
crystals diffract to ~3.5A, with anomalous signal to ~6.7A.  Some of the 
datasets give a solution with reasonable maps, but the best maps come from 
combining the MAD/SAD selenomethionine datasets with one from a 
mercury-derivatized crystal using the ‘group’ command in Phenix.
---------------------------------------------------------------------------
5:     
STEP: finished
Top solution: # 39 Dataset #0
BAYES-CC: 69.2 +/- 13.4   FOM: 0.6
Built: 219 Side-chains: 44 Chains: 9   CC: 0.74


 Score type:       SKEW    CORR_RMS    NCS_OVERLAP
 Raw scores:        0.41      0.82      0.00  
 100x EST OF CC:   69.17     43.34     31.25  

---------------------------------------------------------------------------
Maps from this solution show connected electron density that looks like 
helices, consistent with the predicted secondary structure.  Strangely, there 
is absolutely no side-chain density, only C-beta at most.  I can build a 
poly-ala model into the map, and the distances between the heavy atom sites 
appear correct based upon the known positions of the selenomethionines and the 
single cysteine in the protein sequence.  However, the model does not refine.  
R-free starts and remains near 0.45.  I’ve tried indexing in lower symmetry 
space groups (P2, C2, P1) and re-solving by molecular replacement, but the 
refinement still fails.

Xtriage does not indicate twinning.

Twinning and intensity statistics summary (acentric data):
Statistics independent of twin laws
  <I^2>/<I>^2 : 2.305
  <F>^2/<F^2> : 0.758
  <|E^2-1|>   : 0.788
  <|L|>, <L^2>: 0.501, 0.333
  Multivariate Z score L-test: 1.643

 The multivariate Z score is a quality measure of the given
 spread in intensities. Good to reasonable data are expected
 to have a Z score lower than 3.5.
 Large values can indicate twinning, but small values do not
 necessarily exclude it.
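
As a rough sanity check on the statistics above, here is a minimal Python sketch that compares each observed value with the textbook expectations for untwinned and perfectly twinned acentric data. The reference numbers are the standard theoretical values, not something taken from the Xtriage output, and the names `EXPECTED`, `OBSERVED` and `closer_to` are made up for illustration.

```python
# Theoretical expectations for acentric data: (untwinned, perfect twin).
# These are standard reference values, assumed here for comparison only.
EXPECTED = {
    "<I^2>/<I>^2": (2.0, 1.5),
    "<F>^2/<F^2>": (0.785, 0.885),
    "<|E^2-1|>": (0.736, 0.541),
    "<|L|>": (0.5, 0.375),
    "<L^2>": (0.333, 0.2),
}

# Values copied from the Xtriage summary above.
OBSERVED = {
    "<I^2>/<I>^2": 2.305,
    "<F>^2/<F^2>": 0.758,
    "<|E^2-1|>": 0.788,
    "<|L|>": 0.501,
    "<L^2>": 0.333,
}


def closer_to(observed, untwinned, twinned):
    """Report which theoretical value the observed statistic lies nearer to."""
    if abs(observed - untwinned) <= abs(observed - twinned):
        return "untwinned"
    return "twinned"


verdicts = {name: closer_to(OBSERVED[name], *EXPECTED[name]) for name in EXPECTED}

for name, verdict in verdicts.items():
    untwinned, twinned = EXPECTED[name]
    print(f"{name:12s} observed={OBSERVED[name]:.3f} "
          f"untwinned={untwinned:.3f} twinned={twinned:.3f} -> {verdict}")
```

Every statistic lands nearer the untwinned expectation, which agrees with the Xtriage verdict.  The second moment of 2.305 is somewhat above the untwinned value of 2.0; that is the direction one would expect if pseudotranslation were modulating the intensities, although this comparison alone cannot show it.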

One possible clue as to what is going on comes from analysis of SOLVE results. 
I was analyzing whether SOLVE/PHENIX solutions were related to one another by 
various origin shifts and came across one particular SOLVE run from a SeMet SAD 
dataset in C2221 that gave good statistics for a solution (FOM=0.57, 
Z-score=20.26, peak height between 7.1 and 10.8 for 4 SeMet sites to 3.8A).  
The maps, however, looked poor.  What was interesting, though, was that 2 of 
the Se sites matched well with where I was expecting the Se sites in one 
molecule in the asymmetric unit.  The other two matched with where I would 
expect the sites in the other molecule when the model is shifted by one half 
the unit cell distance along the ‘a’ or ‘b’ axis.
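
One way to put numbers on this: the native Patterson peak at (0.5, 0.486, 0) sits close to (1/2, 1/2, 0), the translation that is an exact C-centering in C2221.  The sketch below, in Python with NumPy, converts the difference into Angstroms using the native P212121 cell.  The helper names `wrap` and `offset_from` are made up for illustration, and the cell is treated as orthogonal because all three angles are 90 degrees.

```python
import numpy as np

# Native P212121 cell edges in Angstrom (all angles are 90 degrees).
CELL = np.array([55.179, 64.316, 233.748])


def wrap(frac):
    """Wrap fractional coordinate differences into the range [-0.5, 0.5)."""
    frac = np.asarray(frac, dtype=float)
    return frac - np.floor(frac + 0.5)


def offset_from(vector, reference, cell=CELL):
    """Distance in Angstrom between two fractional vectors in an orthogonal cell."""
    delta = wrap(np.asarray(vector, dtype=float) - np.asarray(reference, dtype=float))
    return float(np.linalg.norm(delta * cell))


patterson_peak = (0.5, 0.486, 0.0)  # largest off-origin peak reported by Xtriage
c_centering = (0.5, 0.5, 0.0)       # exact lattice translation in C2221

print("distance from (1/2, 1/2, 0):",
      round(offset_from(patterson_peak, c_centering), 2), "A")

# For comparison: pure half-cell shifts along a single axis.
for name, shift in {"a/2": (0.5, 0.0, 0.0), "b/2": (0.0, 0.5, 0.0)}.items():
    print(f"distance from {name}:", round(offset_from(patterson_peak, shift), 2), "A")
```

The peak lies about 0.9 A from the exact centering vector (0.014 x 64.316 A along b), far closer than to a pure a/2 or b/2 shift.  The same `offset_from` helper could be applied to pairs of Se sites to test whether their difference vector matches a candidate translation, given their fractional coordinates.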

I’d appreciate any advice as to what might be happening, how I might go about 
detecting the problem, and how to deal with the data.

Brent Hamaoka
UC San Diego
9500 Gilman Drive 0375
La Jolla, CA 92093
