Jacob is right; there definitely seem to be problems with the data. Perhaps you and your supervisor should consider privately contacting the developers of the data processing programs who have participated in this thread, such as Kay and Harry (and perhaps others too), to try to get the best out of your data. There is a limit to what refinement programs can do when a real problem in the data is not taken care of properly.

My 2p thoughts.

         Boaz

 
 
Boaz Shaanan, Ph.D.                                        
Dept. of Life Sciences                                     
Ben-Gurion University of the Negev                         
Beer-Sheva 84105                                           
Israel                                                     
                                                           
E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan                 
Fax:   972-8-647-2992 or 972-8-646-1710    
 
 
                


From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Chris Fage [cdf...@gmail.com]
Sent: Monday, February 24, 2014 12:52 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] High Rwork/Rfree vs. Resolution

Thanks again for the advice, everyone.

As suggested, I tried NCS and TLS in phenix.refine, although my R-factors did not budge.

I am now giving PDB_REDO and simulated annealing in PHENIX a shot. I am also looking into setting up XDS.

Forgive my ignorance, but I am not sure how to check whether the bulk solvent model is reasonable.

For these crystals, HKL2000 does invariably report high mosaicity along one axis (it is in the "red").

Yes, the structure was solved by MR. For the 1.65-angstrom map, the model is very complete, with density missing only for the N-terminal 6xHis tag and first three residues, as well as 5-10 other residues on flexible loops (the protein is ~300 residues, including the tag). Most side chains are well resolved. The quality of the 1.90-angstrom map is lower, with more gaps, more noise, and less side-chain coverage. In each map, there is no remaining density that legitimately needs to be filled.

I have attached representative frames and relevant details from the HKL2000 scale logs. (Note that the 1.65-A set was originally scaled to 1.53 A.)

As for making the datasets available before publication, I would have to check with my supervisor. The idea might not fly with him, as the structure is expected to be of relatively high impact.

Best,
Chris



On Sat, Feb 22, 2014 at 3:00 AM, Francis Reyes <francis.re...@colorado.edu> wrote:

>
> I'm guessing the low completeness of the 1.65 angstrom dataset has to do with obstacles the processing software encountered on a sizable wedge of frames (there were swaths of red in HKL2000). I'm not sure why this dataset in particular was less complete than the others.


This is bad. Large swaths of red circles during integration are bad. I believe (check the Denzo manual) this means overlaps, and overlaps get thrown out, which is why you are getting lower completeness. Was your oscillation range too large? Is the crystal very mosaic?

However, this could be due to a poor crystal orientation matrix from HKL2000, which in some cases can be alleviated by using MOSFLM or XDS. (HKL2000 is much more manual; there are a lot of buttons, which means you can shoot yourself in the foot if you are not careful.)

I would be particularly interested in a resolution bin breakdown of the integration and merging statistics (I/sigma and Rmerge). You might as well post the refinement statistics (R and Rfree) by resolution bin as well.

You have a smallish unit cell that diffracts to high resolution, so getting reasonable completeness in the low-resolution bins is paramount. Post the completeness of the 20-10 A bin.
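
For anyone who wants to check a shell by hand rather than read it off the scaling log, here is a minimal Python sketch of what "completeness of a resolution bin" means. It is only an illustration, not how HKL2000 or any other package computes it. The function name and its inputs are hypothetical: it assumes you already have the d-spacing of every unique reflection that is theoretically possible for your cell and space group, plus a flag saying whether each one was measured.

```python
def completeness_by_shell(d_spacings, observed, shells):
    """Return {(d_max, d_min): fraction observed} for each resolution shell.

    d_spacings: d-spacing (angstroms) of every theoretically possible
                unique reflection (hypothetical input, not read from any
                particular program's output).
    observed:   parallel list of booleans, True if the reflection was measured.
    shells:     list of (d_max, d_min) tuples, e.g. [(20.0, 10.0)].
    """
    result = {}
    for d_max, d_min in shells:
        # Collect the observed flags of reflections falling in this shell.
        in_shell = [obs for d, obs in zip(d_spacings, observed)
                    if d_min <= d < d_max]
        # An empty shell has no defined completeness.
        result[(d_max, d_min)] = (sum(in_shell) / len(in_shell)
                                  if in_shell else None)
    return result


# Toy numbers, made up purely to exercise the function.
d = [18.0, 15.0, 12.0, 10.5, 5.0, 3.0, 2.0, 1.7]
seen = [True, False, True, False, True, True, True, False]
print(completeness_by_shell(d, seen, [(20.0, 10.0), (10.0, 1.65)]))
```

The point of binning this way is that an overall completeness figure can hide a badly incomplete low-resolution shell, which is the one that matters most here.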

Is this molecular replacement? How complete is the model? Aside from the completeness of the model, how far is it from the target?

You mentioned that some regions of your crystal had smeary spots. This is also bad, particularly if the errors are not random (i.e., anisotropic along one axis). This will confuse ML refinement. Let's see a single frame of your data.

Cheers,
F


