Re: [ccp4bb] How many is too many free reflections?
Dear Dusan, Following up on Gerard's comment, we also read your nice paper with great interest. Your method appears most useful for cases with a limited number of reflections (e.g., small unit cell and/or low resolution) resulting in 5% test sets with fewer than 1000 reflections in total. It improves the performance of your implementation of ML refinement for the cases that you described. However, we don't think that you can conclude that cross-validation is no longer needed. To quote the Discussion section of your paper: "To address the use of Rfree as an indicator of wrong structures, we repeated the Kleywegt and Jones experiment (Kleywegt & Jones, 1995; Kleywegt & Jones, 1997) and built the 2ahn structure in the reverse direction and then refined it in the absence of solvent using the ML CV and ML FK approaches. Fig. 9 shows that Rfree stayed around 50% and Rfree-Rwork around 15% in the case of the reverse structure regardless of the ML approach and the fraction of data used in the test set. These values indicate that there is a fundamental problem with the structure, which supports the further use of Rfree as an indicator." Thank you for reaffirming the utility of the statistical tool of cross-validation. The reverse chain trace of 2ahn is admittedly an extreme case of misfitting, and these days it would probably be detected with other validation tools as well. However, the danger of overfitting or misfitting is still a very real possibility for large structures, especially when only moderate to low resolution data are available, even with today's tools. Cross-validation can help even at very low resolution: in Structure 20, 957-966 (2012) we showed that cross-validation is useful for certain low resolution refinements where additional restraints (DEN restraints in that case) are used to reduce overfitting and obtain a more accurate structure. Cross-validation made it possible to detect overfitting of the data when no DEN restraints were used. We believe this should also apply when other types of restraints are used (e.g., reference model restraints in phenix.refine, REFMAC, or BUSTER). In summary, we believe that cross-validation remains an important (and conceptually simple) method to detect overfitting and for overall structure validation. Axel

Axel T. Brunger
Professor and Chair, Department of Molecular and Cellular Physiology
Investigator, HHMI
Email: brun...@stanford.edu
Phone: 650-736-1031
Web: http://atbweb.stanford.edu

Paul

Paul Adams
Deputy Division Director, Physical Biosciences Division, Lawrence Berkeley Lab
Division Deputy for Biosciences, Advanced Light Source, Lawrence Berkeley Lab
Adjunct Professor, Department of Bioengineering, U.C. Berkeley
Vice President for Technology, the Joint BioEnergy Institute
Laboratory Research Manager, ENIGMA Science Focus Area
Tel: 1-510-486-4225, Fax: 1-510-486-5909
http://cci.lbl.gov/paul

On Jun 5, 2015, at 2:18 AM, Gerard Bricogne g...@globalphasing.com wrote: Dear Dusan, This is a nice paper and an interestingly different approach to avoiding bias and/or quantifying errors - and indeed there are all kinds of possibilities if you have a particular structure on which you are prepared to spend unlimited time and resources. The specific context in which Graeme's initial question led me to query instead "who should set the FreeR flags, at what stage and on what basis?"
was that of the data analysis linked to high-throughput fragment screening, in which speed is of the essence at every step. Creating FreeR flags afresh for each target-fragment complex dataset, without any reference to those used in the refinement of the apo structure, is by no means an irrecoverable error, but it will take extra computing time to let the refinement of the complex adjust to a new free set, starting from a model refined with the ignored one. It is in order to avoid the need for that extra time, or for recourse to various debiasing methods, that the book-keeping faff described yesterday has been introduced. Operating without it is perfectly feasible; it is just likely not to be optimally direct. I will probably bow out here, before someone asks "How many [e-mails from me] is too many?" :-) With best wishes, Gerard. -- On Fri, Jun 05, 2015 at 09:14:18AM +0200, dusan turk wrote: Graeme, one more suggestion. You can avoid all the recipes by using all data for the WORK set and 0 reflections for the TEST set, regardless of the amount of data, by using the FREE KICK ML target. For an explanation see our recent paper: Praznikar, J. & Turk, D. (2014) Free kick instead of cross-validation in maximum-likelihood refinement of macromolecular crystal structures. Acta Cryst. D70, 3124-3134. A link to the paper can be found at "http://www-bmb.ijs.si/doc/references.HTML". best, dusan On Jun 5, 2015, at
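To make Gerard's book-keeping concrete, here is a minimal sketch (not from the thread; all array and function names are illustrative assumptions) of how free-R flags might be drawn once for the apo data and then carried over to an isomorphous complex dataset, with only genuinely new reflections receiving fresh random flags:

    import numpy as np

    def assign_free_flags(n_reflections, fraction=0.05, seed=0):
        """Randomly flag ~fraction of reflections as the test (free) set."""
        rng = np.random.default_rng(seed)
        return rng.random(n_reflections) < fraction   # True = free reflection

    def transfer_flags(apo_hkl, apo_flags, complex_hkl, fraction=0.05, seed=1):
        """Reuse apo flags for shared (h, k, l) indices; flag unmatched
        new reflections at the same fraction."""
        lookup = {tuple(h): bool(f) for h, f in zip(apo_hkl, apo_flags)}
        rng = np.random.default_rng(seed)
        flags = np.empty(len(complex_hkl), dtype=bool)
        for i, h in enumerate(complex_hkl):
            key = tuple(h)
            if key in lookup:
                flags[i] = lookup[key]                # inherit the apo flag
            else:
                flags[i] = rng.random() < fraction    # genuinely new reflection
        return flags

Reusing the apo flags this way means a model refined against the apo work set stays unbiased with respect to the complex test set, which is exactly why discarding the old flags costs extra refinement cycles.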
Re: [ccp4bb] Cross-validation when test set is minuscule
Dear Derek, I suggest you try 10% for the test set. You should still be able to judge the effect of various restraints (or constraints) as long as you keep the same test set. If you switch test sets and re-refine, Rfree might change by as much as 2% for a test set consisting of 200 reflections - see Fig. 6 in ref. (A. T. Brunger, Free R value: Cross-validation in crystallography, Methods in Enzym. 277, 366-396, 1997). However, using the same test set may allow you to judge the best restraints protocol or weights. Axel PS: The Methods in Enzym. review also briefly discusses complete cross-validation. PPS: For refinement at very low resolution, see also: A.T. Brunger, P.D. Adams, P. Fromme, R. Fromme, M. Levitt, G.F. Schroder. Improving the accuracy of macromolecular structure refinement at 7 Å resolution. Structure 20, 957-966 (2012). On Dec 20, 2014, at 1:05 AM, CCP4BB automatic digest system lists...@jiscmail.ac.uk wrote: Date: Fri, 19 Dec 2014 11:18:37 +0000 From: Derek Logan derek.lo...@biochemistry.lu.se Subject: Cross-validation when test set is minuscule Hi everyone, Right now we have one of those very difficult Rfree situations where it's impossible to generate a single meaningful Rfree set. Since we're in a bit of a hurry with this structure it would be good if someone could point me in the right direction. We have crystals with 1542 non-H atoms in the asymmetric unit that diffract to only 3.6 Å in P65, which gives us a whopping 2300 reflections in total. 5% of this is only about 100 reflections. Luckily the protein is only a single point mutation of a wild type that has been solved to much better resolution, so we know what it should look like, and I simply want to investigate the effect of different levels of conservatism in the refinement, e.g. NCS in xyz and B, group B-factors, reference model, Ramachandran restraints etc. However, since the quality criterion for this is Rfree, I'm not able to do this. I believe the correct approach is k-fold statistical cross-validation, but can someone remind me of the correct way to do this? I've done a bit of Googling without finding anything very helpful. Thanks Derek

Derek Logan tel: +46 46 222 1443
Associate Professor mob: +46 76 8585 707
Dept. of Biochemistry and Structural Biology www.cmps.lu.se
Centre for Molecular Protein Science www.maxlab.lu.se/crystal
Lund University, Box 124, 221 00 Lund, Sweden www.saromics.com

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor and Chair, Dept. of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
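For Derek's situation, complete (k-fold) cross-validation as discussed in the Methods in Enzym. review can be sketched as follows; this is a minimal illustration, where refine_and_rfree() is a hypothetical stand-in for a full refinement run, not a real API:

    import numpy as np

    def complete_cross_validation(n_reflections, refine_and_rfree, k=10, seed=0):
        """Partition reflections into k disjoint test sets, refine k times
        (each run excluding one test set), and pool the k Rfree values."""
        rng = np.random.default_rng(seed)
        fold = rng.permutation(n_reflections) % k     # balanced fold labels 0..k-1
        rfree = []
        for i in range(k):
            test = (fold == i)                        # ~n/k reflections held out
            rfree.append(refine_and_rfree(work_mask=~test, test_mask=test))
        return float(np.mean(rfree)), float(np.std(rfree))

With ~2300 reflections and k = 10, each run still refines against ~2070 reflections, while every reflection contributes to the pooled Rfree exactly once; that pooling is what makes the statistic meaningful despite the tiny individual test sets.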
[ccp4bb] Free Reflections as Percent and not a Number
We just had a chance to read this most interesting discussion. We would agree with Ian that jiggling or SA refinement may not be needed if refinement can in fact be run to convergence. However, this will be difficult to achieve for large structures, especially when only moderate to low resolution data are available. So, jiggling or SA refinement should help in these cases. We strongly recommend testing refinement for a particular case with and without jiggling or simulated annealing to determine if the same Rfree value can be achieved by just running refinement to convergence. We also would like to draw attention to the importance of resetting or jiggling individual atomic B-factors when switching test sets or starting from a structure that has been refined against all diffraction data. Individual atomic B-factors also confer model bias, and convergence of individual B-factor refinement can sometimes be difficult. As regards Dusan's recent paper in Acta Cryst D, we would like to draw attention to our paper in Structure 20, 957-966 (2012). There we showed that cross-validation is useful for certain low resolution refinements where additional restraints (DEN restraints in that case) are used to prevent overfitting and effectively obtain a more accurate structure. All the refinements described in that paper were performed in torsion angle space only, producing perfect bond lengths and bond angles. Cross-validation made it possible to detect overfitting of the data when no DEN restraints were used. We believe this should also apply when other types of restraints are used (e.g., reference restraints in phenix.refine, REFMAC, or BUSTER). So, while the danger of overfitting diffraction data with an incorrect model may not be as great as it was 20 years ago (at high to moderate resolution) due to the availability of many important validation tools, the situation is very different at low resolution, where overfitting (even in torsion angle space) is still a very real possibility, and use of external data or restraints is essential. So, we believe that cross-validation remains an important (and conceptually simple) method to prevent overfitting and for overall structure validation. Best regards, Axel and Paul

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor and Chair, Dept. of Molecular and Cellular Physiology
Stanford University

Paul Adams
Deputy Division Director, Physical Biosciences Division, Lawrence Berkeley Lab
Division Deputy for Biosciences, Advanced Light Source, Lawrence Berkeley Lab
Adjunct Professor, Department of Bioengineering, U.C. Berkeley
Vice President for Technology, the Joint BioEnergy Institute
Laboratory Research Manager, ENIGMA Science Focus Area

On Nov 26, 2014, at 8:18 PM, dusan turk dusan.t...@ijs.si wrote: Hello guys, There is too much text in this discussion to respond to every part of it. Apart from "jiggle", in certain software like PHENIX, and I believe in X-PLOR derivatives, the word "shake" means the same. In the "MAIN" environment I use the word "kick" to randomly distort coordinates. Its first use, introduced in the early 90's, was to improve the convergence of model refinement and minimization. I have seen it as a substitute for molecular dynamics under real or reciprocal crystallographic restraints (we call this simulated annealing or slow cooling), as it is computationally much faster.
The procedure in MAIN is called "fast cooling" because the atoms move only under the potential energy terms, with no kinetic energy present. The "fast cooled" structure is thus frozen - taken from a high energy state to the one with the lowest potential energy reachable. In order to reach the lowest possible point in the potential energy landscape, the kick at the beginning of each cooling cycle is lowered. The initial kick size is typically 0.8 Å and drops each cycle down to 0. Experience shows that values beyond 0.8 Å may not lead to recovery of a chemically reasonable structure in every part. Towards the end of refinement the starting kick is typically reduced to 0.05 Å. Apart from coordinates, B-factors can also be kicked. Are the structures after "kick" cooling refinement the same as without the kick? My over two decades of experience show that kicking improves the convergence of refinement. The resulting structures can thus be different, as the repeated cooling cycles may shift them to a lower energy point. However, after the structure is refined (has converged), the different refinements will converge to approximately the same coordinates, as Ian described. I assume this is the numerical error of the different procedures. As to the use of different TEST sets, we came to a different conclusion (see below). As to the claim(s) that kicking/jiggling/shaking does or does not remove the model bias the
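Dusan's kick/fast-cooling protocol can be summarized in a few lines; this is a hedged sketch under stated assumptions (minimize() stands in for the refinement engine, and the linear decay of the kick size is an illustrative choice, not the MAIN implementation):

    import numpy as np

    def kick_and_cool(coords, minimize, n_cycles=10, kick0=0.8, seed=0):
        """Randomly displace every atom by a kick whose size decays toward
        zero over the cooling cycles; after each kick, minimize under the
        potential energy only (no kinetic energy term)."""
        rng = np.random.default_rng(seed)
        for cycle in range(n_cycles):
            kick = kick0 * (1.0 - cycle / n_cycles)          # 0.8 A down toward 0
            step = rng.normal(size=coords.shape)             # coords: (n_atoms, 3)
            step *= kick / np.linalg.norm(step, axis=1, keepdims=True)
            coords = minimize(coords + step)                 # "fast cooling"
        return coords

The decaying kick is what lets the model escape shallow local minima early on while still settling into a single well-defined minimum by the final cycle.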
Re: [ccp4bb] High Rwork/Rfree vs. Resolution
Chris, First, I would try NCS restraints even at ~2 Å. Second, are there any outliers in your diffraction data set that might skew the R values? Third, have you checked that your refined bulk solvent model is reasonable? Axel On Feb 21, 2014, at 6:13 PM, Chris Fage cdf...@gmail.com wrote: Thanks for the assistance, everyone. For those who suggested XDS: I forgot to mention that I have tried Mosflm, which is also better at spot fitting than HKL2000. How does XDS compare to Mosflm in this regard? I am not refining the high R-factor structure with NCS options. Also, my unit cell dimensions are 41.74 Å, 69.27 Å, and 83.56 Å, so there isn't one particularly long axis. I'm guessing the low completeness of the 1.65 Å dataset has to do with obstacles the processing software encountered on a sizable wedge of frames (there were swaths of red in HKL2000). I'm not sure why this dataset in particular was less complete than the others. Thanks, Chris On Fri, Feb 21, 2014 at 6:41 PM, Chris Fage cdf...@gmail.com wrote: Dear CCP4BB Users, I recently collected a number of datasets from plate-shaped crystals that diffracted to 1.9-2.0 Å and yielded very nice electron density maps. There is no major density unaccounted for by the model; however, I am unable to decrease Rwork and Rfree beyond ~0.25 and ~0.30, respectively. Probably due to the more 2-dimensional nature of my crystals, there is a range of phi angles in which the reflections are smeared, and I am wondering if the problem lies therein. I would be grateful if anyone could provide advice for improving my refinement statistics, as I was under the impression that the R-factors should be ~5% lower for the given resolution. A few more pieces of information:
- Space group = P21, with 2 monomers per asymmetric unit;
- Chi square = 1.0-1.5;
- Rmerge = 0.10-0.15;
- Data were processed in HKL2000 and refined in Refmac5 and/or phenix.refine;
- PHENIX Xtriage does not detect twinning, but hints at possible weak translational pseudosymmetry;
- I was previously able to grow one atypically thick crystal which diffracted to 1.65 Å with Rwork/Rfree at 0.18/0.22. Unfortunately, the completeness of that dataset was only ~90%.
Regards, Chris
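One simple version of the outlier check Axel suggests is to look for reflections whose normalized intensity E^2 = I/<I> (with <I> taken in thin resolution bins to divide out the Wilson falloff) is implausibly large; for acentric data, E^2 > 10 is expected for only a handful of reflections per hundred thousand. A hedged numpy sketch with illustrative array names, not a replacement for a proper data-reduction program:

    import numpy as np

    def flag_intensity_outliers(intensity, resolution, n_bins=20, e2_max=10.0):
        """Flag reflections with E^2 = I/<I>_bin above e2_max, using
        roughly equal-count resolution bins."""
        order = np.argsort(resolution)
        outliers = np.zeros(intensity.size, dtype=bool)
        for idx in np.array_split(order, n_bins):
            e2 = intensity[idx] / np.mean(intensity[idx])   # crude E^2 estimate
            outliers[idx] = e2 > e2_max
        return outliers

Programs such as Xtriage apply more careful statistics, but even this crude screen will catch the kind of rogue measurements that can noticeably skew R values in a small dataset.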
Re: [ccp4bb] Why nobody comments about the Nobel committee decision?
Their work has inspired many of us, including my own work on crystallographic refinement. It is so wonderful that Michael, Arieh, and Martin have received this well-deserved recognition. Congratulations!!! Axel

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor and Chair, Dept. of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
Fax: +1 650-745-1463

On Oct 10, 2013, at 2:20 AM, Eleanor Dodson eleanor.dod...@york.ac.uk wrote: This is amazing - I am so glad the Nobel committee has recognised this ground breaking, rather unglamorous work requiring great intelligence, very hard work, and a lot of disappointments! Congratulations to them all, and to the whole field. Eleanor Dodson On 10 October 2013 09:26, Alexandre OURJOUMTSEV sa...@igbmc.fr wrote: Hello to everybody, Alex, it was a great idea to initiate the conversation sending congratulations to our colleagues! Bob, it was another great idea, when congratulating the winners, to remind us of the framework. As one of my colleagues pointed out, we should also give a lot of credit to Shneior Lifson, who was at the very origin of these works, ideas and programs (see the paper by M. Levitt, The birth of computational structural biology, Nature Structural Biology, 8, 392-393 (2001); http://www.nature.com/nsmb/journal/v8/n5/full/nsb0501_392.html). Older crystallographers may remember a fundamental paper by Levitt & Lifson (1969). With best wishes, Sacha Urzhumtsev -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On behalf of Sweet, Robert Sent: Wednesday, 9 October 2013 23:52 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Re: [ccp4bb] Why nobody comments about the Nobel committee decision? It deserves comment!! I've been too busy talking with my friends about it to think of CCP4. This morning on NPR I heard Karplus's name and started to whoop and holler, and by the time they got to Arieh I realized they had a hat trick!! It's a spectacular thing that this field should get recognition! An interesting feature to me is that, at least when I was following the field, these three use physics to do their work, modeling with carefully estimated spring constants, etc., and eventually QM results. Those who use phenomenology -- hydrophobic volumes, who likes to lie next to whom, etc. -- are extremely effective (you know who they are), and they deserve credit. But they (we, some years ago) stand on the shoulders of the achievements of these three. It's good to remember the late, great Tony Jack, cut down before reaching his prime. Bob From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Nat Echols [nathaniel.ech...@gmail.com] Sent: Wednesday, October 09, 2013 5:31 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Re: [ccp4bb] Why nobody comments about the Nobel committee decision? Levitt also contributed to DEN refinement (Schroder et al. 2007, 2010). -Nat On Wed, Oct 9, 2013 at 2:29 PM, Boaz Shaanan bshaa...@bgu.ac.il wrote: Good point. Now since you mentioned contributions of the recent Nobel laureates to crystallography, Mike Levitt also had a significant contribution through the by now forgotten Jack-Levitt refinement, which to the best of my knowledge was the first time that an X-ray term was added to the energy minimization algorithm. I think I'm right about this. This was later adapted by Axel Brunger in X-PLOR, and other programs followed.
Cheers, Boaz Original message From: Alexander Aleshin aales...@sanfordburnham.org Date: 10/10/2013 0:07 (GMT+02:00) To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Why nobody comments about the Nobel committee decision? Sorry for a provocative question, but I am surprised that nobody comments on or congratulates the laureates with regard to the recently awarded Nobel prizes. After all, one of the laureates in chemistry contributed to a popular method in computational crystallography: CHARMM - XPLOR - CNS - PHENIX - ... Alex Aleshin
Re: [ccp4bb] Why nobody comments about the Nobel committee decision?
Dear John, I surely hope that the recent Nobel Prize will encourage young people to get into the fields of computational biology and chemistry. Moreover, X-ray sources are undergoing exciting new developments (e.g., XFELs) that require new computational approaches, as does cryo-EM. Cheers, Axel On Oct 10, 2013, at 11:05 AM, Jrh jrhelliw...@gmail.com wrote: Dear Sacha, Dear Colleagues, I also offer my congratulations to the Chemistry Nobellists of yesterday. A very exciting and significant event, which I enjoyed. I recall that when my PhD student, Gail Bradbrook, spoke to crystallographers about our harnessing these exciting methods in our crystallographic and structural chemistry concanavalin A saccharide studies, there was a wide spread of reactions, i.e. from scepticism to shared excitement. As an example of Gail's work see e.g. http://pubs.rsc.org/en/content/articlelanding/1998/ft/a800429c/unauth#!divAbstract It is sometimes said that a Nobel Prize kills a field. I think we can say instead that it is mature. But, to couple with the discussion on peer review: there are weaknesses in conventional, i.e. the usual, peer review; it does not cope well with 'risk and adventure' results. Post-publication peer review is an interesting solution, which in my view should be tried. This bulletin board itself is in fact a great initiative, an institution actually, which helps develop community views of results and trends. Just my two pennies worth, Greetings, John Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol. Chair School of Chemistry, University of Manchester, Athena Swan Team. http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html
[ccp4bb] postdoctoral positions available for experimental and computational methods for XFEL-based crystallography
Postdoctoral Positions Available for Experimental and Computational Methods Development for XFEL-based Crystallography. Faculty and staff of the SLAC National Accelerator Laboratory and Stanford University are leading an effort to develop methods for structure determination of challenging biological systems with the LCLS X-ray free electron laser. The effort includes collaborations with Lawrence Berkeley National Laboratory, Univ. of California at Berkeley, Univ. of California at Los Angeles, and the California Institute of Technology. Particular emphasis is on small crystal characterization, novel sample delivery systems, data processing, experimental phasing, and refinement at low resolution. Extensive experience in computational or experimental crystallography is desirable. Candidates should submit a resume and a list of three references to Axel Brunger, Chair of the Bioimaging Working Group at SLAC, brun...@stanford.edu.

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
Fax: +1 650-745-1463
Re: [ccp4bb] First images of proteins and viruses caught with an X-ray laser
I performed DEN refinement with CNS using the same data and starting model, and obtained similar twinned R values and maps. The twin fraction is 0.5. Axel On Feb 9, 2011, at 3:29 PM, Jon Schuermann wrote: According to the paper, the data were refined in REFMAC in 'twin mode', which, I believe, calculates the R-factor using a non-conventional R-factor equation that is usually lower than the conventional R-factor. I believe this is dependent on the twin fraction, which wasn't mentioned in the paper (or supplementary info) unless I missed it. Jon

-- Jonathan P. Schuermann, Ph.D.
Beamline Scientist
NE-CAT, Building 436E
Advanced Photon Source (APS)
Argonne National Laboratory
9700 South Cass Avenue, Argonne, IL 60439
email: schue...@anl.gov
Tel: (630) 252-0682 Fax: (630) 252-0687

On 02/09/2011 05:11 PM, James Holton wrote: This was molecular replacement from 1jb0, so the phases came from the model. Probably more properly called direct refinement, since all we did was a few cycles of rigid body. Personally, I was quite impressed by how good the R factors were, all things considered. -James Holton MAD Scientist On Wed, Feb 9, 2011 at 2:56 PM, Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com wrote: Any idea where the phases came from? BR -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Thomas Juettemann Sent: Wednesday, February 09, 2011 12:16 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] First images of proteins and viruses caught with an X-ray laser Thank you for clarifying this, James. Those details are indeed often lost/misinterpreted when the paper is discussed in journal club, so your comment was especially helpful. Best wishes, Thomas On Wed, Feb 9, 2011 at 20:38, James Holton jmhol...@lbl.gov wrote: As one of the people involved (I'm author #74 out of 88 on PMID 21293373), I can tell you that about half of the three million snapshots were blank, but we wanted to be honest about the number that were collected, as well as the minimum number that were needed to get a useful data set. The blank images were on purpose, since the nanocrystals were diluted so that there would be relatively few double-hits. As many of you know, multiple lattices crash autoindexing algorithms! Whether or not a blank image or a failed autoindexing run qualifies as conforming to our existing model I suppose is a matter of semantics. But yes, some details do get lost between the actual work and the press release! In case anyone wants to look at the data, it has been deposited in the PDB under 3PCQ, and the detailed processing methods published under PMID 20389587. -James Holton MAD Scientist On 2/9/2011 10:38 AM, Thomas Juettemann wrote: http://www.nanowerk.com/news/newsid=20045.php http://home.slac.stanford.edu/pressreleases/2011/20110202.htm I think it is pretty exciting, although they only take the few datasets that conform to their existing model: The team combined 10,000 of the three million snapshots they took to come up with a good match for the known molecular structure of Photosystem I.
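For readers following the twin-mode R-factor point: in the standard treatment of a merohedral twin (a textbook form, not necessarily the exact expression REFMAC implements), each observed intensity is modeled as a weighted sum over the two twin domains,

    R_{twin} = \frac{\sum_h \left| I_{obs}(h) - \left[ \alpha\, I_{calc}(h) + (1-\alpha)\, I_{calc}(Th) \right] \right|}{\sum_h I_{obs}(h)}

where alpha is the twin fraction and T the twin operator. Because each observation is fitted by a sum of two calculated intensities, the twinned R factor tends to come out lower than a conventional R, which is Jon's point; at alpha = 0.5 (a perfect twin, as here) the effect is largest.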
Re: [ccp4bb] FW: [ccp4bb] Resolution and distance accuracies
We defined super-resolution in our DEN paper as achieving coordinate accuracy better than the resolution limit d_min of the diffraction data. We proposed this definition in analogy to its widespread use in optical microscopy: super-resolution methods such as STORM, PALM, and STED achieve accuracy of the positions of fluorescent labels significantly better than the diffraction limit (in some cases, sub-nanometer accuracy - Pertsinidis, Zhang, Chu, Nature 466, 647-651, 2010). We found DEN to be useful for moving some atoms into correct positions in cases where electron density maps are difficult or impossible to interpret at low resolution. By default, DEN is active during the first torsion angle molecular dynamics stages, but is then turned off during the last two stages. In addition, the DEN network is deformable. Thus, DEN is very different from secondary structure restraints or point restraints to reference models, which are on all the time. Rather, DEN steers or guides the torsion angle conformational search process during refinement. Cheers, Axel On Dec 24, 2010, at 2:14 PM, Bernhard Rupp (Hofkristallrat a.D.) wrote: I find the super-resolution claims in this paper a bit of a conjuring trick. I think it is understood that information cannot come from nothing. You cannot cheat in basic physics. Interestingly, I had the same discussion with bioinformatics colleagues a short time ago. The problem is the same and seems of a semantic nature. They are using prior information of some sort (undisclosed) to successfully improve maps, and they suggested calling this 'resolution increase'. I had the same objection and said that in crystallography resolution is a relatively hard term, defined by the degree to which experimental observations are available, and as crystallographers we won't like that claim at all. On the other side it is uncontested that as long as the model fits (cross-validation) data better when prior information is used, something useful has been achieved - again with all the caveats of weights and bias etc. admitted. However, how to entice non-experts to actually use new methods is another thing, and here the semantics come in. In essence, if at the end it results in better structures, how much of the unfortunately but undeniably necessary salesmanship is just right or acceptable? Within contemporary social constraints (aka Zeitgeist) that remains pretty much an infinitely debatable matter. Merry Christmas, BR -- Dear Bernhard, I must say that I find the super-resolution claims in this paper a bit of a conjuring trick. If the final refined model has greater accuracy than one would expect from the resolution of the data it has been refined against, it is because that extra accuracy has been lifted from the higher resolution data that were used to refine the structure on the basis of which the elastic network restraints were created. Should we then say that we achieve super-resolution whenever we refine a macromolecular structure using Engh & Huber restraints, because these enable us to achieve distance accuracies comparable with those in the small molecule structures in the Cambridge Structural Database? Perhaps I have missed an essential point of this paper. With best wishes, Gerard.

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
Fax: +1 650-745-1463
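A schematic of what "deformable" means here may help; the following is a hedged sketch loosely after the DEN papers (Schroder et al. 2007, 2010), with simplified parameter names - it is not the CNS implementation:

    import numpy as np

    def den_step(d_model, d_eq, d_ref, k=1.0, gamma=0.1, kappa=0.5):
        """One DEN update: harmonic restraints on selected atom-pair
        distances whose equilibrium values d_eq drift between the reference
        model (d_ref) and the current model (d_model), so the network can
        deform instead of pinning the structure to the reference."""
        energy = 0.5 * k * np.sum((d_model - d_eq) ** 2)     # network energy term
        target = kappa * d_model + (1.0 - kappa) * d_ref     # blend current/reference
        d_eq = (1.0 - gamma) * d_eq + gamma * target         # deform the equilibria
        return energy, d_eq

Because d_eq keeps adapting, the network guides the torsion-angle search early in refinement without forcing the final model back onto the reference structure, which is the distinction Axel draws from always-on reference restraints.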
Re: [ccp4bb] FW: [ccp4bb] Resolution and distance accuracies
Dear Gerard, Actually, for some of the tests we turned off the DEN network restraints for the last two refinement macrocycles, so, at least for these particular cases, the DEN method truly found a better minimum rather than forcing the system toward the higher resolution structure. Cheers, Axel On Dec 23, 2010, at 3:04 PM, Gerard Bricogne wrote: Dear Bernhard, I must say that I find the super-resolution claims in this paper a bit of a conjuring trick. If the final refined model has greater accuracy than one would expect from the resolution of the data it has been refined against, it is because that extra accuracy has been lifted from the higher resolution data that were used to refine the structure on the basis of which the elastic network restraints were created. Should we then say that we achieve super-resolution whenever we refine a macromolecular structure using Engh & Huber restraints, because these enable us to achieve distance accuracies comparable with those in the small molecule structures in the Cambridge Structural Database? Perhaps I have missed an essential point of this paper. With best wishes, Gerard. -- On Thu, Dec 23, 2010 at 12:25:26PM -0800, Bernhard Rupp (Hofkristallrat a.D.) wrote: Oops, I am outdated: Axel just emailed me that he describes an improved coordinate estimate beyond the Rayleigh criterion in his recent paper: Schroder GF, Levitt M, Brunger AT (2010) Super-resolution biomolecular crystallography with low-resolution data. Nature 464(7292), 1218-1222. For the deformable elastic network (DEN) refinement, see his ref. 14: Schroder GF, Brunger AT, Levitt M (2007) Combining efficient conformational sampling with a deformable elastic network model facilitates structure refinement at low resolution. Structure 15(12), 1630-1641. BR

--
Gerard Bricogne                        g...@globalphasing.com
Global Phasing Ltd.
Sheraton House, Castle Park            Tel: +44-(0)1223-353033
Cambridge CB3 0AX, UK                  Fax: +44-(0)1223-366889
[ccp4bb] Announcing CNS, version 1.3
==========================================================================
Announcing version 1.3 (general release) of the software:
Crystallography & NMR System (CNS)
(copyright 1997-2010, Yale University)
==========================================================================

Information about the software and instructions for downloading the most recent version are available at: ** http://cns-online.org **

The software is available for download, free of charge, by all academic (non-profit) users.

-- Installation instructions and documentation can be found in $CNS_SOLVE/doc/html/cns_solve.html once you have downloaded and installed the software.

-- Do not distribute CNS to third parties without approval. By downloading the software you agree to the License in the FTP directory.

-- While this new version is compatible with version 1.2 input files, it is highly recommended that you use the new input files and modules in conjunction with this new version of CNS. The http://cns-online.org website now points to the new 1.3 version by default.

-- Attached release notes for CNS version 1.3:

===================================================
Crystallography & NMR System
A.T. Brunger, P.D. Adams, G.M. Clore, W.L. DeLano,
P. Gros, R.W. Grosse-Kunstleve, J.-S. Jiang, J.M. Krahn,
J. Kuszewski, M. Nilges, N.S. Pannu, R.J. Read,
L.M. Rice, G.F. Schroeder, T. Simonson, G.L. Warren
Copyright (c) 1997-2010 Yale University
===================================================

Program: CNS
Version: 1.3
Patch level: 0
Status: general release

Changes for version 1.3 - summary of major changes and new features:

- DEN method for low resolution refinement. See the new tutorial "Structure refinement at low resolution (below ~3.5 Å resolution)": http://cns-online.org/v1.3/tutorial/refinement_low_resolution/den_refinement/text.html (Reference: G.F. Schroeder, M. Levitt, and A.T. Brunger, Super-resolution biomolecular crystallography with low-resolution data, Nature 464, 1218-1222, 2010.)

- Simulations of single molecule FRET-derived distances and docking calculations with FRET-derived distances. See the new tutorial "Single molecule FRET": http://cns-online.org/v1.3/tutorial/fret/docking_calculations/text.html http://cns-online.org/v1.3/tutorial/fret/dye_simulations/text.html http://cns-online.org/v1.3/tutorial/fret/fret_distribution_calculations/text.html (References: M. Vrljic, P. Strop, J.A. Ernst, R.B. Sutton, S. Chu, A.T. Brunger, Molecular mechanism of the synaptotagmin-SNARE interaction in Ca2+-triggered vesicle fusion, Nature Structural and Molecular Biology 17, 325-331, 2010; U.B. Choi, P. Strop, M. Vrljic, S. Chu, A.T. Brunger, K.R. Weninger, Single-molecule FRET-derived model of the synaptotagmin 1-SNARE fusion complex, Nature Structural and Molecular Biology 17, 318-324, 2010.)

- Automatic generation (on-the-fly) of molecular topology (mtf) files (see the tutorial on generating PDB and MTF files). Note that this new feature has required that wildcards are no longer allowed in the topology link files. Also, all residues need to be defined in the topology files.

- Greatly expanded, general refinement script file (inputs/xtal_refine/refine.inp and inputs/xtal_twin/refine_twin.inp) for most refinement tasks, including positional (xyz) refinement, B-factor refinement, simulated annealing, and DEN refinement. The new script also writes 2Fo-Fc and Fo-Fc electron density maps in CNS/X-PLOR format, as well as a coefficient file (.hkl) that can be read directly by Coot, version 0.6.1 or later (or by the CNS script fourier_map.inp for B-factor sharpening).

- New all-hydrogen topology and parameter files (protein-allhdg5-4* and dna-rna-allatom-hj-opls.*).
These files are used for NMR structure determination, but they can also be used for X-ray structure refinement to include all hydrogen atoms in the refinement (see the new tutorials on this topic).
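A side note on the B-factor sharpening mentioned for fourier_map.inp: sharpening conventionally applies a negative-B Debye-Waller factor to the map coefficients. This is the standard textbook formula, not a statement about the exact CNS implementation:

    F_{sharp}(h) = \exp\!\left(-B_{sharp}\,\frac{\sin^2\theta}{\lambda^2}\right) F(h), \qquad B_{sharp} < 0

so high-resolution terms are up-weighted relative to an unsharpened map, at the cost of amplifying noise.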
Re: [ccp4bb] Compilation of CNS 1.21 on Mac OSX 10.6.3
Please note that the latest version of Xcode in Mac OS X 10.6.3 breaks the Intel Fortran and C compilers. It is possible that Xcode also breaks gfortran, but I haven't tested it. I've reverted to the Xcode version that came with the original Snow Leopard release. Intel is aware of the problem, but so far there have been no updates of the Intel compilers to fix the problem with Xcode. Axel Brunger On Jun 9, 2010, at 11:35 AM, Ethan Merritt wrote: On Wednesday 09 June 2010 10:45:07 am James Holton wrote: I have often wondered how it is that one can actually run and play games like Pac-Man(R) on a modern PC using the actual bit-for-bit contents of the EPROM cartridges that I used to put into my Atari 2600 (circa 1982), but for some reason programs written just a few years ago will neither compile nor run on the latest and greatest linux/gcc systems. I seriously doubt that your Atari PacMan program was written in Fortran! But more to the point, to run it now you run an emulator for the original Atari chipset. PacMan runs around thinking he really is on an Atari console, blissfully unaware that the console is only an emulated simulation of the original long-gone world. Welcome to the matrix. Am I missing something? I think you are missing the mark twice in mentioning linux/gcc. The complaint under discussion is with OSX, not a linux system. With rare exceptions, old linux programs run just fine on an up-to-date linux system, so long as you also install equally old versions of the support libraries. Do you have a counter-example in mind? In the specific case of old Fortran programs, the reality is that in the era of commercial Fortran compilers there was great divergence in the details of the implementation, particularly with regard to I/O commands. gcc/f77/g77 was never a good Fortran compiler, and was particularly bad at compiling code idioms used by real-world Fortran code written for compilers supported by IBM, DEC, CDC, etc. gfortran is somewhat better, but still far from perfect. As to Pryank's problem: Pryank Patel wrote: Hi all, I've posted this in the hope that somebody in the CCP4 community may have come across this problem and can shed some light. I've posted this question on other lists (cnsbb, ccpnmr and aria - the reason will become clear), but with no success so far. I have recently acquired a MacBook Pro running OSX 10.6.3 (kernel version 10.3.0) and am unable to compile CNS v1.21 from source, using either the gcc 4.2.1/4.4/4.5 compilers (4.4 and 4.5 installed using fink) or the Intel 11.1 (evaluation) compilers. I may be mis-remembering, but I have it in mind that the CNS source code requires selecting or porting a set of compiler-specific routines in one of the source modules. These are work-arounds for the variability in Fortran implementations mentioned above. Did you tweak this file appropriately for each of the compilers you tried? As a practical matter, you might want to look into running a VMware layer on your machine so that you can compile and run linux executables rather than fighting the native OSX environment. You too can join PacMan in the happy world recreated in the matrix :-) Ethan I am aware that there are Mac OSX binaries available, but I am also using CNS for NMR structure calculation with the Aria 2.3 program, and to run that successfully CNS needs to be re-compiled with Aria-specific source code. With the gcc 4.5 compilers, CNS compiles and links with no warnings or errors, but fails at the execution stage.
When I try to execute cns, either with './cns' or by running one of the test scripts, I get the following:

    dmemory error code = **
    %ALLHP error encountered: fatal coding error
    (CNS is in mode: SET ABORT=NORMal END)
    * ABORT mode will terminate program execution.
    * Program will stop immediately.
    Maximum dynamic memory allocation: 0 bytes
    Maximum dynamic memory overhead: 8 bytes
    Program started at: on
    Program stopped at: 14:32:05 on 07-Jun-2010
    CPU time used: 0.0036 seconds

With 4.2.1 (using gfortran), CNS fails at the linking stage with 'Undefined symbols:' errors. With 4.4, CNS compiles successfully, but when executed produces a simple segmentation fault message. With the 11.1 Intel compilers, CNS compiles successfully, but fails on execution:

    forrtl: severe (174): SIGSEGV, segmentation fault occurred
    Image    PC              Routine       Line    Source
    cns      00010029C7BE    _xtarmoin_    1813    xdeclare.f
    cns
Re: [ccp4bb] Compilation of CNS 1.21 on Mac OSX 10.6.3 - Solved
A PS on the Intel compiler / Xcode issue: the first work-around (-use-asm) does not work in conjunction with OpenMP enabled. Thus, I recommend reverting to Xcode 3.2.1 (and then reinstalling the compilers) until the issue has been fixed by Intel. Axel On Jun 9, 2010, at 5:19 PM, Patel, Pryank wrote: Hi all, Thanks to everybody who has contributed over the past couple of days on the various bulletin boards I have posted to. As is always the case, the solution was quite simple but completely passed me by. I'm going to use Mac inexperience as an excuse here... :-) So the original problem was not being able to compile a working executable of CNS v1.21 from source on Mac OSX 10.6.3. The reason for compiling from source is that in order to run CNS from Aria 2.3, a program used in NMR automated peak assignment and structure calculation, CNS needs to be recompiled with Aria-specific source code. With the fink-installed gfortran/gcc 4.5 compilers, CNS compiles and links with no warnings or errors, but fails at the execution stage. With fink-installed 4.4, CNS compiles successfully but when executed produces a simple segmentation fault message. The problem here is the fink-installed compilers. Harry Powell and Daniel O'Donovan have guided me to the light - the High Performance Computing for Mac OS X webpage (http://hpc.sourceforge.net) has binaries for the gcc 4.5 compiler package, which installs in /usr/local/. Compilation with this compiler set produces a working CNS executable. The only other modification is to copy the Makefile.header.2.gfortran file from cns_solve_1.21/instlib/machine/supported/intel-x86_64bit-linux to the corresponding directory for the Mac build. Compile with 'make install compiler=gfortran', making sure that the /usr/local path is in $PATH. With the Intel 11.1 compilers (and I used the evaluation version here), CNS compiles successfully, but fails on execution. Thanks to Axel Brunger and Benjamin Bardiaux for sharing that the Intel Fortran compiler is not compatible with Xcode 3.2.2, although it does not seem to break gfortran in this case. A little more information can be found here: http://software.intel.com/en-us/articles/intel-fortran-for-mac-os-x-incompatible-with-xcode-322/ Thanks go to Benjamin Bardiaux for the link. The webpage suggests two fixes. One is to add the '-use-asm' option to both the compilation and linker lines. This seems to work, and produces a working CNS executable. Not until now did I consider Xcode to be the problem, and at no point over the past few days did I come across the Intel webpage linked above during my hours of google-trawling. The other option is to reinstall Xcode 3.2.1, which I assume is the version Axel Brunger said he reinstalled. This can be downloaded from the Mac Development Centre. Then reinstall the Intel Fortran compiler. I have not tried this option, since the first option seems to have worked quite well. Thanks once again, Pryank On 9 Jun 2010, at 19:54, Axel Brunger wrote: Please note that the latest version of Xcode in Mac OS X 10.6.3 breaks the Intel Fortran and C compilers. It is possible that Xcode also breaks gfortran, but I haven't tested it. I've reverted to the Xcode version that came with the original Snow Leopard release. Intel is aware of the problem, but so far there have been no updates of the Intel compilers to fix the problem with Xcode.
Axel Brunger
Re: [ccp4bb] CNS: Composite omit map
I suggest you explicitly define the heterocompound as a rigid body for torsion dynamics. There are sometimes situations involving closed ring systems that cause this topology generation error. Axel Brunger On Mar 20, 2010, at 3:46 AM, Suman Tapryal wrote: Hi all, I am trying to calculate a composite omit map using CNS. My structure has a bound molecule of Tris. I have generated CNS topology and parameter files using the PRODRG server. The files seem to work fine when I am doing minimizations and other refinement operations. However, I am getting the following errors while calculating the composite omit map:

    ERROR: There are no suitable base groups. This problem can be caused
    by isolated bonding networks with undefined or weak dihedral force
    constants. The atoms that cannot be placed in a tree are listed below:
    %atoms T -1 -TRS -O1
    %atoms T -1 -TRS -C1
    %atoms T -1 -TRS -C
    %atoms T -1 -TRS -N
    %atoms T -1 -TRS -C3
    %atoms T -1 -TRS -O3
    %atoms T -1 -TRS -C2
    %atoms T -1 -TRS -O2
    %TORSION:TOPOLOGY error encountered: Fatal Topology Error
    (CNS is in mode: SET ABORT=NORMal END)

Am I missing something? Regards, Suman

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
Fax: +1 650-745-1463
[ccp4bb] Memorial Service for Warren DeLano
Begin forwarded message: From: Charles Wolfus cwol...@gmail.com Subject: Memorial Service for Warren In case you have not already heard, the DeLano family is having a memorial service for Warren:

Sunday, February 7th @ 10am
The Lucie Stern Center
1305 Middlefield Road
Palo Alto, 94301

Please join us in remembering Warren, his special gifts, and accomplishments. Also, please freely forward this message to any others who may want to join us. Best Regards, Friends and Family of Warren DeLano

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
Fax: +1 650-745-1463
Re: [ccp4bb] tr: Warren L Delano Memorial Award
Dear Fred, Thank you for posting it to the newsgroup. The DeLano family has set up a PayPal account for depositing donations into the fund. See http://www.jmdelano.com/. Best regards, Axel On Nov 14, 2009, at 9:10 PM, Frederic VELLIEUX wrote: Dear all, I have received this from Axel. I did not see the [ccp4bb] tag on it, so I pass it on. I think that this is important, since Warren has done so much for us all. I'll see how I can contribute (I hope it is easily done from abroad). Fred. Message of 15/11/09 01:09 From: Axel Brunger To: undisclosed-recipients Subject: Warren L DeLano Memorial Award Dear friends and colleagues: It's now been over a week since Warren passed away. We are trying to move toward a permanent way to honor Warren's memory and what he stood for: Open Source computational biosciences and molecular visualization. To do this, Jim Wells and I put together a mission statement with the approval of Warren's family: The Warren L. DeLano Memorial Award for Computational Biosciences. This award shall be given to a top computational bioscientist in recognition of the contributions made by Warren L. DeLano to creating powerful visualization tools for three-dimensional structures and making them freely accessible. The award, accompanying lecture, and honorarium will be given annually in the context of a national bioscience meeting or a Bay Area gathering of computational bioscientists at Stanford, UCSF or UC Berkeley. For the award, special emphasis will be given to Open Source developments and service to the bioscience community. The award selection committee, consisting of experts in the computational and biological sciences, will accept nominations from anyone. To make something like this happen in perpetuity would take about 100K for the endowment. For donations, Warren's family has set up a tax-deductible fund:

Silicon Valley Community Foundation
memo: Warren L. DeLano Memorial Fund
2440 West El Camino Real, Suite 300
Mountain View, CA 94040
tel: 650.450.5400

We hope that you'll consider making a contribution (no matter how small) in Warren's honor. Also, please forward this message to anybody who might be willing to contribute. Best regards, Axel

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
Fax: +1 650-745-1463
Re: [ccp4bb] Warren DeLano
Dear Linda, NSMB asked me and Jim Wells to write an obituary for Warren. With your permission we would like to include a quote from your CCP4 posting (or a reference to it, depending on what NSMB decides to do). Warren's words summarize so well how we all remember him. I hope you are doing well. All the best, Axel On Nov 5, 2009, at 11:24 AM, Linda Brinen wrote: I - like so many others - am shocked, saddened and shaken by this news. Warren's passing is a great loss to his friends, family and to our scientific community. Nearly exactly one year ago, when an MRI found that I had a brain tumor, Warren wrote me an e-mail, part of which I will share here, because it sums up part of him and his approach to life: "...I am so sorry to read your startling news. Not a one of us is excused from life-altering biology and random accidents, any of which can strike suddenly without warning. For that reason, we must never take anything for granted. Not a single day. Not a single friend. But as you well know, there are only two things we can do in defiance of chance, whether in sickness or in health: 1. Do everything you feel is important in life, today, or as soon as possible. 2. Never give up. Ignore the odds. Always believe you will survive and thrive. ...I am personally counting on you to get through this just fine and be back in action." Warren will be remembered well... and I wish for him to be at peace. -Linda

--
Linda S. Brinen, PhD
Adjunct Assistant Professor
Dept of Cellular & Molecular Pharmacology and
The Sandler Center for Basic Research in Parasitic Diseases
Phone: 415-514-3426 FAX: 415-502-8193
E-mail: bri...@cmp.ucsf.edu
QB3/Byers Hall 508C, 1700 4th Street
University of California, San Francisco, CA 94158-2550
USPS: UCSF MC 2550, Byers Hall Room 508, 1700 4th Street, San Francisco, CA 94158

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
Fax: +1 650-745-1463
[ccp4bb] Warren DeLano
Dear CCP4 Community: I write today with very sad news about Dr. Warren Lyford DeLano. I was informed by his family today that Warren suddenly passed away at home on Tuesday morning, November 3rd. While at Yale, Warren made countless contributions to the computational tools and methods developed in my laboratory (the X-PLOR and CNS programs), including the direct rotation function, the first prediction of helical coiled-coil structures, and the scripting and parsing tools that made CNS a universal computational crystallography program. He then joined Dr. Jim Wells' laboratory at UCSF and Genentech, where he pursued a Ph.D. in biophysics, discovering some of the principles that govern protein-protein interactions. Warren then made a fundamental contribution to the biological sciences by creating the Open Source molecular graphics program PyMOL, which is widely used throughout the world. Nearly all publications that display macromolecular structures use PyMOL. Warren was a strong advocate of freely available software and the Open Source movement. Warren's family is planning to announce a memorial service, but arrangements have not yet been made. I will send more information as I receive it. Please join me in extending our condolences to Warren's family. Sincerely yours, Axel Brunger

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031
Fax: +1 650-745-1463
Re: [ccp4bb] unstable refinement
Dear Ian, I totally agree with your observations and recommendations. If one is concerned about instability of the optimizer (minimization and/or simulated annealing), I suggest also monitoring the value of the total energy function (X-ray maximum likelihood term plus all restraints). Another source of slight variations in R values is recalculation of the bulk solvent mask and model parameters when the model has moved significantly between solvent mask updates. Axel On Feb 16, 2009, at 6:21 AM, Ian Tickle wrote: Dear George, I would still maintain that values of Rfree where the refinement had not attained convergence are totally uninformative, so I would say you made the right call! During a refinement run, Rfree is often observed to fall initially and then increase towards the end, though usually not significantly. One cannot deduce anything from this behaviour, and indeed it is not at all surprising: since Rfree is not the target function of the optimisation (or even correlated with it), there's no reason why it should do anything in particular. Exactly the same applies to Rwork: because it's a completely different function from the target function (it contains no weighting information, for one thing), there's absolutely no reason why Rwork should be at a minimum at convergence (even in the case of unrestrained refinement, and even though it surely is correlated with the target function). If that were true we would be able to use Rwork as the target function! The test for overfitting can only be done if you have at least 2 refinement runs done with different protocols (e.g. number of waters added) to compare: the one with the higher Rfree (or lower free likelihood) at convergence is overfitted. Note that this is a relative test: you can never be sure that a particular model is not overfitted. It's always possible for someone to come along in the future using a different parameter set (or different weighting) and produce a lower Rfree than you did (using the same data of course), making your model overfitted after the fact! Cheers -- Ian -Original Message- From: George M. Sheldrick [mailto:gshe...@shelx.uni-ac.gwdg.de] Sent: 16 February 2009 11:24 To: Ian Tickle Cc: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] unstable refinement Dear Ian, That was in fact one of my reasons for only calculating the free R at the end of a SHELXL refinement run (the other reason, now less important, was to save some CPU time). I have to add that I am no longer completely convinced that I made the right decision all those years ago. A stable refinement in which R decreases but Rfree goes through a minimum and then starts to rise might be a useful indication of overfitting?! Best wishes, George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry, University of Goettingen,
Tammannstr. 4, D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582

On Mon, 16 Feb 2009, Ian Tickle wrote: Clemens, I know we've had this discussion several times before, but I'd like to take you up on the point you made that reducing Rfree-R is necessarily always a 'good thing'. Suppose the refinement had started from a point where Rfree was biased, e.g. the test set in use had previously been part of the working set, so that Rfree-R was too small. In that case one would hope and indeed expect that Rfree-R would increase on further refinement now excluding the test set.
Shouldn't the criterion be that Rfree-R should attain its expected value ⟨Rfree-R⟩ (dependent of course on the observation/parameter ratio and the weighting parameters), so a high value of |(Rfree-R) - ⟨Rfree-R⟩| is bad, i.e. any significant deviations of (Rfree-R) from its expectation are bad? I would go further than that and say that anyway Rfree is meaningless unless the refinement has converged, i.e. reached its maximum (local or global) total likelihood (i.e. data+restraints). So one simply cannot compare the Rfree (or Rfree-R) values at the beginning and end of a run. The purpose of Rfree (or better, free likelihood) is surely to compare the *results* of *different* runs where convergence has been attained and where the *refinement protocol* (i.e. selection of parameters to vary and weighting parameters) has been varied, and then to choose as the optimal protocol (and therefore optimal result) the one that gave the lowest Rfree (or highest free likelihood). Rfree-R is then used as a subsidiary test to verify that it has attained its expected value; if not, then something is wrong, i.e. either the refinement didn't converge (Rfree-R lower than ⟨Rfree-R⟩) or there are non-random errors (Rfree-R higher than ⟨Rfree-R⟩), or a combination of factors. Cheers -- Ian -Original Message- From: owner-ccp...@jiscmail.ac.uk [mailto:owner-ccp...@jiscmail.ac.uk] On Behalf Of Clemens Vonrhein Sent: 13 February 2009 17:15 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] unstable
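To make Ian's recipe concrete, here is a minimal Python sketch of the comparison he describes: refine each protocol to convergence, choose the one with the lowest Rfree, then check Rfree-R against its expectation. All numbers and the expected_gap() rule below are invented placeholders for illustration, not output of any real refinement program:

    # Compare refinement protocols at convergence (illustrative only).
    def expected_gap(obs_per_param):
        # Placeholder for the expected Rfree - Rwork, which in practice
        # depends on the observation/parameter ratio and the weighting
        # (see Tickle et al., Acta Cryst. D56, 442-450 (2000)).
        return 0.25 / obs_per_param ** 0.5

    runs = {
        # protocol: (Rwork, Rfree, observations per parameter) - invented
        "no_waters":  (0.230, 0.285, 4.0),
        "50_waters":  (0.215, 0.277, 3.8),
        "150_waters": (0.205, 0.290, 3.5),
    }

    best = min(runs, key=lambda k: runs[k][1])   # lowest Rfree wins
    rwork, rfree, opr = runs[best]
    gap = rfree - rwork
    expected = expected_gap(opr)
    print(f"best protocol: {best}; Rfree-Rwork = {gap:.3f}, expected ~ {expected:.3f}")
    if gap > expected + 0.01:
        print("gap above expectation: possible non-random errors / overfitting")
    elif gap < expected - 0.01:
        print("gap below expectation: refinement may not have converged")

The point of the sketch is only the logic: Rfree selects among converged runs, and the Rfree-R gap is a subsidiary consistency check, not a target.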
Re: [ccp4bb] Crystallographic computing platform recommendations?
Brian, There is one disadvantage with using AFP rather than wide-open (potentially insecure) NFS mounts. Remote login via ssh into a client computer won't by default mount the user's AFP home directory. While it is possible to manually mount the AFP home directory, it may preclude other users from using the client computer from the console. This feature of AFP is due to user-specific mounting of the remote disk on the client computer. I assume the same feature would apply to Kerberized NFS mounts, but I haven't tried it. This limitation of AFP requires some thought when using idle client computers as compute servers. We're using the Mac OS X Server Xgrid service, along with the freely available GridStuffer.app application, to make submission of batch jobs to all our Macs relatively easy. Axel On Nov 18, 2008, at 9:10 PM, Brian Mark wrote: Francis, From your response and others to my question about OS X Server 10.5, AFP seems to be the preferred networking protocol over NFS. Yes, in our case the RAID is connected to a G5 (via FireWire 800, which provides surprisingly good transfer rates, BTW) that is running OS X Server 10.5. I'll try AFP for the user home directories. Thanks, Brian Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atbweb.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
Re: [ccp4bb] FW: [ccp4bb] X-Ray versus NMR Structure
Correct, this special NMR issue of NSMB does not have any PDFs for some reason. The presubmission manuscript can be downloaded from my site: http://atbweb.stanford.edu/scripts/papers.php?sendfile=162 On Nov 18, 2008, at 8:26 PM, Kantardjieff, Katherine wrote: With regard to the reference below, it does not appear to be in the online archive of NSMB... Those pages are mysteriously missing between volumes 10 and 11. You might find this review useful: X-ray Crystallography and NMR: Complementary Views of Structure and Dynamics, Nature Structural Biology 4, 862-865 (1997). The pre-publication manuscript is available on my website, publication section. Axel Brunger On Nov 14, 2008, at 3:13 PM, Boaz Shaanan wrote: Oops, I forgot to give the reference: Science (1992), vol. 257, p. 961 Boaz Axel T. Brunger Web: http://atbweb.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463 Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atbweb.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
Re: [ccp4bb] Refinement using Phenix
Phenix includes TLS refinement, which might explain the lower R values. Without TLS, Phenix and CNS should produce very similar R values. Are you using the latest version of CNS (1.21)? A composite annealed omit map will be less model-biased than standard sigmaA-weighted maps, but this is dependent on the particular case. Thus, it's useful to compute both types of maps. I don't see the need to compute a composite annealed omit map every time you make a change in the model, though. Axel Brunger On Nov 15, 2008, at 10:34 AM, Priya Mudgal wrote: Dear All, I have a general question for refinement using Phenix. Since the program gives much lower R/Rfree compared to CNS I am using it. My concern is, how model-biased is the 2Fo-Fc map? After every refinement cycle it gives the mtz files corresponding to the 2Fo-Fc and Fo-Fc maps. Is it alright to use those, or do I have to calculate the composite omit map using CNS every time? I am in my final stages of refinement now. My current R/Rfree is 25%/28%. Thanks for your suggestions. Regards, Priya -- Priya Mudgal PhD Student Duke University Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atbweb.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
Re: [ccp4bb] CNS refinement problem
This sounds to me like a compiler problem on your machine. Try using the latest Intel ifort, Portland Group pgf95, or gfortran compiler. Axel Brunger On Sep 12, 2008, at 2:56 AM, jxqi wrote: Dear ccp4bbs, When I was refining a protein structure using rigid.inp in CNS (version 1.21), an error occurred saying that the asymmetric map unit is incompatible with the symmetry operators. The model I used had first been refined with Refmac5. I encountered the same problem when I tried to add water molecules using the waterpick.inp task file. %XMDOAS3 error encountered: Asymmetric map unit is incompatible with symmetry operators. (CNS is in mode: SET ABORT=NORMal END) * ABORT mode will terminate program execution. * Thank you very much for your suggestions! Best Janxon Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atbweb.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
[ccp4bb] openmp version of CNS 1.21
An OpenMP multi-processor version of CNS 1.21 is now available at the CNS download site (see http://cns-online.org/cns_request/ to obtain downloading instructions). This parallelized version of CNS 1.21 contains OpenMP directives (courtesy of Kay Diederichs, Universität Konstanz) that enable parallel computation of some steps (especially FFTs) in the crystallographic refinement process on certain Linux and Mac OS X platforms using the ifort compiler. By default, the program will use all available processors on your system for the portions of the code that are parallelized. To restrict the number of processors used, set the environment variable OMP_NUM_THREADS to the number of processors you want to use. I have seen performance increases by a factor of about 2 for medium/large protein structure refinements (using the task file refine.inp) on 2 x Dual-Core Intel Xeon processors. This parallelized version has undergone some testing, but in case you run into any problems you should compare the results to the single-processor version. Note that the parallel version uses somewhat different grid sizes for the electron density grid, so there will be some small differences between the single- and multi-processor versions. The only changes compared to the single-processor version of CNS 1.21 are in a few source files, the Makefile compilation files, the test output files, and the executable cns file. Axel Brunger Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atbweb.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
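As an illustration of pinning the thread count down explicitly when launching a job, here is a minimal Python sketch; the cns executable on the PATH and the refine.inp/refine.out file names are assumptions, not part of the announcement:

    # Launch the OpenMP CNS build with a fixed number of threads.
    import os
    import subprocess

    env = os.environ.copy()
    env["OMP_NUM_THREADS"] = "2"   # use 2 of the available cores

    # cns reads its task file from stdin and writes the log to stdout
    with open("refine.inp") as inp, open("refine.out", "w") as out:
        subprocess.run(["cns"], stdin=inp, stdout=out, env=env, check=True)

The same effect can of course be had directly in the shell, e.g. setting OMP_NUM_THREADS in csh with setenv before starting cns.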
[ccp4bb] Version 1.21 of the Crystallography and NMR System (CNS)
CNS 1.21 is now available. A particular highlight is automatic dimensioning of the torsion angle dynamics arrays (e.g., MAXTREE), so no user intervention is needed anymore. There are many bug fixes in source code, modules, and input files. I would like to thank Joe Krahn, Ben Eisenbraun, and many others for helpful comments, bug reports, and extensive testing of the new version.
==
Announcing version 1.21 (general release) of the software: Crystallography & NMR System (CNS) (copyright 1997-2008, Yale University)
==
Information about the software and instructions for downloading the most recent version are available at: ** http://cns-online.org ** The software is available for download, free of charge, by all academic (non-profit) users.
-- Installation instructions and documentation can be found in $CNS_SOLVE/doc/html/cns_solve.html once you have downloaded and installed the software.
-- Please cite the following references for CNS in publications:
1. Brunger, A.T., Adams, P.D., Clore, G.M., Delano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N.S., Read, R.J., Rice, L.M., Simonson, T., Warren, G.L. Crystallography & NMR system: A new software system for macromolecular structure determination. Acta Cryst. D54, 905-921 (1998).
2. Brunger, A.T. Version 1.2 of the Crystallography and NMR System. Nature Protocols 2, 2728-2733 (2007).
Please cite additional original papers when using specific methods.
-- Do not distribute CNS to third parties without approval. By downloading the software you agree to the License in the FTP directory.
-- While this new version is compatible with version 1.2 input files, it is highly recommended that you use the new input files and modules in conjunction with this new version of CNS. The http://cns-online.org website now points to the new 1.21 version by default.
-- Attached release notes for CNS version 1.21:
===========================================================
Crystallography & NMR System
A.T. Brunger, P.D. Adams, G.M. Clore, W.L. Delano, P. Gros,
R.W. Grosse-Kunstleve, J.-S. Jiang, J. Kuszewski, M. Nilges,
N.S. Pannu, R.J. Read, L.M. Rice, T. Simonson, G.L. Warren
Copyright (c) 1997-2008 Yale University
===========================================================
Program: CNS
Version: 1.21
Patch level: 1
Status: general release
Changes for version 1.21
Program:
- multiple changes to source/dtorsion_top.f:
  1. removed dysfunctional GROUp option
  2. automatic dimensioning for MAXTREE, MAXLEN, MAXCHN
  3. automatic dimensioning for MAXBND
  4. automatic dimensioning for MAXJNT
  5. automatic dimensioning for MAXDIHE
  6. removed dysfunctional GROUp option
  7. updated helplib/cns-dynamics-torsion-topology
  8. old line 2135 (no close quote at the end) changed to:
       WRITE(6,'(17A)')
     @ '%atoms ', SEGID(OUTATM),'-',RESID(OUTATM),'-',
     @ RES(OUTATM),'-',TYPE(OUTATM),' and ',
     @ SEGID(JB(JBONDS(CONGRP(THISONE,3-I),J))),'-',
     @ RESID(JB(JBONDS(CONGRP(THISONE,3-I),J))),'-',
     @ RES(JB(JBONDS(CONGRP(THISONE,3-I),J))),'-',
     @ TYPE(JB(JBONDS(CONGRP(THISONE,3-I),J))),''
     (also the following statement)
  9. old line 3081: redefined IND(3) to INTEGER
  10. fixed bug in routine TORMD when the majority of the molecule is fixed; changed to:
       HPIBXCL=ALLHP(INTEG4(2*NBOND))
       HPJBXCL=ALLHP(INTEG4(2*NBOND))
       ...
       CALL FREHP(HPJBXCL,INTEG4(2*NBOND))
       CALL FREHP(HPIBXCL,INTEG4(2*NBOND))
- source/noe.f and noe.inc: included DEN function (ongoing development)
- modifications in instlib/machine/unsupported/mac-intel-darwin/machine_f.f:
  1. replaced call to INDEX(FILE,' ') with call to TRIMM(FILE,,,)
  2. fixed last argument in call to LINSUB in subroutine SYTASK
  3. removed leading UNIX pathname in file INQUIRE
- modifications in instlib/machine/unsupported/mac-intel-darwin/machine_c.c: removed rename_() function.
- copied these machine_f.f and machine_c.c files to:
  instlib/machine/supported/linux/
  instlib/machine/unsupported/intel-x86_64bit-linux/
  instlib/machine/unsupported/intel-itanium-linux/
  instlib/machine/unsupported/g77-linux/
  instlib/machine/unsupported/mac-ppc-linux/
  instlib/machine/unsupported/mac-ppc-darwin/
Re: [ccp4bb] Stack trace terminated abnormally. cns_solve on macosx leopard
There is an operating system bug in Mac OS X 10.5.x that may cause this problem if you are running the program with input redirection but without output redirection, i.e., 'cns < test.inp', and the input file contains one or more blank lines. Try: cns < test.inp > test.out Axel Brunger On May 21, 2008, at 8:37 PM, xiaoyazi2008 wrote: Hi, Recently I've installed CNS 1.2 on a Mac Pro successfully. But when I run cns_solve to do annealing refinement, the following error information appears: FFT3C: Using FFTPACK4.1 CNSsolve{+ file: anneal.inp +} CNSsolve{+ directory: xtal_refine +} CNSsolve{+ description: Crystallographic simulated annealing refinement +} CNSsolve{+ authors: Axel T. Brunger, Luke M. Rice and Paul D. Adams +} CNSsolve{+ copyright: Yale University +} forrtl: severe (174): SIGSEGV, segmentation fault occurred Image / PC / Routine / Line / Source: cns_solve 003548DF Unknown Unknown Unknown; cns_solve 00339FBB Unknown Unknown Unknown; cns_solve 000F62EC Unknown Unknown Unknown. Stack trace terminated abnormally. Can anybody help me to fix that problem? Thanks a lot [EMAIL PROTECTED] On May 1, 2008, at 3:33 AM, hari jayaram wrote: Hi Since I am not on the cnsbb yet I am posting this here. I downloaded the CNS 1.2.2 Intel build and was trying to run a simulated annealing refinement on my MacBook Pro (Intel) running 10.5.2. However the annealing job crashes roughly 40 minutes into the refinement with the following message: There is not enough memory available to the program. This may be because of too little physical memory (RAM) or too little swap space on the machine. It could also be the result of user or system limits. On most Unix systems the limit command can be used to check the current user limits. Please check that the datasize, memoryuse and vmemoryuse limits are set at a large enough value. Unfortunately on Leopard it seems that unlimit and limit are not available under bash. Further, when I use csh, I get the following values for the limits: [mango:~/aps_04_21_2008/p10_2] hari% limit cputime unlimited; filesize unlimited; datasize 6144 kbytes; stacksize 8192 kbytes; coredumpsize 0 kbytes; memoryuse unlimited; descriptors 256; memorylocked unlimited; maxproc 266. In the same csh shell unlimit returns: [mango:~/aps_04_21_2008/p10_2] hari% unlimit unlimit: descriptors: Can't remove limit (Invalid argument) How can I set up cns to have free rein and use unlimited datasize and stacksize for all cns jobs? Thanks for your help in advance Hari Jayaram The detailed error is posted below: ASSFIL: file /Users/hari/cns/cns_solve_1.2/libraries/toppar/torsionmdmods opened. MESSage=NORM EVALUATE: symbol $MESSAGE_OLD_TMOD set to NORM (string) ECHO=FALSe {OFF} EVALUATE: symbol $ECHO_OLD_TMOD set to FALSE (logical) NEXTCD: condition evaluated as false Program version= 1.2 File version= 1.2 SELRPN: 0 atoms have been selected out of 2380 cns_solve(93676) malloc: *** mmap(size=300512) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug ALLHP: request for -1294967296 bytes - There is not enough memory available to the program. This may be because of too little physical memory (RAM) or too little swap space on the machine. It could also be the result of user or system limits. On most Unix systems the limit command can be used to check the current user limits. Please check that the datasize, memoryuse and vmemoryuse limits are set at a large enough value.
- %ALLHP error encountered: not enough memory available (CNS is in mode: SET ABORT=NORMal END) * ABORT mode will terminate program execution. * Program will stop immediately. Maximum dynamic memory allocation: 139649464 bytes Maximum dynamic memory overhead: 944 bytes Program started at: 14:51:17 on 30-Apr-2008 Program stopped at: 15:09:16 on 30-Apr-2008 CPU time used: 1077.7678 seconds Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atbweb.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
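The datasize and stacksize limits that this error message refers to can also be inspected and adjusted programmatically; here is a minimal sketch using Python's resource module on a Unix system. Only the soft limits are touched, since an unprivileged process cannot exceed its hard limits:

    # Inspect and raise per-process data/stack limits (Unix only).
    import resource

    for name, which in [("datasize", resource.RLIMIT_DATA),
                        ("stacksize", resource.RLIMIT_STACK)]:
        soft, hard = resource.getrlimit(which)
        print(f"{name}: soft={soft}, hard={hard} "
              f"({resource.RLIM_INFINITY} means unlimited)")
        if soft != hard:
            # lift the soft limit to the hard limit, the most an
            # unprivileged process may request
            resource.setrlimit(which, (hard, hard))
            print(f"{name}: soft limit raised to the hard limit")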
Re: [ccp4bb] density_modify.inp is not running in cns1.2 but runs in cns1.1
I certainly have not experienced this problem with our pre-compiled Intel version for Mac OS X that is available when downloading CNS 1.2 from the CNS website. Axel On Apr 23, 2008, at 7:59 AM, William Scott wrote: Could one or both of you try this with the pre-compiled Intel version that comes from the CNS website? If it works with that, it might be something introduced in the compiling process for the osx-optimized version made by David. Thanks. Bill On Apr 23, 2008, at 7:26 AM, Sean Johnson wrote: Raja, I don't have an answer for you, but I have been experiencing the same problem. My guess is that a bug was introduced into version 1.2. My workaround is to use DM. I haven't had problems with any of the other CNS scripts in version 1.2. Sean Raja Dey wrote: Hi All, I am trying to run density_modify.inp in CNS 1.2. It stops with the error enclosed below. But it runs with CNS 1.1 perfectly. I am using the same two data files (e.g. p65_se_rdey.hkl.cv and combine.hkl). Does anyone have experience like this? Any solution is well appreciated. Regards... Raja ANOMalous=TRUE {ON} Program version= 1.2 File version= 1.2 %XMAPASU-AUTOmem: increasing memory allocation to 200 Minimum brick that covers asymmetric unit: A= 0,...,64 B= 0,...,64 C= 0,...,25 Sum of 5287 elements = 5287. SHOW: average of 102400 elements = 0. ANOMalous=TRUE {ON} XMPST: average = 0. minimum = 0. maximum = 0. XMPST: r.m.s. = 0. norm = 0. XMHISTO: (default from map) RHOMIN and RHOMAX = 0. 0. %XMHISTO-ERR: a complete flat map. XMHISTO: (default from map) SLOT width = 0.00 %XMHISTO-ERR: the SLOT width is too small 0.0 XMHISTO: the number of slots MBINS=1 and width SLOT= 0. %XDOTYPE-ERR: Variable/type mismatch: do (masksol=1) (real(automap) = $cutoff) ^ %DO-ERR: Data type mismatch. Selection must be a logical expression.: do (masksol=1) (real(automap) = $cutoff) ^ %SHOW error encountered: There were errors in DO expression. (CNS is in mode: SET ABORT=NORMal END) * ABORT mode will terminate program execution. * Program will stop immediately. Maximum dynamic memory allocation: 18336916 bytes Maximum dynamic memory overhead: 360 bytes Program started at: 18:01:44 on 21-Apr-2008 Program stopped at: 18:01:46 on 21-Apr-2008 CPU time used: 1.9700 seconds Total runtime: 0. seconds Program compiled by David Gohara Regards... Raja -- Sean Johnson, PhD R. Gaurth Hansen Assistant Professor Utah State University Department of Chemistry and Biochemistry 0300 Old Main Hill Logan, UT 84322-0300 (435) 797-2089 (435) 797-3390 (fax) [EMAIL PROTECTED] Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atbweb.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
Re: [ccp4bb] an over refined structure
In such cases, we always define the test set first in the high-symmetry space group choice. Then, if it is warranted to lower the crystallographic symmetry and replace it with NCS symmetry, we expand the test set to the lower symmetry space group. In other words, the test set itself will be invariant upon applying any of the crystallographic or NCS operators, so will be maximally free in these cases. It is then also possible to directly compare the free R between the high and low crystallographic space group choices. Our recent Neuroligin structure is such an example (Arac et al., Neuron 56, 992-, 2007). Axel On Feb 8, 2008, at 10:48 AM, Ronald E Stenkamp wrote: I've looked at about 10 cases where structures have been refined in lower symmetry space groups. When you make the NCS operators into crystallographic operators, you don't change the refinement much, at least in terms of structural changes. That's the case whether NCS restraints have been applied or not. In the cases I've re-done, changing the refinement program and dealing with test set choices makes some difference in the R and Rfree values. One effect of changing the space group is whether you realize the copies of the molecule in the lower symmetry asymmetric unit are identical or not. (Where identical means crystallographically identical, i.e., in the same packing environments, subject to all the caveats about accuracy, precision, thermal motion, etc). Another effect of going to higher symmetry space groups of course has to do with explaining the experimental data with simpler and smaller mathematical models (Occam's razor or the Principle of Parsimony). Ron Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atb.slac.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
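The invariance Axel describes - the test set maps onto itself under every crystallographic or NCS operator - can be sketched in a few lines of Python: flag whole symmetry orbits rather than individual reflections. The twofold operator and the 5% fraction below are illustrative assumptions; production tools do this with full space-group machinery:

    # Symmetry-invariant test-set flags: decide once per orbit.
    import random

    def orbit(hkl, ops):
        """All reflections related to hkl by the given operators."""
        return {op(hkl) for op in ops}

    # Example: identity plus the reciprocal-space operator of a twofold
    # along b (P2-like). Friedel symmetry is ignored here for brevity.
    ops = [lambda v: v,
           lambda v: (-v[0], v[1], -v[2])]

    reflections = [(h, k, l) for h in range(-5, 6)
                   for k in range(6) for l in range(-5, 6)]

    random.seed(0)
    free_flag = {}
    for hkl in reflections:
        rep = min(orbit(hkl, ops))        # canonical orbit member
        if rep not in free_flag:          # decide once per orbit
            free_flag[rep] = random.random() < 0.05
        free_flag[hkl] = free_flag[rep]   # whole orbit gets same flag

    n_free = sum(free_flag[hkl] for hkl in reflections)
    print(f"{n_free} of {len(reflections)} reflections flagged as test set")

Because the flag is decided once per orbit representative, applying either operator to the test set returns the same set of reflections, which is the property that makes the free R comparable between the two space group choices.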
Re: [ccp4bb] an over refined structure
A few comments that you might find useful: 1. Yes, even if you don't apply NCS restraints/constraints there will be correlations between reflections in cases of NCS symmetry or pseudo-crystallographic NCS symmetry. 2. Fabiola, Chapman et al. published a very nice paper on the topic in Acta Cryst. D62, 227-238 (2006). 3. From my experience, the effects for low NCS symmetry are usually small, except in cases of pseudo-symmetry, which can be easily addressed by defining the test set in the high-symmetry setting. For high NCS symmetry, the effects are more significant, but then the structure is usually much better determined anyway, due to averaging. 4. At least the first one of the mentioned MsbA and EmrE structures had a very high Rfree in the absence of multi-copy refinement (~45%)! So, the Rfree indicated that there was a major problem. 5. The Rfree should vary relatively little among test sets (see my Acta Cryst. D49, 24-36 (1993) paper) - if there are large variations for different test set choices then the test set may be too small or there may be systematic problems with some of the reflections causing them to dominate the R factors (outliers at low resolution, for example). Axel Brunger On Feb 7, 2008, at 9:57 AM, Dean Madden wrote: Hi Dirk, I disagree with your final sentence. Even if you don't apply NCS restraints/constraints during refinement, there is a serious risk of NCS contaminating your Rfree. Consider the limiting case in which the NCS is produced simply by working in an artificially low symmetry space group (e.g. P1, when the true symmetry is P2): in this case, putting one symmetry mate in the Rfree set and one in the Rwork set will guarantee that Rfree tracks Rwork. The same effect applies to a large extent even if the NCS is not crystallographic. Bottom line: thin shells are not a perfect solution, but if NCS is present, choosing the free set randomly is *never* a better choice, and almost always significantly worse. Together with multicopy refinement, randomly chosen test sets were almost certainly a major contributor to the spuriously good Rfree values associated with the retracted MsbA and EmrE structures. Best wishes, Dean Dirk Kostrewa wrote: Dear CCP4ers, I'm not convinced that thin shells are sufficient: I think, in principle, one should omit thick shells (greater than the diameter of the G-function of the molecule/assembly that is used to describe NCS interactions in reciprocal space), and use the inner thin layer of these thick shells, because only those should be completely independent of any working set reflections. But this would be too expensive given the low number of observed reflections that one usually has ... However, if you don't apply NCS restraints/constraints, there is no need for any such precautions. Best regards, Dirk. On 07.02.2008, at 16:35, Doug Ohlendorf wrote: It is important when using NCS that the Rfree reflections be selected in thin resolution shells. That way application of NCS should not mix the Rwork and Rfree sets. Normal random selection of Rfree + NCS (especially 4x or higher) will drive Rfree down unfairly. Doug Ohlendorf -Original Message- From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On Behalf Of Eleanor Dodson Sent: Tuesday, February 05, 2008 3:38 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] an over refined structure I agree that the difference in Rwork to Rfree is quite acceptable at your resolution. You cannot/should not use R factors as a criterion for structure correctness.
As Ian points out - choosing a different Rfree set of reflections can change Rfree a good deal. Certain NCS operators can relate reflections exactly, making it hard to get a truly independent free R set, and there are other reasons that make it a blunt-edged tool. The map is the best validator - are there blobs still not fitted? (maybe side chains you have placed wrongly..) Are there many positive or negative peaks in the difference map? How well does the NCS match the 2 molecules? etc etc. Eleanor George M. Sheldrick wrote: Dear Sun, If we take Ian's formula for the ratio of R(free) to R(work) from his paper Acta Cryst. D56, 442-450 (2000) and make some reasonable approximations, we can reformulate it as: R(free)/R(work) = sqrt[(1+Q)/(1-Q)] with Q = 0.025pd^3(1-s), where s is the fractional solvent content, d is the resolution, p is the effective number of parameters refined per atom after allowing for the restraints applied, d^3 means d cubed and sqrt means square root. The difficult number to estimate is p. It would be 4 for an isotropic refinement without any restraints. I guess that p=1.5 might be an appropriate value for a typical protein refinement (giving an R-factor ratio of about 1.4 for s=0.6 and d=2.8). In that case, your R-factor ratio of 0.277/0.215 = 1.29 is well within the expected range.
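George's approximation is easy to check numerically; a short Python sketch reproducing the numbers quoted in his message:

    # R(free)/R(work) = sqrt[(1+Q)/(1-Q)], Q = 0.025 * p * d^3 * (1-s)
    from math import sqrt

    def rfactor_ratio(p, d, s):
        """p: effective parameters per atom, d: resolution (A),
        s: fractional solvent content."""
        q = 0.025 * p * d ** 3 * (1 - s)
        return sqrt((1 + q) / (1 - q))

    # The example from the message: p = 1.5, d = 2.8 A, s = 0.6
    print(f"expected ratio: {rfactor_ratio(1.5, 2.8, 0.6):.2f}")  # ~1.41
    print(f"observed ratio: {0.277 / 0.215:.2f}")                 # 1.29

With p = 1.5, s = 0.6 and d = 2.8 the formula gives Q = 0.33 and a ratio of about 1.4, matching the estimate quoted above, so the observed ratio of 1.29 sits comfortably below it.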
Re: [ccp4bb] low-res cutoff in refinement
Are you using CNS 1.2? This version has a robust bulk solvent model and anisotropic correction that is much improved compared to CNS 1.1. It is similar in robustness to that of Phenix (although different in detail). Axel Brunger Van Den Berg, Bert wrote: Hi all, during refinement of our (membrane protein) structures, basically in all cases the R/Rfree values depend a lot on the low-resolution cutoff. Putting the cutoff at lower resolution (20-50 A) results in substantially higher R/Rfree values (sometimes a few percent). For this reason we mostly refine the data from the high-resolution limit down to 10 A or so. I have noticed that this occurs fairly often in the literature, but I don't know if this is a membrane protein related issue or not. Could it be that the bulk solvent model used in CNS (we refine exclusively with CNS) does not model the situation with membrane proteins, due to the presence of detergents? Or is it related to data collection issues (low-resolution spots overloaded etc.)? Anything else? What could be done to overcome the problem, and to use all the data in refinement? Thanks, Bert Bert van den Berg University of Massachusetts Medical School Program in Molecular Medicine Biotech II, 373 Plantation Street, Suite 115 -- Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atb.slac.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
Re: [ccp4bb] low-res cutoff in refinement
Bert, That explains it. We observed behavior similar to what you described with the previous version of CNS (1.1). Version 1.2 should hopefully fix your problem. See my paper in Nature Protocols 2, 2728-2733 (2007) for details. Axel Van Den Berg, Bert wrote: No, we have been using version 1.1 so far. Thanks for the suggestion, we'll use version 1.2 from now on and try Phenix as well. Bert -Original Message- From: Axel Brunger [mailto:[EMAIL PROTECTED] Sent: Thu 1/24/2008 7:35 PM To: Van Den Berg, Bert Cc: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] low-res cutoff in refinement Are you using CNS 1.2? This version has a robust bulk solvent model and anisotropic correction that is much improved compared to CNS 1.1. It is similar in robustness to that of Phenix (although different in detail). Axel Brunger Van Den Berg, Bert wrote: Hi all, during refinement of our (membrane protein) structures, basically in all cases the R/Rfree values depend a lot on the low-resolution cutoff. Putting the cutoff at lower resolution (20-50 A) results in substantially higher R/Rfree values (sometimes a few percent). For this reason we mostly refine the data from the high-resolution limit down to 10 A or so. I have noticed that this occurs fairly often in the literature, but I don't know if this is a membrane protein related issue or not. Could it be that the bulk solvent model used in CNS (we refine exclusively with CNS) does not model the situation with membrane proteins, due to the presence of detergents? Or is it related to data collection issues (low-resolution spots overloaded etc.)? Anything else? What could be done to overcome the problem, and to use all the data in refinement? Thanks, Bert Bert van den Berg University of Massachusetts Medical School Program in Molecular Medicine Biotech II, 373 Plantation Street, Suite 115 -- Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atb.slac.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463 -- Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atb.slac.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
Re: [ccp4bb] Extraction of test set from CNS reflection file
Actually, the merge script will create the superset of both reflection data sets, so it won't delete reflections from one data set that are not present in the other data set. Once you have run the merge.inp script, you can use the make_cv script to extend your test set while keeping your original test set selection. If you continue to have trouble, please contact me directly with more details. Axel Brunger Florian Brückner wrote: Dear colleagues, I have a dataset from a protein-ligand complex. I want to use the same Rfree test set for refinement as was used for the protein alone. How can I extract the test set from the CNS reflection file of the protein-alone dataset (created with make_cv) to use it with the protein-ligand data? I tried the CNS merge program, but obviously reflections which are missing in the protein-alone reflection file (because they were rejected during scaling) are also eliminated in the merged output file. This way I am losing some of the data. Does anyone have a better idea? Thanks, Florian. -- Axel T. Brunger Investigator, Howard Hughes Medical Institute Professor of Molecular and Cellular Physiology Stanford University Web: http://atb.slac.stanford.edu Email: [EMAIL PROTECTED] Phone: +1 650-736-1031 Fax: +1 650-745-1463
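The behaviour Axel describes - merge to the superset first, then extend the test set without disturbing the original flags - can be sketched as follows in Python. The dictionaries and the 5% fraction are invented for illustration; in practice the CNS merge.inp and make_cv task files do this work:

    # Extend a free-set assignment to new reflections while preserving
    # the original flags (illustrative sketch only).
    import random

    def extend_test_set(old_flags, new_hkls, fraction=0.05, seed=1):
        """old_flags: dict {(h,k,l): bool}; new_hkls: iterable of (h,k,l)."""
        rng = random.Random(seed)
        flags = dict(old_flags)              # original selection kept as-is
        for hkl in new_hkls:
            if hkl not in flags:             # only flag previously unseen ones
                flags[hkl] = rng.random() < fraction
        return flags

    # Invented example: flags from the apo data, then the complex data
    old = {(1, 0, 0): True, (2, 0, 0): False, (3, 0, 0): False}
    complex_data = [(1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0)]
    print(extend_test_set(old, complex_data))

The key design point is that reflections carried over from the apo dataset keep their flags, so the complex refinement inherits an unbiased test set, while newly measured reflections are assigned at the same test fraction.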