Re: [ccp4bb] Resolution, R factors and data quality
Based on the simulations I've done, the data should be cut at CC1/2 = 0. Seriously. The problem is figuring out where it hits zero.

But the real objective is: where do data stop making an improvement to the model? The categorical statement that all data are good is simply not true in practice. It is probably specific to each data set and refinement, and as long as we do not always run paired refinement a la KD or similar in order to find out where that point is, the yearning for a simple number will not stop (although I believe automation will eventually make the KD approach or similar routine).

As for the resolution of the structure, I'd say call that where |Fo-Fc| (the error in the map) becomes comparable to sigma(Fo). This is I/sigma = 2.5 if Rcryst is 20%. That is: |Fo-Fc|/Fo = 0.2, which implies |Io-Ic|/Io = 0.4, or Io/|Io-Ic| = Io/sigma(Io) = 2.5. Makes sense to me...

As long as it is understood that this 'model resolution value' derived via your argument from I/sigI is not the same as an I/sigI data cutoff (and that Rcryst and Rmerge have nothing in common)...

-James Holton, MAD Scientist

Best, BR

On Aug 27, 2013, at 5:29 PM, Jim Pflugrath jim.pflugr...@rigaku.com wrote:

I have to ask flamingly: So what about CC1/2 and CC*? Did we not already replace an arbitrary resolution cut-off based on a value of Rmerge with an arbitrary resolution cut-off based on a value of Rmeas? And now we are going to replace that with an arbitrary resolution cut-off based on a value of CC*, or is it CC1/2?

I am asked often: What value of CC1/2 should I cut my resolution at? What should I tell my students? I've got a course coming up and I am sure they will ask me again.
Jim

From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Arka Chakraborty [arko.chakrabort...@gmail.com]
Sent: Tuesday, August 27, 2013 7:45 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Resolution, R factors and data quality

Hi all, does this not again bring up the still prevailing adherence to R factors rather than a shift to correlation coefficients (CC1/2 and CC*), as Dr. Phil Evans has indicated? The way we look at data quality (by "we" I mean the end users) needs to be altered, I guess.

best, Arka Chakraborty

On Tue, Aug 27, 2013 at 9:50 AM, Phil Evans p...@mrc-lmb.cam.ac.uk wrote: The question you should ask yourself is: why would omitting data improve my model? Phil
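The arithmetic behind the "I/sigma = 2.5 if Rcryst is 20%" estimate above can be checked in a few lines. This is just a sketch of the error propagation implied by the argument (I = F^2, so a relative error in F doubles as a relative error in I), not output from any refinement program:

```python
# Sketch of the arithmetic in the argument above: if the map error |Fo - Fc|
# is comparable to sigma(Fo) and Rcryst ~ 0.2, then because I = F**2 a
# relative error in F appears doubled as a relative error in I.

r_cryst = 0.20                 # |Fo - Fc| / Fo
rel_err_F = r_cryst            # treat Rcryst as the relative error in F
rel_err_I = 2 * rel_err_F      # dI/I = 2 * dF/F since I = F**2
i_over_sigma = 1 / rel_err_I   # Io / |Io - Ic| = Io / sigma(Io)

print(rel_err_I)      # 0.4
print(i_over_sigma)   # 2.5
```

This reproduces the chain |Fo-Fc|/Fo = 0.2 → |Io-Ic|/Io = 0.4 → I/sigma(I) = 2.5 quoted in the message.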
Re: [ccp4bb] Resolution, R factors and data quality
We don't currently have a really good measure of that point where adding the extra shell of data adds significant information (whatever that means). However, my rough trials (see http://www.ncbi.nlm.nih.gov/pubmed/23793146) suggested that the exact cutoff point was not very critical, presumably as the information content fades out slowly, so it probably isn't something to agonise over too much. K&D's paired refinement may be useful though.

I would again caution against looking too hard at CC* rather than CC1/2: they are exactly equivalent, but CC* changes very rapidly at small values, which may be misleading. The purpose of CC* is for comparison with CCcryst (i.e. Fo to Fc).

I would remind any users of Scala who want to look back at old log files to see the statistics for the outer shell at the cutoff they used that CC1/2 has been calculated in Scala for many years under the name CC_IMEAN. It's now called CC1/2 in Aimless (and Scala) following Kay's excellent suggestion.

Phil
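Phil's caution that CC* "changes very rapidly at small values" is easy to see numerically. Here is a minimal sketch using the published conversion CC* = sqrt(2·CC1/2 / (1 + CC1/2)) (Karplus & Diederichs, Science 2012); the sample values are illustrative only:

```python
import math

def cc_star(cc_half):
    """CC* from CC1/2 (Karplus & Diederichs, Science 2012):
    CC* = sqrt(2*CC1/2 / (1 + CC1/2))."""
    return math.sqrt(2 * cc_half / (1 + cc_half))

# A tiny CC1/2 already yields a seemingly respectable CC*,
# which is why quoting CC* in weak shells can mislead:
for cc_half in (0.01, 0.05, 0.1, 0.3, 0.5, 0.9):
    print(f"CC1/2 = {cc_half:4.2f}  ->  CC* = {cc_star(cc_half):.3f}")
```

For example, CC1/2 = 0.01 already maps to CC* of about 0.14, and CC1/2 = 0.5 to about 0.82, so small differences near zero are greatly magnified.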
Re: [ccp4bb] Resolution, R factors and data quality
Hi all,

If I am not wrong, the Karplus & Diederichs paper suggests that data are generally meaningful up to a CC1/2 value of 0.20, but they suggest a paired-refinement technique (pretty easy to perform) to actually decide on the resolution at which to cut the data. This will be the most prudent thing to do, I guess, and not follow any arbitrary value, as each data set is different. But the fact remains that even where I/sigma(I) falls to 0.5, useful information remains which will improve the quality of the maps, and when discarded just leads us a bit further away from the truth. However, as always, Dr Diederichs and Karplus will be the best persons to comment on that (as they have already done in the paper :) )

best, Arka Chakraborty

p.s. Aimless seems to suggest a resolution limit based on a CC1/2 = 0.5 criterion (which I guess is done to be on the safe side; Dr. Phil Evans can explain if there are other, or entirely different, reasons for it!). But if we want to squeeze the most from our data set, I guess we need to push a bit further sometimes :)

--
Arka Chakraborty
ibmb (Institut de Biologia Molecular de Barcelona)
BARCELONA, SPAIN
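The paired-refinement bookkeeping mentioned above can be sketched in a few lines. `keep_extra_shell` is a hypothetical helper, not from any real package: it stands in for the Karplus & Diederichs comparison, where models refined against two candidate cutoffs are both scored by R-free on the common set of reflections (everything to the lower-resolution cutoff):

```python
# Hypothetical sketch of the paired-refinement decision rule (Karplus &
# Diederichs). The R-free numbers here are made-up inputs; in practice they
# would come from two refinement runs scored on the same reflection set.

def keep_extra_shell(rfree_at_d1, rfree_at_d2_scored_to_d1, tol=0.0):
    """Keep the higher-resolution cutoff d2 only if the model refined
    against the extra shell scores at least as well on the common set of
    reflections (everything to d1)."""
    return rfree_at_d2_scored_to_d1 <= rfree_at_d1 - tol

# e.g. cutting at 2.0 A vs extending to 1.9 A, both scored to 2.0 A:
print(keep_extra_shell(0.231, 0.228))  # True: the extra shell helped
print(keep_extra_shell(0.231, 0.235))  # False: it hurt
```

The point of scoring both models at the same resolution is that R-free values computed over different reflection sets are not comparable.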
Re: [ccp4bb] Resolution, R factors and data quality
Aimless does indeed calculate the point at which CC1/2 falls below 0.5, but I would not necessarily suggest that as the best cutoff point. Personally I would also look at I/sigI, anisotropy and completeness, but as I said, at that point I don't think it makes a huge difference.

Phil
Re: [ccp4bb] Resolution, R factors and data quality
"We don't currently have a really good measure of that point where adding the extra shell of data adds significant information ... so it probably isn't something to agonise over too much. K&D's paired refinement may be useful though."

That seems to be a correct assessment of the situation, and a forceful argument to eliminate the review nonsense of nitpicking on I/sigI values, associated R-merges, and other pseudo-statistics once and for all. We can now, thanks to data deposition, at any time generate or download the maps and the models and judge for ourselves even minute details of local model quality. As far as use and interpretation goes, where the model meets the map is where the rubber meets the road.

I therefore make the heretic statement that the entire Table 1 of data collection statistics, justifiable in pre-deposition times as some means to guess structure quality, can go the way of X-ray film and be almost always eliminated from papers. There is nothing really useful in Table 1, and all its data items and more are in the PDB header anyhow. Availability of maps for review and for users is the key point.

Cheers, BR
Re: [ccp4bb] Resolution, R factors and data quality
What a statement! Give reviewers maps, I agree; however, what if the reviewer has no clue of these things we call structures? I think for those people Table 1 might still provide some justification. I would argue it should go into the supplement at least.

Jürgen

Sent from my iPad
[ccp4bb] Protein Crystallography course via the web at Birkbeck College
Dear all,

Registration is currently open for the postgraduate certificate course in Protein Crystallography via the web at Birkbeck, which begins on Monday, October 7th. The course runs for one year, during which all aspects of protein crystallography are covered, from the fundamentals of protein structure to validation. The emphasis is very much on techniques and the underlying principles, so it is ideally suited to those currently enrolled on PhD programs or those who wish to expand their skills in structural biology.

Information on registration and course content can be found at http://px13.cryst.bbk.ac.uk/px/course/course.htm under "General information", or contact the course director (Tracey Barrett) at p...@mail.cryst.bbk.ac.uk for further details. Although a stand-alone course, the postgraduate certificate in protein crystallography can also be taken as part of the MSc in Structural Molecular Biology. For more information, please see http://www.bbk.ac.uk/study/2013/postgraduate/programmes/TMSBISCL_C/

Dr Tracey Barrett, Crystallography, Senior Lecturer in Structural Biology, Institute for Structural and Molecular Biology, Birkbeck College, Malet Street, London WC1E 7HX. Tel: 020 7631 6822, Fax: 020 7631 6803
Re: [ccp4bb] Resolution, R factors and data quality
Hi, a random thought: the data resolution, d_min_actual, can be thought of as that which maximizes the correlation (*) between the synthesis calculated using your data and an equivalent Fmodel synthesis calculated using the complete set of Miller indices in the d_min_actual-inf resolution range, where d_min=d_min_actual and d_min is the highest resolution of the data set in question. Makes sense to me..

(*) or any other more appropriate similarity measure: the usual map CC may not be the best one in this context.

Pavel

On 27 Aug 2013, at 02:49, Emily Golden 10417...@student.uwa.edu.au wrote:

Hi All, I have collected diffraction images to 1 Angstrom resolution at the edge of the detector and 0.9 A at the corner. I collected two sets, one for low-resolution reflections and one for high-resolution reflections. I get 100% completeness above 1 A and 41% completeness in the 0.9-0.95 A shell. However, my Rmerge in the highest shell is not good, ~80%. The Rfree is 0.17 and Rwork is 0.16, but the maps look very good. If I cut the data to 1 Angstrom the R factors improve, but I feel the maps are not as good, and I'm not sure I can justify cutting data. So my question is: should I cut the data to 1 Angstrom, or should I keep the data I have? Also, on taking geometric restraints off during refinement the R factors improve marginally; am I justified in doing this at this resolution?

Thank you, Emily
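As a rough illustration of the "usual map CC" Pavel refers to, here is a plain Pearson correlation over grid values. The synthetic "maps" are made up for the example, and as he notes, a different similarity measure may actually be preferable in this context:

```python
import math
import random

def map_cc(map1, map2):
    """Plain Pearson correlation between two density maps sampled on the
    same grid (passed here as flat lists of grid values)."""
    n = len(map1)
    ma = sum(map1) / n
    mb = sum(map2) / n
    cov = sum((a - ma) * (b - mb) for a, b in zip(map1, map2))
    va = sum((a - ma) ** 2 for a in map1)
    vb = sum((b - mb) ** 2 for b in map2)
    return cov / math.sqrt(va * vb)

# Toy check on a synthetic "map": a map correlates perfectly with itself,
# and adding noise lowers the correlation.
random.seed(0)
rho = [random.gauss(0.0, 1.0) for _ in range(512)]
noisy = [x + random.gauss(0.0, 1.0) for x in rho]
print(round(map_cc(rho, rho), 6))   # 1.0
print(map_cc(rho, noisy) < 1.0)     # True
```

In Pavel's scheme one would scan candidate d_min_actual values and take the one maximizing this (or a better) similarity between the data map and the complete Fmodel map.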
[ccp4bb] Quick resolution cutoff survey
Since we keep discussing resolution cutoffs and the benefits of not including all data etc., I thought I would crowd-source your opinion on this particular data set, processed with XDS. Here's the XSCALE.LP output:

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION

 RESOLUTION    NUMBER OF REFLECTIONS     COMPLETENESS  R-FACTOR  R-FACTOR  COMPARED  I/SIGMA   R-meas  CC(1/2)  Anomal  SigAno   Nano
   LIMIT     OBSERVED  UNIQUE  POSSIBLE    OF DATA     observed  expected                                        Corr

    9.43        5365    1009     1028       98.2%        1.7%      2.1%      5351     68.92     1.8%   100.0*       2   0.698    691
    6.67       10153    1756     1760       99.8%        2.2%      2.4%     10134     58.30     2.4%   100.0*     -12   0.674   1404
    5.44       13114    2217     2223       99.7%        3.4%      3.5%     13097     42.38     3.7%    99.9*     -10   0.716   1845
    4.71       15273    2583     2592       99.7%        3.2%      3.1%     15259     46.24     3.5%    99.9*     -14   0.733   2212
    4.22       17183    2907     2934       99.1%        3.2%      3.2%     17173     45.14     3.6%    99.9*     -16   0.722   2538
    3.85       19010    3183     3217       98.9%        4.4%      4.1%     19000     37.38     4.8%    99.9*     -16   0.717   2794
    3.56       20764    3441     3473       99.1%        5.9%      5.6%     20752     30.36     6.5%    99.9*     -13   0.754   3061
    3.33       22516    3681     3712       99.2%        8.8%      8.5%     22507     22.60     9.7%    99.7*     -11   0.737   3293
    3.14       24735    3963     4001       99.1%       12.4%     13.0%     24725     16.77    13.5%    99.5*      -8   0.696   3565
    2.98       25931    4127     4161       99.2%       17.2%     18.1%     25924     12.82    18.7%    99.2*      -6   0.710   3751
    2.84       26521    4291     4386       97.8%       25.4%     26.9%     26495      9.26    27.6%    98.3*      -4   0.683   3809
    2.72       20357    3495     4592       76.1%       27.6%     29.3%     20277      7.90    30.2%    97.9*       0   0.706   2826
    2.61       15917    2860     4768       60.0%       33.7%     35.0%     15839      6.41    37.0%    96.6*      -2   0.697   2171
    2.52       12949    2394     4944       48.4%       42.5%     45.1%     12877      4.91    46.8%    95.3*       0   0.694   1692
    2.43       10310    1993     5097       39.1%       47.8%     50.7%     10230      4.08    53.0%    94.4*      -2   0.670   1295
    2.36        8180    1693     5309       31.9%       56.4%     60.1%      8079      3.12    63.0%    92.2*      -2   0.671    961
    2.29        6075    1381     5441       25.4%       69.9%     72.5%      5971      2.28    78.9%    87.2*     -10   0.618    643
    2.22        4001    1077     5610       19.2%       82.9%     81.9%      3893      1.78    96.0%    80.7*      -8   0.633    340
    2.16        2491     799     5771       13.8%       78.0%     83.6%      2376      1.47    92.9%    75.9*      -4   0.586    154
    2.11         786     367     5901        6.2%      103.0%    106.4%       666      0.87   129.9%    63.1*      12   0.580     28
   total      281631   49217    80920       60.8%        7.1%      7.2%    280625     21.54     7.8%    99.9*      -7   0.706  39073

And here's the link so you can voice your opinion in a Survey Monkey.
Results from this survey will be reported back to the CCP4bb: http://www.surveymonkey.com/s/YNDKM6G

Thanks for your participation; no, there's no iPad or iPod touch to win, and you also don't have to disclose your email. The survey has only two questions: one is just a click, and in the other you explain your decision.

Thanks, Jürgen

P.S. The low-resolution shell starts at 44 Å - 9.43 Å.
P.P.S. The Table 1 will be revealed once I report back the outcome of this survey.

..............
Jürgen Bosch
Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Office: +1-410-614-4742
Lab: +1-410-614-4894
Fax: +1-410-955-2926
http://lupo.jhsph.edu
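One way to read shell statistics like those posted above programmatically: a small sketch (not part of XDS or XSCALE) that scans the outer shells transcribed from the table and reports where a chosen statistic first drops below a threshold. The thresholds used are the conventional ones under discussion in this thread, not recommendations:

```python
# Outer shells transcribed from the XSCALE table in the post above:
# (high-resolution limit in A, I/sigma, CC1/2 as a fraction)
shells = [
    (2.98, 12.82, 0.992), (2.84, 9.26, 0.983), (2.72, 7.90, 0.979),
    (2.61, 6.41, 0.966), (2.52, 4.91, 0.953), (2.43, 4.08, 0.944),
    (2.36, 3.12, 0.922), (2.29, 2.28, 0.872), (2.22, 1.78, 0.807),
    (2.16, 1.47, 0.759), (2.11, 0.87, 0.631),
]

def first_shell_below(shells, stat, threshold):
    """Resolution limit of the first shell where the chosen statistic
    ('i_over_sigma' or 'cc_half') drops below the threshold; None if it
    never does. Purely illustrative of applying a cutoff criterion."""
    for d, i_sig, cc_half in shells:
        value = i_sig if stat == "i_over_sigma" else cc_half
        if value < threshold:
            return d
    return None

print(first_shell_below(shells, "i_over_sigma", 2.0))  # 2.22
print(first_shell_below(shells, "cc_half", 0.5))       # None
```

For this data set an I/sigma = 2 rule would stop near 2.2 Å, while CC1/2 never falls below 0.5 in the processed range, which is exactly the tension the survey is probing.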
[ccp4bb] Position opening at RCSB PDB/Rutgers University- BIOCHEMICAL INFORMATION ANNOTATION SPECIALIST
The RCSB Protein Data Bank (www.rcsb.org) is a publicly accessible information portal for researchers and students interested in structural biology. At its center is the PDB archive, the sole international repository for the 3-dimensional structure data of biological macromolecules. These structures hold significant promise for the pharmaceutical and biotechnology industries in the search for new drugs and in efforts to understand the mysteries of human disease. The primary mission of the RCSB PDB is to provide accurate, well-annotated data in the most timely and efficient way possible to facilitate new discoveries and scientific advances. The RCSB PDB processes, stores, and disseminates these important data, and develops the software tools needed to assist users in depositing and accessing structural information.

The RCSB Protein Data Bank at Rutgers University in Piscataway, NJ has an opening for a Biochemical Information Annotation Specialist to curate and standardize macromolecular structures for distribution in the PDB archive. Annotation Specialists validate, annotate, and release structural entries in the PDB archive, and communicate daily with members of the deposition community. This is an academic position with state benefits; the salary is commensurate with faculty level.

A background in macromolecular or small-molecule crystallography is a strong advantage. A biological chemistry background (PhD, MS) is required. Experience with Linux computer systems and biological databases is preferred. The successful candidate should be self-motivated, pay close attention to detail, possess strong written and oral communication skills, and meet deadlines. This position offers the opportunity to participate in an exciting project with significant impact on the scientific community.

Please send a resume (PDF preferred) to Dr. Jasmine Young at pdbj...@rcsb.rutgers.edu.

--
Jasmine Young, Ph.D.
RCSB Protein Data Bank
Assistant Research Professor
Lead Biocurator
Center for Integrative Proteomics Research
Rutgers, The State University of New Jersey
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087
Email: jas...@rcsb.rutgers.edu
Phone: (848) 445-0103 ext 4920
Fax: (732) 445-4320
Re: [ccp4bb] Resolution, R factors and data quality
"what if the reviewer has no clue of these things we call structures? I think for those people table 1 might still provide some justification."

Someone who knows little about structures probably won't appreciate the technical details in Table 1 either.
[ccp4bb] 'table 1'
Hi all,

I wonder when the term 'Table 1' entered Newspeak. I heard students use it rather recently, and to me it sounds derogatory, as though they would treat that table as a black box generated by some program and better not look at it.

The data statistics are an attempt to describe the quality of the actual data as the result of an experiment. Whether or not this could be done in a better way is not my point (most crystallographers with some experience will draw their conclusions from the statistics), but people should realise its importance: everything else in an article is merely interpretation, most of all the model itself (which is not data, as many often confuse), and to a large extent even the electron density map.

As I pointed out, this is based on my personal impression, based on which I would like to encourage people not to use the term 'Table 1'. Language has an influence on how we think, so language should be kept from too much degradation.

All the best,
Tim

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
GPG Key ID = A46BEE1A
Re: [ccp4bb] 'table 1'
I am pleased to hear that Table 1 has finally entered the realm of politically incorrect terms. Let me fire another insult at it:

> The data statistics are an attempt to describe the quality of the actual data as a result of an experiment.

Unfortunately, Table 1 does not achieve that objective. The statistics in Table 1 are single global numbers, limited to what we believe were the primary Bragg components of the diffraction pattern at the time of data processing. The diffraction experiment is a much more complex, time-dependent, etc. process. If you truly care about the experiment, demand raw image deposition.

> everything else in an article is merely interpretation, most of all the model itself (which is not data, as many often confuse), and to a large extent even the electron density map.

I take issue with that (not just politically) incorrect and indiscriminate insult towards electron density. Any SAD or similar experimental map from decent model-independent phases firmly attests to the opposite.

> which I would like to encourage people not to use the term 'Table 1'. Language has an influence on how we think, so language should be kept from too much degradation.

To this wonderful statement I have only one response:

      do i = 1, 10
         write (*,*) 'Table 1'
      end do

BR

PS: Never thought Table 1 has so much fictional (and frictional) potential. Just wait for Table 2, refinement statistics.
Re: [ccp4bb] Dependency of theta on n/d in Bragg's law
On 22 August 2013 07:54, James Holton jmhol...@lbl.gov wrote:

> Well, yes, but that's something of an anachronism. Technically, a Miller index of h,k,l can only be a triplet of prime numbers (Miller, W. (1839). A treatise on crystallography. For J. & J.J. Deighton.). This is because Miller was trying to explain crystal facets, and facets don't have harmonics. This might be why Bragg decided to put an n in there. But it seems that fairly rapidly after people started diffracting X-rays off of crystals, the Miller index became generalized to h,k,l as integers, and we never looked back.

Yes, but I think it would be a pity if we lost the IMO important distinction in meaning between Miller indices as defined above as co-prime integers and (for want of a better term) reflection indices as found in an MTZ file. For example, Stout & Jensen make a careful distinction between them (as I recall they call reflection indices something like "general indices": sorry, I don't have my copy of S&J to hand to check their exact terminology).

The confusion that can arise by referring to reflection indices as Miller indices is well illustrated if you try to explain Bragg's equation to a novice, because the d in the equation (i.e. n lambda = 2 d sin(theta)) is the interplanar separation for planes as calculated from their Miller indices, whereas the theta is of course the theta angle as calculated from the corresponding reflection indices. If you say that Miller and reflection indices are the same thing, you have a hard time explaining the equation!

One obvious way out of the dilemma is to drop the n term (so now lambda = 2 d sin(theta)) and then redefine d as d/n, so the new d is calculated from the same reflection indices as theta, and the Miller indices don't enter into it. But then you have to explain to your novice why you know better than a Nobel prizewinner! As you say, Bragg no doubt had a good reason to include the n (i.e. to make the connection between the macroscopic properties of a crystal and its diffraction pattern).

Sorry for coming into this discussion somewhat late!

Cheers

-- Ian
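The distinction Ian draws can be made concrete in a few lines: dividing general reflection indices by their greatest common divisor recovers the co-prime Miller indices of the lattice planes, plus the harmonic order n appearing in n·lambda = 2·d·sin(theta). This is a toy sketch, not tied to any MTZ library:

```python
from math import gcd

def miller_from_reflection(h, k, l):
    """Reduce general reflection indices (h, k, l) to the co-prime Miller
    indices of the underlying lattice planes, plus the order n of the
    harmonic -- the 'n' in n*lambda = 2*d*sin(theta)."""
    n = gcd(gcd(abs(h), abs(k)), abs(l))
    if n == 0:
        raise ValueError("(0, 0, 0) is not a reflection")
    return (h // n, k // n, l // n), n

# The (2 4 6) 'reflection' is the 2nd harmonic of the (1 2 3) Miller planes,
# so lambda = 2*(d_123/2)*sin(theta) and 2*lambda = 2*d_123*sin(theta) agree:
print(miller_from_reflection(2, 4, 6))   # ((1, 2, 3), 2)
```

In other words, dropping Bragg's n and redefining d as d/n is exactly the move from Miller indices to reflection indices.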
[ccp4bb] ALS Call for General User Proposals - DEADLINE SEPTEMBER 4 , 2013
The deadline for Jan/July 2014 Collaborative Crystallography proposals will be September 4, 2013.

Through the Collaborative Crystallography (CC) Program at the Advanced Light Source (ALS), scientists can send protein crystals to Berkeley Center for Structural Biology (BCSB) staff researchers for data collection and analysis. The CC Program can provide a number of benefits to researchers:

* Obtain high-quality data and analysis through collaborating with expert beamline researchers;
* Rapid turnaround on projects; and
* Reduced travel costs.

To apply, please submit a proposal through the ALS General User proposal review process for beamtime allocation. Proposals are reviewed and ranked by the Proposal Study Panel, and beamtime is allocated accordingly. BCSB staff schedule the CC projects on Beamlines 5.0.1 and 5.0.2 to fit into the available resources. Only non-proprietary projects will be accepted. As a condition of participation, BCSB staff researchers who participate in data collection and/or analysis must be appropriately acknowledged, typically by being included as authors on publications and in PDB depositions.

Please consult the website for additional information: http://bcsb.als.lbl.gov/wiki/index.php/Collaborative_Crystallography

How to apply: please follow the instructions for proposal submission at http://www-als.lbl.gov/index.php/user-information/user-guide/58.html, scroll down to "Structural Biology beamlines (includes protein SAXS)", click on "New Proposal", and enter your proposal information.

Regards, Banumathi Sankaran
Re: [ccp4bb] Quick resolution cutoff survey
Dear CCP4bb, almost 12 h have passed since I posted this question to the board. Since some of us get daily or weekly digests, I will hold off on revealing the results. But the replies thus far are really interesting and exciting, and this is without any sarcasm tags. However, we have only sampled 1.5% of the CCP4 community thus far. I will try to compile a nearly complete reply to all questions raised in the comments box, say in about one week. Thanks to all those who participated, Jürgen .. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry & Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://lupo.jhsph.edu On Aug 28, 2013, at 11:32 AM, Bosch, Juergen wrote: Since we keep discussing resolution cutoffs and the benefits of not including all data etc., I thought I would crowd-source your opinion on this particular data set.
processed with XDS, here's the XSCALE.LP output:

 SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
 RESOLUTION     NUMBER OF REFLECTIONS     COMPLETENESS  R-FACTOR  R-FACTOR  COMPARED  I/SIGMA   R-meas  CC(1/2)  Anomal  SigAno   Nano
   LIMIT     OBSERVED  UNIQUE  POSSIBLE      OF DATA    observed  expected                                        Corr

    9.43        5365    1009      1028        98.2%       1.7%      2.1%      5351     68.92     1.8%   100.0*      2    0.698    691
    6.67       10153    1756      1760        99.8%       2.2%      2.4%     10134     58.30     2.4%   100.0*    -12    0.674   1404
    5.44       13114    2217      2223        99.7%       3.4%      3.5%     13097     42.38     3.7%    99.9*    -10    0.716   1845
    4.71       15273    2583      2592        99.7%       3.2%      3.1%     15259     46.24     3.5%    99.9*    -14    0.733   2212
    4.22       17183    2907      2934        99.1%       3.2%      3.2%     17173     45.14     3.6%    99.9*    -16    0.722   2538
    3.85       19010    3183      3217        98.9%       4.4%      4.1%     19000     37.38     4.8%    99.9*    -16    0.717   2794
    3.56       20764    3441      3473        99.1%       5.9%      5.6%     20752     30.36     6.5%    99.9*    -13    0.754   3061
    3.33       22516    3681      3712        99.2%       8.8%      8.5%     22507     22.60     9.7%    99.7*    -11    0.737   3293
    3.14       24735    3963      4001        99.1%      12.4%     13.0%     24725     16.77    13.5%    99.5*     -8    0.696   3565
    2.98       25931    4127      4161        99.2%      17.2%     18.1%     25924     12.82    18.7%    99.2*     -6    0.710   3751
    2.84       26521    4291      4386        97.8%      25.4%     26.9%     26495      9.26    27.6%    98.3*     -4    0.683   3809
    2.72       20357    3495      4592        76.1%      27.6%     29.3%     20277      7.90    30.2%    97.9*      0    0.706   2826
    2.61       15917    2860      4768        60.0%      33.7%     35.0%     15839      6.41    37.0%    96.6*     -2    0.697   2171
    2.52       12949    2394      4944        48.4%      42.5%     45.1%     12877      4.91    46.8%    95.3*      0    0.694   1692
    2.43       10310    1993      5097        39.1%      47.8%     50.7%     10230      4.08    53.0%    94.4*     -2    0.670   1295
    2.36        8180    1693      5309        31.9%      56.4%     60.1%      8079      3.12    63.0%    92.2*     -2    0.671    961
    2.29        6075    1381      5441        25.4%      69.9%     72.5%      5971      2.28    78.9%    87.2*    -10    0.618    643
    2.22        4001    1077      5610        19.2%      82.9%     81.9%      3893      1.78    96.0%    80.7*     -8    0.633    340
    2.16        2491     799      5771        13.8%      78.0%     83.6%      2376      1.47    92.9%    75.9*     -4    0.586    154
    2.11         786     367      5901         6.2%     103.0%    106.4%       666      0.87   129.9%    63.1*     12    0.580     28
   total      281631   49217     80920        60.8%       7.1%      7.2%    280625     21.54     7.8%    99.9*     -7    0.706  39073

And here's the link so you can voice your opinion in a Survey Monkey poll. Results from this survey will be reported back to the CCP4bb.
http://www.surveymonkey.com/s/YNDKM6G Thanks for your participation, and no, there's no iPad or iPod touch to win, and you also don't have to disclose your email. The survey has only two questions: one is just a click, the other asks you to explain your decision. Thanks, Jürgen P.S. The low-resolution shell runs from 44 Å to 9.43 Å. P.P.S. Table 1 will be revealed once I report back the outcome of this survey.
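Since the survey asks where you would cut this data set, it may help to recall how a significance test on CC(1/2) works. The sketch below is my own rough illustration of the Karplus & Diederichs style Student's t-test (it is not the exact test XDS applies, and the critical value of ~2 and the shell values, transcribed from the outer shells of the XSCALE.LP output in this post, are assumptions for illustration):

```python
import math

# (high-resolution limit in Angstrom, CC(1/2) in %, number of unique reflections),
# transcribed from a few outer shells of the XSCALE.LP output above
shells = [
    (2.29, 87.2, 1381),
    (2.22, 80.7, 1077),
    (2.16, 75.9, 799),
    (2.11, 63.1, 367),
]

def cc_significant(cc_percent, n, t_crit=2.0):
    """Student's-t style test: is CC(1/2) significantly larger than zero?
    t = r * sqrt(n - 2) / sqrt(1 - r^2), compared against a critical value
    (~2 corresponds to p ~ 0.05 for large n). A rough sketch only."""
    r = cc_percent / 100.0
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return t > t_crit

for limit, cc, n in shells:
    print(f"{limit:5.2f} A  CC(1/2) = {cc:5.1f}%  significantly > 0: {cc_significant(cc, n)}")
```

With this many reflections per shell, even the 63.1% outer shell is statistically far from zero, which is exactly why "CC(1/2) still significant" and "data still improve the model" are separate questions.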
Re: [ccp4bb] Resolution, R factors and data quality
Jim, this is coming from someone who only got enlightened on resolution cut-offs a few weeks ago.

"I am asked often: What value of CC1/2 should I cut my resolution at?"

The KD paper mentions that the CC(1/2) criterion loses statistical significance at around 9 according to a Student's t-test. I doubt that this can be a generally valid guideline for a resolution cut-off: the structures I am working on right now were cut off anywhere between ~20 and ~80 CC(1/2). You probably do not want to make the same mistake again that we all made before when cutting resolution based on Rmerge/Rmeas, do you?

"What should I tell my students? I've got a course coming up and I am sure they will ask me again."

This is actually the more valuable insight I got from the KD paper: you don't use CC(1/2) as an absolute indicator but rather as a suggestion. The resolution limit is determined by the refinement, not by the data processing. I think I will handle my data in future as follows: bins with CC(1/2) less than 9 are initially excluded. The structure is then refined against all reflections in the file, and only those bins that add information to the map/structure are kept in the final rounds. In most cases this will probably be more than CC(1/2) = 25. If the last shell (CC ~9) still adds information to the model, process the images again, e.g. until CC(1/2) drops to 0, and see if some more useful information is in there. You could also go ahead and use CC(1/2) = 0 as the initial cut-off, but I think that will increase computation time rather than help your structure in most cases.

So yes, I would feel comfortable giving true resolution limits based on the refinement of the model, and not based on any number derived from data processing. In the end, you can always say "I tried it and this was the highest resolution I could model" rather than "I cut at numerical value X of this parameter because everybody else does so."
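The keep-the-bins-that-help workflow described above amounts to paired-refinement bookkeeping. Here is a minimal sketch of that decision logic; `refine` is a hypothetical stand-in for a script that runs your refinement program and returns R-free evaluated against a common lower-resolution set so the numbers are comparable (this illustrates the bookkeeping only, not Karplus & Diederichs' exact protocol):

```python
def paired_refinement(cutoffs, refine):
    """cutoffs: candidate high-resolution limits in Angstrom, ordered from
    lowest to highest resolution, e.g. [2.61, 2.52, 2.43, 2.36].
    refine(d): hypothetical callback returning the R-free of the model
    refined with data extending to limit d, evaluated on a common set.
    Keep extending the limit while the extra shell improves R-free."""
    best = cutoffs[0]
    best_rfree = refine(best)
    for d in cutoffs[1:]:
        rfree = refine(d)
        if rfree < best_rfree:   # the extra shell improved the model: keep it
            best, best_rfree = d, rfree
        else:                    # no gain from the extra shell: stop extending
            break
    return best

# Usage with a toy stand-in for refine(): R-free improves to 2.43 A, then worsens.
toy_rfree = {2.61: 0.250, 2.52: 0.240, 2.43: 0.235, 2.36: 0.240}
chosen = paired_refinement([2.61, 2.52, 2.43, 2.36], toy_rfree.get)
print(chosen)  # -> 2.43
```

The point of the sketch is that the cut-off falls out of the refinement itself, exactly as argued above, rather than out of any single data-processing statistic.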