Dear Andrew, what I meant is "Everything else being equal, bad data result in a larger Rfree-Rwork gap than good data". Of course the gap also depends on the resolution, and other factors.
best wishes,
Kay

On Mon, 30 Aug 2021 16:59:12 +0100, Andrew Leslie - MRC LMB <and...@mrc-lmb.cam.ac.uk> wrote:

>Dear Kay and Jon,
>
>I cannot remember Ian Tickle's original posting on this (assuming that it was
>made to the bulletin board), but surely the resolution of the data is also a
>very important factor in the danger of over-fitting. The lower the resolution,
>the worse the ratio of experimental data to refined parameters becomes, and the
>more likely it is to obtain an overfitted model, regardless of how accurate
>that data might be. Perhaps this is why Kay said “generally I agree that the
>accuracy of the data is inversely related to the danger of overfitting”, or
>did you have something else in mind, Kay?
>
>Cheers,
>
>Andrew
>
>> On 30 Aug 2021, at 13:59, Kay Diederichs <kay.diederi...@uni-konstanz.de>
>> wrote:
>>
>> Hi Jon,
>>
>> generally I agree that the accuracy of the data is inversely related to the
>> danger of overfitting, and that selection in shells is not necessary.
>> But if twinning is suspected, as here, and/or a choice between high and low
>> symmetry spacegroup has to be made, one has to make sure that
>> potentially symmetry-related reflections are _either_ labelled as test _or_
>> as work; a mixture will artificially down-bias Rfree.
>>
>> Selecting the test set in the highest possible symmetry (which is what
>> Phenix does) is a good solution. This test set should be symmetry-expanded
>> when trying the low-symmetry spacegroup (if one wants to compare R values).
>> The latter is not necessary when working with thin shells - but that has
>> other disadvantages.
>>
>> best wishes,
>> Kay
>>
>> On Sun, 29 Aug 2021 14:32:34 +0000, Jon Cooper <jon.b.coo...@protonmail.com>
>> wrote:
>>
>>> Hello Engin, we discussed this a year or two ago in relation to NCS when
>>> Ian Tickle convinced me that since overfitting is due to errors in the data,
>>> there is no reason to expect these errors to be correlated by NCS and
>>> picking the R-free set uniformly or in shells doesn't matter. No doubt my
>>> incompetence, but I can't see why twinning would be different.
>>>
>>> Cheers, Jon.C.
>>>
>>> Sent from ProtonMail mobile
>>>
>>> -------- Original Message --------
>>> On 29 Aug 2021, 05:32, Engin Özkan wrote:
>>>
>>>> Hi,
>>>>
>>>> I believe this is taken care of automatically if you use phenix to pick
>>>> your free reflections.
>>>>
>>>> From http://www.phenix-online.org/documentation/tutorials/twinning.html
>>>>
>>>> "When a test set is designed, care must be taken that free and work
>>>> reflections are not related by a twin law. The R-free set assignment in
>>>> phenix.refine and phenix.reflection_file_converter is designed with this
>>>> in mind: the free reflections are chosen to obey the highest possible
>>>> symmetry of the lattice."
>>>>
>>>> I believe this applies to all datasets, just in case there may be twinning.
>>>>
>>>> Engin
>>>>
>> ...
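
For readers who want to see the idea discussed in this thread in concrete form (symmetry- or twin-related reflections must all land in the test set or all in the work set), here is a minimal Python sketch. It is not the Phenix implementation. The function names (generate_group, representative, assign_free_flags) are hypothetical, the point group 422 is only an example of a "highest lattice symmetry" for a tetragonal lattice, and applying the operators to (h, k, l) as column vectors is a simplifying assumption. The flag is chosen once per symmetry-unique reflection and then copied to all of its mates, including the Friedel mate.

```python
import random

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
# Example generators of point group 422 (assumed tetragonal lattice):
FOURFOLD_C = ((0, -1, 0), (1, 0, 0), (0, 0, 1))   # (h,k,l) -> (-k,h,l)
TWOFOLD_A = ((1, 0, 0), (0, -1, 0), (0, 0, -1))   # (h,k,l) -> (h,-k,-l)


def matmul(a, b):
    """Product of two 3x3 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def generate_group(generators):
    """Close a set of generator matrices under multiplication."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        new = []
        for g in frontier:
            for gen in generators:
                p = matmul(gen, g)
                if p not in group:
                    group.add(p)
                    new.append(p)
        frontier = new
    return sorted(group)


def apply_op(op, hkl):
    """Apply a 3x3 operator to a Miller index (column-vector convention)."""
    return tuple(sum(op[i][j] * hkl[j] for j in range(3)) for i in range(3))


def representative(hkl, ops):
    """Pick one canonical member of the orbit of hkl, Friedel mates included."""
    images = [apply_op(op, hkl) for op in ops]
    images += [tuple(-x for x in img) for img in images]
    return max(images)


def assign_free_flags(hkls, ops, fraction=0.05, seed=0):
    """Return {hkl: True if free}, identical for all symmetry mates."""
    reps = sorted({representative(h, ops) for h in hkls})
    rng = random.Random(seed)
    n_free = max(1, round(fraction * len(reps)))
    free_reps = set(rng.sample(reps, n_free))
    return {h: representative(h, ops) in free_reps for h in hkls}
```

With these operators, the index (1, 2, 3) and its mate under the operator (k, h, -l), which is contained in 422, always receive the same flag, so refinement in a lower-symmetry space group with that twin law would not mix test and work reflections across the twin operation.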

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing
list hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/