On Tue, 8 Nov 2022 15:25:03 -0800, James Holton <jmhol...@lbl.gov> wrote:
>Thank you Ian for your quick response!
>
>I suppose what I'm really trying to do is put a p-value on the
>"geometry" of a given PDB file. As in: what are the odds the deviations
>from ideality of this model are due to chance?
>
>I am leaning toward the need to take all the deviations in the structure
>together as a set, but, as Joao just noted, it just "feels wrong"
>to tolerate a 3-sigma deviate. Even more wrong to tolerate 4 sigma, 5
>sigma. And 6 sigma deviates are really difficult to swallow unless you
>have trillions of data points.
>
>To put it down in equations, is the p-value of a structure with 1000
>bonds in it with one 3-sigma deviate given by:
>
>a) p = 1-erf(3/sqrt(2))
>or
>b) p = 1-erf(3/sqrt(2))**1000
>or
>c) something else?
>

p = 1-erf(3/sqrt(2))**1000 (= 0.933, thus quite likely to happen) is the p-value of a structure with 1000 bonds in it, with one or more deviations > 3-sigma. (Note that the wording after the comma differs from how you expressed it: "one or more" rather than "one".)

But keep in mind: one person's outlier may be another person's Nobel Prize!

Best,
Kay

P.S. I used 1-erf(3/sqrt(2))^1000 at wolframalpha.com for the numerical calculation.
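As a cross-check of the number above, here is a minimal Python sketch of the same calculation. It assumes the bond deviations are independent and normally distributed, which is what the erf expression implies. The function names are hypothetical, chosen only for illustration. The "exactly one" variant is not from the message; it is included only to show how the two wordings differ.

```python
import math


def p_single(n_sigma: float) -> float:
    """P(|z| > n_sigma) for one normally distributed deviate: option (a)."""
    return 1.0 - math.erf(n_sigma / math.sqrt(2))


def p_one_or_more(n_sigma: float, n_obs: int) -> float:
    """P(at least one of n_obs independent deviates exceeds n_sigma): option (b)."""
    p_within = math.erf(n_sigma / math.sqrt(2))  # P(|z| <= n_sigma) for one deviate
    return 1.0 - p_within ** n_obs


def p_exactly_one(n_sigma: float, n_obs: int) -> float:
    """P(exactly one of n_obs deviates exceeds n_sigma), from the binomial distribution.

    Illustrative assumption: this is not in the original message.
    """
    q = p_single(n_sigma)
    return n_obs * q * (1.0 - q) ** (n_obs - 1)


if __name__ == "__main__":
    print(f"(a) single deviate > 3 sigma:       {p_single(3):.4f}")
    print(f"(b) one or more among 1000 bonds:   {p_one_or_more(3, 1000):.3f}")
    print(f"    exactly one among 1000 bonds:   {p_exactly_one(3, 1000):.3f}")
```

For much larger deviations (6 sigma, say), `1 - erf(x)` loses precision in floating point; `math.erfc(x)` computes the same tail probability directly and would be the safer choice there.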