Re: [ccp4bb] A challenging MR problem

2022-11-09 Thread Eleanor Dodson
I would start with Coot. As Bostjan suggests:
1) Choose the whole chain as a fragment.
2) Fit that over chain 2 and change its label to that of chain 2. (Remember
to make this a copy of your fragment.)
3) Check whether there is any density for the missing bit.

Then repeat till you have the whole thing.
And then merge all the pieces.
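
The superpose-and-copy idea above can be sketched numerically. This is not Coot's scripting API, just a minimal NumPy illustration of a least-squares (Kabsch) fit: the core atoms of the complete chain are fitted onto the core of an incomplete chain, and the same transformation is then applied to the extra domain. All function names, variable names and coordinates below are hypothetical and synthetic.

```python
import numpy as np

def kabsch(mobile, target):
    """Return rotation R and translation t so that R @ p + t best fits target (least squares)."""
    mob_c = mobile.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # 3x3 covariance matrix between the centred coordinate sets
    H = (mobile - mob_c).T @ (target - tgt_c)
    U, S, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ mob_c
    return R, t

def place_domain(core_complete, core_partial, domain_complete):
    """Fit the complete chain's core onto a partial chain's core, then move the extra domain along."""
    R, t = kabsch(core_complete, core_partial)
    return domain_complete @ R.T + t

# Synthetic demonstration: chain B is chain A rotated 40 degrees about z and shifted.
rng = np.random.default_rng(0)
core_a = rng.normal(size=(20, 3)) * 10.0            # core atoms of the complete chain
domain_a = rng.normal(size=(8, 3)) * 10.0 + 30.0    # the domain modelled only in chain A
angle = np.radians(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0,            0.0,           1.0]])
shift = np.array([5.0, -12.0, 7.5])
core_b = core_a @ rot_z.T + shift                   # core atoms of an incomplete chain
domain_b_guess = place_domain(core_a, core_b, domain_a)
```

The placed copy is only a starting guess; whether the domain is really there still has to be judged against the density, as step 3 says.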

Of course your missing domain may be disordered. The twinning is suspicious.

And are you sure you have the right space group? 16 copies makes me suspect
there is more symmetry than you are using.

On Wed, 9 Nov 2022 at 23:12, Bostjan Kobe  wrote:

> Superimposing that molecule on all the others?
>
>
>
> Bostjan



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] outliers

2022-11-09 Thread Karplus, Andy
Hi Pavel and all,

I’m not sure if this is what you were thinking of, but we published in 2016 a 
rather dramatic example showing how a series of reliably determined extreme 
phi,psi outliers document how the strain-energy associated with adopting the 
phi,psi angles is distributed between an extreme close approach of atoms and a 
coordinated set of bond angle changes involving their increasing distortion 
from their “standard” values in ways that make sense. The paper is 
here.

While that is a particularly extreme example, it is actually just one example 
of the systematic variations in observed bond angles from their classical 
standard “ideal” values that are occurring throughout phi,psi space. Building 
on earlier work, our main paper documenting the conformation dependence of 
“ideal” geometry using a fairly large set of ultra-high resolution structures 
is here.  
Figure 6 in that paper provides some specific examples of how the bond angle 
variations seen near the edges of “classically allowed” regions make sense in 
terms of the bond angles incurring strain energy as part of relieving what 
would have been a much worse collisional strain energy.

In documenting those trends, we sought to shift our community away from 
thinking that there is a single set of ideal geometry values and instead 
recognize that the expected (or “ideal”) geometry values are strongly 
conformation dependent. While many users may not be aware of it, a restraint 
library based on that concept is now the default library in Phenix (see 
here). We’ve 
similarly shown (also building on earlier work) that reliable outliers also 
exist with regard to peptide bond planarity (as measured by the omega torsion 
angle), and that the expected omega torsion angle also has 
conformation-dependent trends such that its expected value deviates by up 
to 7 or 8 degrees from planarity even in phi,psi regions that are reasonably 
well populated (see here).

Ultra-high resolution protein structures can achieve a level of precision and 
accuracy that is tremendously valuable for revealing deviations from standard 
geometry that are quite real and helpful for our understanding of fundamental 
principles.

HTH, Andy
[Black Lives Matter]

Dr. P. Andrew Karplus (he, him, his)
Distinguished Professor of Biochemistry and Biophysics
NIGMS GCE4All Research Center Director of 
Communications
2133 ALS Building
Oregon State University
Corvallis, OR 97331
ph. 541-737-3200
andy.karp...@oregonstate.edu

“Revealing how life works for the benefit of all!”
http://biochem.oregonstate.edu/
https://www.facebook.com/OSUBB



From: CCP4 bulletin board  on behalf of Pavel Afonine 

Date: Wednesday, November 9, 2022 at 6:20 PM
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] outliers

> This is best illustrated by Ramachandran "outliers",
> which are perfectly supported by electron density.

Indeed, and 3NOQ is one of my favorite examples of that: an outlier doesn't 
necessarily equate to wrong! However, I think torsion angles (e.g., phi/psi) are 
much more flexible than covalent angles/bonds, and so they can possibly afford 
larger deviations compared to covalent bonds/angles.

> The strain caused by any one of them will distribute itself
> over all neighbouring bond lengths and angles as well as
> over the torsion angles.

I wonder if there is a documented study that actually shows this happening? 
Clearly this must take place one way or another, but I wonder if anyone 
"measured" the effect and documented it.

Pavel






Re: [ccp4bb] outliers

2022-11-09 Thread Pavel Afonine
>
> This is best illustrated by Ramachandran "outliers",
> which are perfectly supported by electron density.
>

Indeed, and 3NOQ is one of my favorite examples of that: an outlier doesn't
necessarily equate to wrong! However, I think torsion angles (e.g., phi/psi)
are much more flexible than covalent angles/bonds, and so they can possibly
afford larger deviations compared to covalent bonds/angles.

> The strain caused by any one of them will distribute itself
> over all neighbouring bond lengths and angles as well as
> over the torsion angles.
>

I wonder if there is a documented study that actually shows this happening?
Clearly this must take place one way or another, but I wonder if anyone
"measured" the effect and documented it.

Pavel





Re: [ccp4bb] A challenging MR problem

2022-11-09 Thread Bostjan Kobe
Superimposing that molecule on all the others?

Bostjan

--
Bostjan Kobe FAA
Australian Laureate Fellow
Professor of Structural Biology
School of Chemistry and Molecular Biosciences
and Institute for Molecular Bioscience (Division of Chemistry and Structural 
Biology) and Australian Infectious Diseases Research Centre
Cooper Road
University of Queensland
Brisbane, Queensland 4072
Australia
Phone: +61 7 3365 2132
Fax: +61 7 3365 4699
E-mail: b.k...@uq.edu.au
URL: http://www.scmb.uq.edu.au/staff/bostjan-kobe
Office: Building 76 Room 329
Notice: If you receive this e-mail by mistake, please notify me, and do not 
make any use of its contents. I do not waive any privilege, confidentiality or 
copyright associated with it. Unless stated otherwise, this e-mail represents 
only the views of the Sender and not the views of The University of Queensland.



From: CCP4 bulletin board  on behalf of Medhanjali 
DasGupta 
Reply to: Medhanjali DasGupta 
Date: Thursday, 10 November 2022 at 9:05 am
To: "CCP4BB@JISCMAIL.AC.UK" 
Subject: Re: [ccp4bb] A challenging MR problem

The data resolution is 2 Å.
I have 16 chains in my model, of which only one has the "missing" domain 
modeled. Is there a way to do MR to predict where the missing domains will go 
in the rest of the chains, based on my solved structure?

Thanks for all the helpful suggestions!!

M



Re: [ccp4bb] A challenging MR problem

2022-11-09 Thread Medhanjali DasGupta
The data resolution is 2 Å.
I have 16 chains in my model, of which only one has the "missing" domain
modeled. Is there a way to do MR to predict where the missing domains will go
in the rest of the chains, based on my solved structure?

Thanks for all the helpful suggestions!!

M

On Wed, Nov 9, 2022 at 3:11 PM Eleanor Dodson <
176a9d5ebad7-dmarc-requ...@jiscmail.ac.uk> wrote:

> Well you could just try the buccaneer pipeline. It would use the phases
> from your solved domain and try to fit the missing sequence. What are your
> twin fractions? And what is the resolution?
> Eleanor


-- 
Thanks,
Medhanjali Dasgupta
Postdoctoral Research Scientist
Lawrence Berkeley National Laboratory





Re: [ccp4bb] A challenging MR problem

2022-11-09 Thread Eleanor Dodson
Well you could just try the buccaneer pipeline. It would use the phases
from your solved domain and try to fit the missing sequence. What are your
twin fractions? And what is the resolution?
Eleanor

On Wed, 9 Nov 2022 at 21:06, Tim Gruene  wrote:

> Dear Medhanjali DasGupta,
> unless the resolution is really poor, the quickest try would be shelxe,
> starting from what you already have. It might work at, say, 2.8A
> resolution or better...
>
> Best,
> Tim


Re: [ccp4bb] A challenging MR problem

2022-11-09 Thread Tim Gruene
Dear Medhanjali DasGupta,
unless the resolution is really poor, the quickest try would be shelxe,
starting from what you already have. It might work at, say, 2.8A
resolution or better...

Best,
Tim

On Wed, 9 Nov 2022 14:34:28 -0600 Medhanjali
DasGupta  wrote:

> Hello!
> My protein structure has a missing domain and I am trying to figure
> out the best way to model this missing domain using the solved
> (modeled) fixed core domain? My data is also imperfectly twinned,
> with 4 twin fractions according to refmac5.
> 
>  Any help/ idea is appreciated!
> 
> 
> 



-- 
--
Tim Gruene
Head of the Centre for X-ray Structure Analysis
Faculty of Chemistry
University of Vienna

Phone: +43-1-4277-70202

GPG Key ID = A46BEE1A





[ccp4bb] A challenging MR problem

2022-11-09 Thread Medhanjali DasGupta
Hello!
My protein structure has a missing domain and I am trying to figure out the
best way to model this missing domain using the solved (modeled) fixed core
domain. My data is also imperfectly twinned, with 4 twin fractions
according to refmac5.

 Any help/ idea is appreciated!



-- 
Thanks,
Medhanjali Dasgupta





Re: [ccp4bb] Problem in generating restraint file for 5.2kDa ligand

2022-11-09 Thread Paul Emsley


On 09/11/2022 13:59, Deepak Deepak wrote:

Dear CCP4 users,

I am working towards solving a protein-ligand complex structure. The 
ligand is 5.2 kDa (495 atoms) and made of 3 distinctive repetitive 
monomers. [...]

I will happily provide more information if I am missing something here.



You're missing practically everything.

From a software developer's point of view, "It doesn't work" is worse than 
no bug report at all.


I wrote this for Coot in particular, but much also applies to CCP4 
software (where for "terminal" read "log")


https://pemsley.github.io/coot/blog/2020/09/21/how-to-make-a-bug-report.html


Paul.






[ccp4bb] Problem in generating restraint file for 5.2kDa ligand

2022-11-09 Thread Deepak Deepak
Dear CCP4 users,

I am working towards solving a protein-ligand complex structure. The ligand
is 5.2 kDa (495 atoms) and made of 3 distinctive repetitive monomers. I
have a PDB model of the ligand, as well as SMILES and mol2 files.
I tried generating restraint files for this ligand using JLigand, PRODRG,
AceDRG, eLBOW (Phenix), and the Grade server (Global Phasing), but all failed to
generate the restraints. Perhaps the molecule is too big for these programs
to handle. I checked the archived messages
on CCP4BB but could not find anything helpful.

Could you advise me on how to prepare the restraints for such a ligand to
proceed with the refinement?
I will happily provide more information if I am missing something here.

Kind regards,
Deepak





[ccp4bb] CCP4 Study Weekend 2023 - Early bird registration ends 13th November

2022-11-09 Thread Warren, Anna (DLSLtd,RAL,LSCI)
This year the CCP4 Study Weekend will be held as a hybrid event from the 4th - 
6th January at the East Midlands Conference Centre, Nottingham, UK, with an 
exciting programme confirmed below. Early bird registration is open until 13th 
November, more details of which can be found here: 
https://ccp4sw.org/
With an exciting list of confirmed speakers, the event looks to be a great 
conference, with a topic based around "Data: subtle details to big insights". 
With a break from tradition, the Study Weekend will start with a keynote 
lecture and discussion panel talking and engaging with the audience about data 
sources and the future of structural biology. The subsequent 2 days will follow 
the usual format with 3 sessions each day.  Old favourites like "What's New in 
CCP4?" and "Lunchtime Bytes" will also appear (see 
programme for more details).
A poster session, with prizes, will be taking place this time on the evening of 
the first day allowing students to present their research to the wider 
crystallographic community. Student bursaries 
are available to cover the cost of registration and one night's accommodation. 
Student registration needs to be received before 13th November to take 
advantage of this.
In keeping with previous CCP4 meetings, the lectures will focus on the 
presentation and discussion of advanced methods and techniques developed and 
used by leaders in the field, whilst having a strong teaching element aimed at 
students and early researchers.

Sessions and confirmed speakers are as follows:

Day 1
Diamond MX User Meeting
Session 1: Integrative structural biology
Linking structural biology data sources to tackle Alzheimer's - Montserrat 
Soler-Lopez
Where do we see structural biology heading in the future? Discussion panel
Gerard Bricogne, Sameer Velankar, Loes Kroon-Batenburg, Jim Naismith, Kristina 
Djinovic-Carugo, Annalisa Pastore, Dave Stuart

Day 2
Session 2: Fundamentals of crystallographic data
Aimed at covering the fundamentals of what data is and what we do with it.
Graeme Winter, Greta M. Assmann, Kevin Dalton, Richard Gildea
Session 3: Fundamentals of samples and the experiment
Ensuring we get the best from our data by optimising sample preparation and the 
experiment and looking at complementary techniques.
Ralf Flaig, Kathryn Shelley, Maria Garcia, Philippe Carpentier
Session 4: Choosing your source
Understanding the source and experiment types to better answer your biological 
question.
Meytal Landau, Antoine Royant, Arnaud Basle, Hongyi Xu

Day 3
Session 5: Big data
Learning the complexities of handling larger datasets for a variety of 
experiment types.
Derek Mendez, Marjan Hadian-Jazi, Briony Yorke, Kyle Morris
Session 6: Between the Bragg spots
Understanding your data and trying to get the most out of it.
Gloria Borgstahl, Andrey Lebedev, Steve Meisburger
Session 7: A new era in Structural Biology
Exploring possible future directions for structural biology.
Dan Rigden, Sylvain Engilberge, Isabel Uson, Anastassis Perrakis

We look forward to seeing you all there and having some insightful discussions.
Christoph Mueller-Dieckmann (ESRF, France)
Anna Warren (Diamond Light Source)
David Waterman (UKRI-STFC CCP4)
Scientific Organisers







Re: [ccp4bb] outliers

2022-11-09 Thread Manfred S. Weiss

Dear all,

what I am missing in this whole thread is the question
of what the "true" value of a given bond distance is. So
far, everybody seems to assume that the "ideal" value
is equivalent to the "true" value, and that  deviations
from the ideal values must therefore be outliers.

I challenge this notion. Most protein structures are
strained in some sense, which is not surprising given
that the degrees of freedom to fold a linear chain into
a tertiary structure are limited. This strain will inevitably
lead to deviations of geometric parameters from their
ideal values.

This is best illustrated by Ramachandran "outliers",
which are perfectly supported by electron density.
The strain caused by any one of them will distribute itself
over all neighbouring bond lengths and angles as well as
over the torsion angles.

In this context, the current definition of what an outlier
is does not really make sense to me.

Best, Manfred

On 09.11.2022 at 09:17, Dale Tronrud wrote:

And now it is time for an "old man story".  Back in the early 1990s
the Brookhaven PDB started to worry about "validating" the models
being deposited.  One of the things they implemented was to add to the
header of the PDB a complete list of all bond lengths and angles that
deviated from the library value by more than 3 sigma.

   In Brian Matthews' lab a student solved the structure of
beta-galactosidase which is composed of over a thousand residues and
the crystal has 16-fold ncs.  The model had over 130,000 atoms, a
record for the time.  The PDB declared that this was one of the worst
models they had ever seen because it had hundreds of geometry
restraints violated by greater than 3 sigma.  The list in their header
went on and on.

   Our response, of course, was that this model had over 130,000 bonds
and 180,000 angles and, if you assume a Normal distribution, the number
of 3 sigma deviants was exactly the number expected, which is what
the geometry rmsds were saying.

Dale E. Tronrud

On 11/8/2022 3:25 PM, James Holton wrote:

Thank you Ian for your quick response!

I suppose what I'm really trying to do is put a p-value on the
"geometry" of a given PDB file.  As in: what are the odds the
deviations from ideality of this model are due to chance?

I am leaning toward the need to take all the deviations in the
structure together as a set, but, as Joao just noted, it just
"feels wrong" to tolerate a 3-sigma deviate.  Even more wrong to
tolerate 4 sigma, 5 sigma. And 6 sigma deviates are really difficult
to swallow unless you have trillions of data points.

To put it down in equations, is the p-value of a structure with 1000
bonds in it with one 3-sigma deviate given by:

a)  p = 1-erf(3/sqrt(2))
or
b)  p = 1-erf(3/sqrt(2))**1000
or
c) something else?
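
For reference, the two expressions (a) and (b) can be evaluated directly. The sketch below uses only the Python standard library and assumes independent, Gaussian-distributed deviations; the function names are made up for illustration. Expression (a) is the two-sided chance that any one given bond deviates by at least 3 sigma, while (b), read with the usual operator precedence, is the chance that at least one of 1000 such bonds does.

```python
from math import erf, sqrt

def two_sided_tail(z):
    """Probability that a single Gaussian deviate exceeds z sigma in absolute value."""
    return 1.0 - erf(z / sqrt(2.0))

def prob_at_least_one(z, n):
    """Probability that at least one of n independent Gaussian deviates exceeds z sigma."""
    return 1.0 - erf(z / sqrt(2.0)) ** n

p_single = two_sided_tail(3.0)              # expression (a), roughly 0.0027
p_any_of_1000 = prob_at_least_one(3.0, 1000)  # expression (b), roughly 0.93
print(f"(a) = {p_single:.4f}   (b) = {p_any_of_1000:.3f}")
```

Under these assumptions a single 3-sigma deviate among 1000 bonds is the expected outcome rather than a surprise; whether either number is the right "p-value of a structure" is the open question of the thread.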



On 11/8/2022 2:56 PM, Ian Tickle wrote:

Hi James

I don't think it's meaningful to ask whether the deviation of a
single bond length (or anything else that's single) from its
expected value is significant, since as you say there's always some
finite probability that it occurred purely by chance. Statistics can
only meaningfully be applied to samples of a 'reasonable' size.  I
know there are statistics designed for small samples but not for
samples of size 1 !  It's more meaningful to talk about
distributions.  For example if 1% of the sample contained deviations
> 3 sigma when you expected there to be only 0.3 %, that is probably
significant (but it still has a finite probability of occurring by
chance), as would be finding no deviations > 3 sigma (for a
reasonably large sample to avoid sampling errors).
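
The distribution-level comparison described above can be made concrete with an exact binomial calculation. This is a sketch only: the sample size of 1000 bonds is an assumption chosen for illustration (the message gives none), deviations are taken to be independent and Gaussian, and the helper name is hypothetical.

```python
from math import comb, erf, sqrt

def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    below = sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - below

p_3sigma = 1.0 - erf(3.0 / sqrt(2.0))   # expected fraction beyond 3 sigma, about 0.27%
n_bonds = 1000                          # assumed sample size
observed = 10                           # 1% of the sample beyond 3 sigma

# Chance of seeing at least 1% outliers when only about 0.3% are expected
p_excess = binom_tail(n_bonds, p_3sigma, observed)
# Chance of seeing no 3-sigma deviations at all in the same sample
p_none = (1.0 - p_3sigma) ** n_bonds

print(f"P(>= {observed} outliers) = {p_excess:.2e}")
print(f"P(no outliers)       = {p_none:.3f}")
```

With these assumed numbers, an excess of outliers is far less likely by chance than their complete absence, and the absence only becomes striking for samples considerably larger than 1000.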

Cheers

-- Ian


On Tue, Nov 8, 2022, 22:22 James Holton  wrote:

OK, so let's suppose there is this bond in your structure that is
stretched a bit.  Is that for real? Or just a random fluke?  Let's say,
for example, it's a CA-CB bond that is supposed to be 1.529 A long, but in
your model it's 1.579 A.  This is 0.05 A too long. Doesn't seem like
much, right? But the "sigma" given to such a bond in our geometry
libraries is 0.016 A.  These sigmas are typically derived from a
database of observed bonds of similar type found in highly accurate
structures, like small molecules. So, that makes this a 3-sigma outlier.
Assuming the distribution of deviations is Gaussian, that's a pretty
unlikely thing to happen. You expect 3-sigma deviates to appear less
than 0.3% of the time.  So, is that significant?
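
The arithmetic in this paragraph can be checked in a few lines. The sketch below uses the bond length, library value and sigma quoted in the message, and assumes a Gaussian distribution of deviations; the variable names are illustrative only.

```python
from math import erfc, sqrt

ideal = 1.529      # library CA-CB bond length, in Angstrom
observed = 1.579   # bond length in the model, in Angstrom
sigma = 0.016      # library standard deviation, in Angstrom

# Deviation expressed in units of sigma (about 3.1)
z = (observed - ideal) / sigma
# Two-sided probability of a Gaussian deviate at least this large
p_two_sided = erfc(abs(z) / sqrt(2.0))

print(f"z = {z:.3f} sigma, two-sided tail probability = {p_two_sided:.4f}")
```

The deviation comes out slightly above 3 sigma, so its tail probability is somewhat below the 0.27% that applies at exactly 3 sigma.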

But, then again, there are lots of other bonds in the structure. Let's
say there are 1000. With that many samplings from a Gaussian
distribution you generally expect to see a 3-sigma deviate at least
once.  That is, do an "experiment" where you pick 1000 Gaussian-random
numbers from a distribution with a standard deviation of 1.0. Then, look
for the maximum over all 1000 trials. Is that one > 3 sigma? It probably
is. If you do this "experiment" millions of times it turns out seeing at
least 

Re: [ccp4bb] outliers

2022-11-09 Thread Dale Tronrud
   And now it is time for an "old man story".  Back in the early 1990s 
the Brookhaven PDB started to worry about "validating" the models being 
deposited.  One of the things they implemented was to add to the header 
of the PDB a complete list of all bond lengths and angles that deviated 
from the library value by more than 3 sigma.


   In Brian Matthews' lab a student solved the structure of 
beta-galactosidase which is composed of over a thousand residues and the 
crystal has 16-fold ncs.  The model had over 130,000 atoms, a record for 
the time.  The PDB declared that this was one of the worst models they 
had ever seen because it had hundreds of geometry restraints violated by 
greater than 3 sigma.  The list in their header went on and on.


   Our response, of course, was that this model had over 130,000 bonds 
and 180,000 angles, and if you assume a Normal distribution the number of 
3-sigma deviants was exactly the number expected, which is what the 
geometry rmsds were saying.
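
As a rough check on that argument, the sketch below computes how many restraints one would expect beyond 3 sigma for the counts quoted above. It assumes independent, Normally distributed deviations, which is an idealization (real restraints are correlated), and the function names are only illustrative.

```python
import math

def two_sided_tail(k_sigma):
    """P(|z| > k_sigma) for a standard Normal deviate."""
    return math.erfc(k_sigma / math.sqrt(2.0))

def expected_outliers(n_restraints, k_sigma=3.0):
    """Expected number of restraints deviating by more than k_sigma,
    assuming independent Normal deviations (an idealization)."""
    return n_restraints * two_sided_tail(k_sigma)

# Restraint counts quoted in the message above.
n_bonds, n_angles = 130_000, 180_000

expected_bonds = expected_outliers(n_bonds)
expected_angles = expected_outliers(n_angles)

print(f"expected bonds  beyond 3 sigma: {expected_bonds:.0f}")
print(f"expected angles beyond 3 sigma: {expected_angles:.0f}")
```

Under these assumptions the expectation is roughly 350 bonds and 490 angles, so "hundreds" of 3-sigma violations is what a well-behaved model of this size should show.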


Dale E. Tronrud

On 11/8/2022 3:25 PM, James Holton wrote:

Thank you Ian for your quick response!

I suppose what I'm really trying to do is put a p-value on the 
"geometry" of a given PDB file.  As in: what are the odds the deviations 
from ideality of this model are due to chance?


I am leaning toward the need to take all the deviations in the structure 
together as a set, but, as Joao just noted, it just "feels wrong" 
to tolerate a 3-sigma deviate.  Even more wrong to tolerate 4 sigma, 5 
sigma. And 6-sigma deviates are really difficult to swallow unless you 
have trillions of data points.


To put it down in equations, is the p-value of a structure with 1000 
bonds in it with one 3-sigma deviate given by:


a)  p = 1-erf(3/sqrt(2))
or
b)  p = 1-erf(3/sqrt(2))**1000
or
c) something else?
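
For reference, here is a small sketch that simply evaluates expressions (a) and (b) as written, reading (b) with the usual operator precedence (the power binds tighter than the subtraction). It does not settle which, if either, is the right p-value; it only shows what the numbers are. The variable names are illustrative, and independence of the 1000 bonds is assumed.

```python
import math

n_bonds = 1000

# P(|z| < 3) for a single Gaussian deviate.
inside = math.erf(3 / math.sqrt(2))

# (a): probability that one particular bond lies beyond 3 sigma.
p_a = 1 - inside

# (b): probability that at least one of n_bonds independent bonds
#      lies beyond 3 sigma.
p_b = 1 - inside ** n_bonds

print(f"(a) p = {p_a:.4f}")
print(f"(b) p = {p_b:.3f}")
```

Option (a) comes to about 0.0027 and option (b) to about 0.93, so the two readings differ by more than two orders of magnitude.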



On 11/8/2022 2:56 PM, Ian Tickle wrote:

Hi James

I don't think it's meaningful to ask whether the deviation of a single 
bond length (or anything else that's single) from its expected value 
is significant, since as you say there's always some finite 
probability that it occurred purely by chance.  Statistics can only 
meaningfully be applied to samples of a 'reasonable' size.  I know 
there are statistics designed for small samples but not for samples of 
size 1!  It's more meaningful to talk about distributions.  For 
example, if 1% of the sample contained deviations > 3 sigma when you 
expected there to be only 0.3%, that is probably significant (but it 
still has a finite probability of occurring by chance), as would be 
finding no deviations > 3 sigma (for a reasonably large sample to 
avoid sampling errors).
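
One way to make that comparison concrete is an exact binomial tail, sketched below. The sample size of 1000 restraints and the helper names are assumptions made for illustration, and the restraints are treated as independent.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return 1.0 - sum(binom_pmf(i, n, p) for i in range(k))

# Chance that one Gaussian deviate lies beyond 3 sigma (about 0.27%).
p3 = math.erfc(3 / math.sqrt(2))

n = 1000  # hypothetical sample size

# Observing 1% of the sample (10 of 1000) beyond 3 sigma.
p_excess = prob_at_least(10, n, p3)

# Observing no deviations beyond 3 sigma at all.
p_none = binom_pmf(0, n, p3)

print(f"P(10 or more beyond 3 sigma) = {p_excess:.2e}")
print(f"P(none beyond 3 sigma)       = {p_none:.3f}")
```

With 1000 restraints, finding none beyond 3 sigma has a probability of about 7%, so the sample has to be larger than that before the absence of outliers looks suspicious, in line with the caveat about sample size.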


Cheers

-- Ian


On Tue, Nov 8, 2022, 22:22 James Holton  wrote:

OK, so let's suppose there is this bond in your structure that is
stretched a bit.  Is that for real? Or just a random fluke?  Let's say,
for example, it's a CA-CB bond that is supposed to be 1.529 A long, but
in your model it's 1.579 A.  This is 0.05 A too long. Doesn't seem like
much, right? But the "sigma" given to such a bond in our geometry
libraries is 0.016 A.  These sigmas are typically derived from a
database of observed bonds of similar type found in highly accurate
structures, like small molecules. So, that makes this a 3-sigma
outlier.  Assuming the distribution of deviations is Gaussian, that's a
pretty unlikely thing to happen. You expect 3-sigma deviates to appear
less than 0.3% of the time.  So, is that significant?
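
The arithmetic in this paragraph can be sketched as follows, using only the numbers quoted above. The variable names are illustrative, and the tail probability assumes a Gaussian distribution of deviations.

```python
import math

# Values quoted in the message, in Angstrom.
ideal, observed, sigma = 1.529, 1.579, 0.016

# Deviation from ideality in units of sigma.
z = (observed - ideal) / sigma

# Two-sided probability of a Gaussian deviate at least this large.
p_single = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.3f} sigma")
print(f"P(|deviate| >= z) = {p_single:.4f}")
```

The quoted deviation works out to a little over 3 sigma, so its single-bond tail probability is somewhat below the 0.3% figure for exactly 3 sigma.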

But, then again, there are lots of other bonds in the structure. Let's
say there are 1000. With that many samplings from a Gaussian
distribution you generally expect to see a 3-sigma deviate at least
once.  That is, do an "experiment" where you pick 1000 Gaussian-random
numbers from a distribution with a standard deviation of 1.0.  Then,
look for the maximum over all 1000 trials. Is that one > 3 sigma? It
probably is. If you do this "experiment" millions of times it turns out
seeing at least one 3-sigma deviate in 1000 tries is very common.
Specifically, about 93% of the time. It is rare indeed to have every
member of a 1000-deviate set all lie within 3 sigmas.  So, we have gone
from one 3-sigma deviate being highly unlikely to being a virtual
certainty if you look at enough samples.
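
The "experiment" described above can be sketched in a few lines. This version counts deviations of either sign, which is what reproduces the quoted figure of about 93%. It runs only 2000 trials rather than millions to keep it quick, and the function name, trial count and seed are arbitrary choices for illustration.

```python
import math
import random

def fraction_with_outlier(n_deviates=1000, n_trials=2000, k_sigma=3.0, seed=0):
    """Fraction of trials in which at least one of n_deviates
    standard-Normal draws lies beyond k_sigma in absolute value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # any() stops drawing as soon as the first outlier turns up.
        if any(abs(rng.gauss(0.0, 1.0)) > k_sigma for _ in range(n_deviates)):
            hits += 1
    return hits / n_trials

simulated = fraction_with_outlier()

# Closed form for independent deviates: 1 - P(all 1000 within 3 sigma).
analytic = 1.0 - math.erf(3.0 / math.sqrt(2.0)) ** 1000

print(f"simulated fraction: {simulated:.3f}")
print(f"analytic value:     {analytic:.3f}")
```

The simulated fraction should agree with the analytic value to within the sampling noise of 2000 trials; raising n_trials tightens the agreement at the cost of run time.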

So, my question is: is a 3-sigma deviate significant?  Is it
significant only if you have one bond in the structure?  What about
angles?  What if you have 500 bonds and 500 angles?  Do they count as
1000 deviates together? Or separately?

I'm sure the more mathematically inclined out there will have some
intelligent answers for the rest of us.  However, if you are not a
mathematician, how about a vote?  Is a 3-sigma bond length deviation
significant? Or not?

Looking forward to both kinds of responses,

-James Holton
MAD Scientist