Re: [ccp4bb] Fwd: [ccp4bb] Validation of structure prediction

2021-12-21 Thread Vollmar, Melanie (DLSLtd,RAL,LSCI)
I also should have added that if you use a predicted structure, run MR with 
it and then modify it to fit the data of your novel structure, then MolProbity 
certainly applies, as do the other tools you will find in the various 
packages and on the PDB site.

The most current predicted structures usually do have an energy minimisation 
step at the end but nevertheless you can always add it yourself as Xavier 
pointed out.

In general, your prediction is a very good, educated guess at what your protein 
might look like. However, the algorithm has no clue about your crystallisation 
condition or even the true biological environment in the cell and hence cannot 
take this chemical information into account when arranging the atoms. The 
algorithm also doesn't know that you have a membrane protein or different 
domains that need to be arranged relative to each other. The artefacts 
mentioned by Xavier are most likely a result of this lack of knowledge on the 
part of the algorithm. Or they are just poor performance; after all, even the 
best predictor can't do magic...

Look at the pLDDT score for your prediction, a local measure for the confidence 
with which each residue was placed into 3D space. A low score (<50) means high 
uncertainty and these residues should be removed anyway.
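
A minimal sketch of that trimming step, in case it helps. It assumes an 
AlphaFold-style PDB file in which the pLDDT is written into the B-factor field 
of every atom record (the same value for all atoms of a residue). The function 
name drop_low_confidence is made up for illustration, the cutoff of 50 is the 
one quoted above, and the sketch does not tidy up TER records or renumber 
anything.

```python
def drop_low_confidence(pdb_lines, cutoff=50.0):
    """Return the PDB lines with atoms below the pLDDT cutoff removed."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Columns 61-66 of a PDB atom record hold the B-factor field,
            # which AlphaFold-style files are assumed to reuse for pLDDT.
            plddt = float(line[60:66])
            if plddt < cutoff:
                continue  # skip atoms of low-confidence residues
        # Non-atom records (headers, TER, END, ...) pass through unchanged.
        kept.append(line)
    return kept
```

Reading the model with open(path).read().splitlines(), passing the lines 
through this function and writing the result back out gives a trimmed file 
that can then be inspected in Coot as suggested below.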

So, open your model in Coot, look at it and remove the rubbish...

M

From: CCP4 bulletin board  on behalf of F.Xavier 
Gomis-Rüth 
Sent: 21 December 2021 10:04
To: CCP4BB@JISCMAIL.AC.UK 
Subject: [ccp4bb] Fwd: [ccp4bb] Validation of structure prediction

Dear all,
this is by far not the general case in our hands. Depending on which AlphaFold 
protocol is used, the resulting models have locally unfavourable geometries 
(including clashes), impossible chain crossovers, etc. I would definitely 
recommend that everybody go through the model in detail and perform a final 
geometry minimization with Coot and/or Phenix/Refmac. In these cases, general 
geometry validation as provided by MolProbity provides a final proof of the 
computational model.
Best,
Xavier


 Forwarded Message 
Subject:Re: [ccp4bb] Validation of structure prediction
Date:   Tue, 21 Dec 2021 09:43:37 +
From:   Vollmar, Melanie (DLSLtd,RAL,LSCI) 
<64fe7ccc6b4d-dmarc-requ...@jiscmail.ac.uk><mailto:64fe7ccc6b4d-dmarc-requ...@jiscmail.ac.uk>
Reply-To:   Vollmar, Melanie (DLSLtd,RAL,LSCI) 
<mailto:melanie.voll...@diamond.ac.uk>
To: CCP4BB@JISCMAIL.AC.UK<mailto:CCP4BB@JISCMAIL.AC.UK>


Tristan is spot on. All the predicted structures have near perfect geometry, so 
commonly used validation tools like MolProbity can no longer be applied.

What you need to consider is biological relevance of the predicted model. Does 
the model correctly reflect residue arrangement in the active site? Are domains 
in correct relative orientation to allow for interactions and movements, 
perhaps found by some other assay? Is there appropriate room to fit a 
ligand/cofactor? Are transmembrane helices, if there are any, correctly found?

You need to map the knowledge you have of your protein to the structure and see 
if the atom positions and what you know support each other.

Cheers

M


Re: [ccp4bb] Validation of structure prediction

2021-12-21 Thread Vollmar, Melanie (DLSLtd,RAL,LSCI)
Tristan is spot on. All the predicted structures have near perfect geometry, so 
commonly used validation tools like MolProbity can no longer be applied.

What you need to consider is biological relevance of the predicted model. Does 
the model correctly reflect residue arrangement in the active site? Are domains 
in correct relative orientation to allow for interactions and movements, 
perhaps found by some other assay? Is there appropriate room to fit a 
ligand/cofactor? Are transmembrane helices, if there are any, correctly found?

You need to map the knowledge you have of your protein to the structure and see 
if the atom positions and what you know support each other.

Cheers

M

From: CCP4 bulletin board on behalf of Tristan Croll
Sent: 21 December 2021 08:28
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] Validation of structure prediction

I agree with Dale. Tools like MolProbity are not the right approach to 
validating a structure prediction. To understand why, just consider that all 
you need to do to get a perfect MolProbity score is predict every structure as 
a single long alpha helix with ideal rotamers, with a kink at each proline.

To validate a predicted structure will require a completely different toolset - 
one that I’m not sure fully exists yet.

— Tristan

> On 20 Dec 2021, at 18:47, Dale Tronrud wrote:
>
>   I don't see any reason to believe that software designed to validate 
> crystallographic or NMR models would have any utility validating AlphaFold 
> predicted models.  Doesn't the prediction software already ensure that all 
> the indicators used by Molprobity are obeyed?  I'm afraid that the tools to 
> validate any new technique must be designed specifically for that technique. 
> (And when they become available they will be useless for validating 
> crystallographic models!)
>
> Dale E. Tronrud
>
>> On 12/20/2021 10:28 AM, Nicholas Clark wrote:
>> The Molprobity server can be run online and only requires the coordinates in 
>> PDB format: http://molprobity.biochem.duke.edu/.
>> Best,
>> Nick Clark
>> On Mon, Dec 20, 2021 at 11:10 AM Reza Khayat wrote:
>>Hi,
>>Can anyone suggest how to validate a predicted structure? Something
>>similar to wwPDB validation without the need for refinement
>>statistics. I realize this is a strange question given that the
>>geometry of the model is anticipated to be fine if the structure was
>>predicted by a server that minimizes the geometry to improve its
>>statistics. Nonetheless, the journal has asked me for such a report.
>>Thanks.
>>Best wishes,
>>Reza
>>Reza Khayat, PhD
>>Associate Professor
>>City College of New York
>>Department of Chemistry and Biochemistry
>>New York, NY 10031
>>
>>To unsubscribe from the CCP4BB list, click the following link:
>>https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1
>>
>> --
>> Nicholas D. Clark
>> PhD Candidate
>> Malkowski Lab
>> University at Buffalo
>> Department of Structural Biology
>> Jacob's School of Medicine & Biomedical Sciences
>> 955 Main Street, RM 5130
>> Buffalo, NY 14203
>> Cell: 716-830-1908
> This message was issued to members of 
> www.jiscmail.ac.uk/CCP4BB, a mailing list 
> hosted by www.jiscmail.ac.uk, terms & conditions 
> are available at https://www.jiscmail.ac.uk/policyandsecurity/




Re: [ccp4bb] AI papers in experimental macromolecular structure determination

2021-08-04 Thread Vollmar, Melanie (DLSLtd,RAL,LSCI)
I don't have a list to add here, as my review on the topic awaits feedback on 
the corrections (self-advertisement), but perhaps we should consider that 
machine learning and AI are two different beasts. Admittedly, I don't always 
make a proper distinction either.

Surely, many of us can get our heads around machine learning, which usually 
covers so-called shallow learners that sit firmly in well-known concepts of 
statistics. This type of machine learning doesn't require many resources and is 
accessible to almost anyone with an average laptop. Plenty of software in MX 
and EM uses these tools and no one ever thinks about them.

I think Andrea was perhaps looking more in the direction of AI (judging by the 
many cryo-EM references listed, where this is a standard tool), which 
requires a lot more understanding and thought as well as resources, and would 
appear to many as a magic black box. This type of machine learning has only 
recently taken off due to huge leaps in hardware development, which many of us 
can't afford to buy unless it is provided through some shared resource. Having 
said that, an average graphics card (GPU) is often a good start. And if one 
isn't the book-reading kind (usually due to lack of time), there are lots of 
good blogs, videos and other online resources to get one into the basics.

The papers that should clearly be added are those for protein structure 
prediction as, in a way, they determine a structure, albeit with a different 
kind of experiment:

https://science.sciencemag.org/content/early/2021/07/19/science.abj8754
https://www.nature.com/articles/s41586-021-03819-2

Cheers

M

From: CCP4 bulletin board  on behalf of Nave, Colin 
(DLSLtd,RAL,LSCI) <64fdcfc6624b-dmarc-requ...@jiscmail.ac.uk>
Sent: 04 August 2021 09:34
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] AI papers in experimental macromolecular structure 
determination

Bernhard
What qualifies? Good question.
There are plenty of books on AI/machine learning but, as always, it is more 
efficient/lazier to read reviews than the books themselves. I think the London 
Review of Books allows limited access to its articles so most should be able to 
read this
https://www.lrb.co.uk/the-paper/v43/n02/paul-taylor/insanely-complicated-hopelessly-inadequate?referrer=https%3A%2F%2Fwww.google.com%2F
It might be interesting (though perhaps not useful) to classify the examples 
for macromolecular structure determination into categories such as GOFAI etc. 
However, this particular term is rather pejorative as it would mean describing 
the developers as old fashioned!

Colin




-Original Message-
From: CCP4 bulletin board  On Behalf Of Bernhard Rupp
Sent: 03 August 2021 21:00
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] AI papers in experimental macromolecular structure 
determination

Maybe we should get to the root of this - what qualifies as machine learning 
and what not?

Do nonparametric predictors such as KDE qualify?

https://www.ruppweb.org/mattprob/default.html

Happy to add to the confusion.
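
For readers who have not met the term: a kernel density estimate (KDE) builds 
a smooth probability density from observed samples by placing one kernel 
"bump" on each sample and averaging them, with no parametric model fitted. 
The following is only a minimal sketch with a Gaussian kernel; the function 
name, the sample values and the bandwidth are made up for illustration and are 
not taken from any of the programs mentioned in this thread.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the probability density at x."""
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    # Normalisation so that the estimated density integrates to one.
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observed sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up one-dimensional observations, for illustration only.
observations = [1.8, 2.1, 2.2, 2.4, 2.9, 3.5]
estimate = gaussian_kde(observations, bandwidth=0.3)
print(round(estimate(2.2), 3))
```

The only free parameter is the bandwidth, which is chosen rather than learned 
in this sketch. That is part of why it is debatable whether such a predictor 
counts as machine learning.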

-Original Message-
From: CCP4 bulletin board  On Behalf Of Tim Gruene
Sent: Tuesday, August 3, 2021 11:59
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] AI papers in experimental macromolecular structure 
determination

Hello Andrea,

profile fitting, as it is done in mosflm 
(https://doi.org/10.1107/S090744499900846X) or evalccd, or ..., probably also 
qualifies as AI/machine learning.

Best wishes,
Tim

On Tue, 3 Aug 2021 11:43:06 +
"Thorn, Dr. Andrea"  wrote:

> Dear colleagues,
> I have compiled a list of papers that cover the application of
> AI/machine learning methods in single-crystal structure determination
> (mostly macromolecular crystallography) and single-particle Cryo-EM.
> The draft list is attached below.
>
> If I missed any papers, please let me know. I will send the final list
> back here, for the benefit of all who are interested in the topic.
>
> Best wishes,
>
>
> Andrea.
>
>
> __
> General:
> - Gopalakrishnan, V., Livingston, G., Hennessy, D., Buchanan, B. &
> Rosenberg, J. M. (2004). Acta Cryst D. 60, 1705–1716.
> - Morris, R. J. (2004). Acta Cryst D. 60, 2133–2143.
>
> Micrograph preparation:
> - (2020). Journal of Structural Biology. 210, 107498.
>
> Particle Picking:
> - Sanchez-Garcia, R., Segura, J., Maluenda, D., Carazo, J. M. &
> Sorzano, C. O. S. (2018). IUCrJ. 5, 854–865.
> - Al-Azzawi, A., Ouadou, A., Tanner, J. J. & Cheng, J. (2019). BMC
> Bioinformatics. 20, 1–26.
> - George, B., Assaiya, A., Roy, R. J., Kembhavi, A., Chauhan, R.,
> Paul, G., Kumar, J. & Philip, N. S. (2021). Commun Biol. 4, 1–12.
> - Lata, K. R., Penczek, P. & Frank, J. (1995). Ultramicroscopy. 58,
> 381–391.
> - Nguyen, N. P., Ersoy, I., Gotberg, J., Bunyak, F. & White, T. A.
> (2021). BMC Bioinformatics. 22, 1–28.
> - Wang, F., Gong, H., Liu, G., Li, M., Yan, C., Xia, T., Li, X. &
> Zeng, J. (2016). Journal of Structural Biology. 195, 325–336.
> - Wong, H.