Re: [ccp4bb] choosing an NMR structure from PDB

2023-05-03 Thread Harry Powell
Hi folks

many thanks for the replies - all very helpful. 

I can see that my example of P01132 might have been taken as a protein that I 
have a real current interest in - unfortunately, it isn’t (pace! to anyone who 
is working on malaria), it was just the first example I found that had no 
structures other than NMR (for any of the 60,000 odd other UniProts in the PDB 
which have both NMR and X-ray structures, I’ll leave it up to you to decide 
which I would choose!).

As David points out, PDBe-KB is incredibly useful for finding things like this 
(and is in fact what I tend to use as a first port of call), and the sorting is 
useful - though (moving away from my original question to the wonderful world 
of X-ray) I find that it’s still necessary to engage brain when looking at the 
results. For example, for P38398, 4y2g (2.5Å, Rwork 0.215, Rfree 0.252) is #1 
but 1t15 (1.85Å, Rwork 0.206, Rfree 0.222) is at #6 - why? Looking at the 
PDB-REDO entries is educational. So it’s challenging to write scripts that will 
scrape the whole DB and give the “best” model for each.
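
To illustrate why a simple script struggles here, the sketch below ranks the 
two P38398 entries quoted above by resolution and then Rfree. This is NOT the 
PDBe-KB ordering (which, per David's description, also weights UniProt 
coverage and validation); the class and function names are made up for the 
example. A naive sort like this puts 1t15 first, the opposite of the PDBe-KB 
list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XrayEntry:
    pdb_id: str
    resolution: float  # in Å
    r_work: float
    r_free: float


# The two P38398 entries and the statistics quoted in the message above.
ENTRIES = [
    XrayEntry("4y2g", 2.5, 0.215, 0.252),
    XrayEntry("1t15", 1.85, 0.206, 0.222),
]


def naive_rank(entries):
    """Order by resolution, then Rfree (lower is better for both).

    Assumption: this is a deliberately simplistic criterion, not the
    PDBe-KB weighting, which also considers coverage and validation.
    """
    return sorted(entries, key=lambda e: (e.resolution, e.r_free))


def r_gap(entry):
    """Rfree minus Rwork; a large gap can hint at over-fitting."""
    return entry.r_free - entry.r_work


for e in naive_rank(ENTRIES):
    print(f"{e.pdb_id}: {e.resolution:.2f} Å, "
          f"Rfree {e.r_free:.3f}, Rfree-Rwork {r_gap(e):.3f}")
```

The disagreement between this ordering and the curated one is the point: any 
single-number criterion will miss things like sequence coverage.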

Harry


> Hi, here is my small contribution.
> 
> The answer to your question depends quite dramatically on the intended use. 
> If you want the "best" structure you might want to see how many restraints 
> per residue were used and whether high-resolution restraints such as RDCs 
> (residual dipolar couplings) were used.
> Recall that NMR structures are built using monomer libraries that are 
> seriously different from the X-ray ones. 
> 
> Best,
> 
> 
> E.
> 


Re: [ccp4bb] choosing an NMR structure from PDB

2023-05-03 Thread David Armstrong

Hi Harry,

A useful starting point when looking for the 'best' [insert criteria 
here] structure in the PDB for a specific protein is to visit the 
PDBe-KB aggregated views pages (https://pdbekb.org). These pages group 
PDB data based upon UniProt accession and the 'structures' tab on these 
pages shows all the available PDB entries containing this protein, as 
well as other resources that provide structural data for this protein 
(including some 'new-fangled predicted models'). For your UniProt ID, 
the relevant page is 
https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/P01132/structures.


The list of PDB entries is sorted to have the 'best' structure at the 
top - in this case, weighted by a combination of UniProt coverage, 
resolution (for X-ray/EM) and validation. It also displays information 
on resolution (if applicable), any bound ligands etc. to give this 
context to help in choosing a suitable structure.


As you mention, for your example these are all NMR entries containing 
similar sized fragments of the full length UniProt sequence, so the 
ordering is predominantly using validation data to sort these. 
Unfortunately, in your case the most recent entry was before mandatory 
deposition of chemical shifts, so you do not have the option of 
experimental validation which is now available for recently deposited 
NMR entries. Therefore these are all ordered based on geometric validation.


So, although there is no concrete answer to your question, the above 
process should help in filtering the options.


Kind Regards,
David

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


--
David Armstrong
Outreach and Training Lead
PDBe
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD UK




Re: [ccp4bb] choosing an NMR structure from PDB

2023-05-03 Thread Randy John Read
Hi Harry,

My advice would be to use one of those new-fangled predicted models. You can 
find a model in the AlphaFold database at the EBI 
(https://alphafold.ebi.ac.uk/entry/P01132). If you look at it, there are parts 
(likely corresponding to the constructs that were crystallised) that look 
confidently predicted, connected by poorly-predicted loops. If you take the PDB 
file and the PAE matrix, you can run process_predicted_model either from Phenix 
or CCP4, which will give you individual files for the confident parts of the 
full prediction that are likely to have the correct relative orientations. (If 
you want to use the models for molecular replacement, you’ll find that the 
least-confident parts are downweighted by being assigned high B-factors, which 
is much better than having the best parts of the models, with pLDDT near 100, 
downweighted the most by interpreting pLDDT as a B-factor.) I would bet that 
these models will be more accurate than typical NMR models.
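
The point about pLDDT and B-factors can be sketched numerically. The function 
below uses the empirical relation that, to my understanding, underlies the 
conversion in process_predicted_model: an estimated coordinate error 
rmsd = 1.5 * exp(4 * (0.7 - pLDDT/100)), turned into B = (8*pi^2/3) * rmsd^2. 
Treat the formula as an assumption to be checked against the Phenix/CCP4 
documentation; the function name and the max_b cap are my own, not part of 
either package.

```python
import math


def plddt_to_bfactor(plddt, max_b=999.0):
    """Convert a pLDDT score (0-100 scale) to an approximate B-factor.

    Assumptions (not taken from the thread): the empirical error estimate
    rmsd = 1.5 * exp(4 * (0.7 - pLDDT/100)) and B = 8*pi^2/3 * rmsd^2.
    max_b is a hypothetical cap so values fit a PDB-format B-factor column.
    """
    frac = min(max(plddt, 0.0), 100.0) / 100.0
    rmsd = 1.5 * math.exp(4.0 * (0.7 - frac))
    b_factor = (8.0 * math.pi ** 2 / 3.0) * rmsd ** 2
    return min(b_factor, max_b)


# High confidence maps to a low B-factor, low confidence to a high one.
for score in (95.0, 70.0, 50.0):
    print(f"pLDDT {score:5.1f} -> B = {plddt_to_bfactor(score):7.1f}")
```

Leaving the raw pLDDT in the B-factor column would do the reverse and give the 
best-predicted residues the highest "B-factors", which is the problem 
described above.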

Best wishes,

Randy

-
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research
The Keith Peters Building
Hills Road
Cambridge CB2 0XY, U.K.
Tel: +44 1223 336500
E-mail: rj...@cam.ac.uk
www-structmed.cimr.cam.ac.uk





Re: [ccp4bb] choosing an NMR structure from PDB

2023-05-03 Thread Gary Thompson
Hi Harry

First off I would look at quality metrics. For the structures from the same 
paper you will need to check in the paper what the differences are and choose 
what’s closest to what you need (experimental conditions etc.), assuming 
they all have similar quality.

As a secondary priority you will want to look at the number of restraints 
(especially in the region of interest).

Note that I would expect the more modern structures from CYANA and ARIA to be 
better due to improved methodologies.
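
As a rough illustration of where some of this lives in an mmCIF file, the 
sketch below pulls a few items from the pdbx_nmr_* categories of the PDBx/mmCIF 
dictionary (ensemble size, selection criteria, sample pH and temperature, 
refinement method, NOE restraint count). It is a minimal sketch, assuming the 
items are written as simple "tag value" pairs; it ignores loop_ blocks and 
multi-line values, for which a proper CIF parser (e.g. gemmi) is the better 
tool. Older entries often leave these items empty, in which case the paper is 
the only source. The function name is made up for the example.

```python
import shlex

# Items from the PDBx/mmCIF dictionary that relate to the advice above.
NMR_TAGS = (
    "_pdbx_nmr_ensemble.conformers_calculated_total_number",
    "_pdbx_nmr_ensemble.conformers_submitted_total_number",
    "_pdbx_nmr_ensemble.conformer_selection_criteria",
    "_pdbx_nmr_exptl_sample_conditions.pH",
    "_pdbx_nmr_exptl_sample_conditions.temperature",
    "_pdbx_nmr_refine.method",
    "_pdbx_nmr_constraints.NOE_constraints_total",
)


def nmr_summary(mmcif_text):
    """Return {tag: value} for the NMR items found as single-line pairs.

    Simplification: loop_ blocks and multi-line (semicolon) values are
    skipped; '?' and '.' (the mmCIF markers for missing data) are dropped.
    """
    wanted = set(NMR_TAGS)
    found = {}
    for line in mmcif_text.splitlines():
        if not line.startswith("_pdbx_nmr_"):
            continue
        try:
            parts = shlex.split(line)  # handles 'quoted values'
        except ValueError:
            continue  # unbalanced quote: leave it to a real parser
        if len(parts) == 2 and parts[0] in wanted and parts[1] not in ("?", "."):
            found[parts[0]] = parts[1]
    return found
```

Calling nmr_summary(open("1egf.cif").read()) on a downloaded file (the file 
name is just an example) gives a quick side-by-side view when comparing 
entries from the same paper.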

Regards
Gary


Dr Gary S Thompson
NMR Facility Manager
CCPN CoI & Working Group Member

Wellcome Trust Biomolecular NMR Facility
School of Biosciences, Faculty of Natural Sciences
University of Kent, Canterbury,  Kent,
England,  CT2 7NZ


tel: 01227 82 7117
e-mail: g.s.thomp...@kent.ac.uk
ORCID: orcid.org/-0001-9399-7636




[ccp4bb] choosing an NMR structure from PDB

2023-05-03 Thread Harry Powell
Hi folks

I was wondering.

If there is a UniProt entry (for example, P01132, but there are plenty of 
others) for which I want the “best” (whatever that might mean) representative 
_experimental_ structure (i.e. not one of these new-fangled predicted models 
that some folk say have removed the need for actually doing experiments), but 
there are only NMR models - how do I choose?

I don’t mean “which model from the ensemble do I choose” - that’s a different 
question.

For P01132, for example, I could choose (from the PDB) 1A3P, 1EGF, 1EPG, 1EPH, 
1EPI, 1EPJ, 1GK5 or 3EGF. Note that some of these are from the same paper, so 
may be in different conditions (e.g. pH). All except the first (1A3P) cover the 
same bit of sequence.

Specifically, what should I look for in the downloadable files (mmCIF, for 
example) from the PDB?

Thoughts?

Harry

