Re: [ccp4bb] choosing an NMR structure from PDB
Hi folks

Many thanks for the replies - all very helpful.

I can see that my example of P01132 might have been taken as a protein that I have a real current interest in - unfortunately, it isn’t (pace! to anyone who is working on malaria); it was just the first example I found that had no structures other than NMR (for any of the 60,000-odd other UniProts in the PDB which have both NMR and X-ray structures, I’ll leave it up to you to decide which I would choose!).

As David points out, PDBe-KB is incredibly useful for finding things like this (and is in fact what I tend to use as a first port of call), and the sorting is useful - though (moving away from my original question to the wonderful world of X-ray) I find that it’s still necessary to engage brain when looking at the results (e.g. for P38398, 4y2g (2.5Å, Rwork 0.215, Rfree 0.252) is #1 but 1t15 (1.85Å, Rwork 0.206, Rfree 0.222) is at #6 - why? Looking at the PDB-REDO entries is educational), so it’s challenging to write scripts that will scrape the whole DB and give the “best” model for each.

Harry

> Hi Harry,
>
> A useful starting point when looking for the 'best' [insert criteria here] structure in the PDB for a specific protein is to visit the PDBe-KB aggregated views pages (https://pdbekb.org). These pages group PDB data based upon UniProt accession and the 'structures' tab on these pages shows all the available PDB entries containing this protein, as well as other resources that provide structural data for this protein (including some 'new-fangled predicted models'). For your UniProt ID, the relevant page is https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/P01132/structures.
>
> The list of PDB entries is sorted to have the 'best' structure at the top - in this case, weighted by a combination of UniProt coverage, resolution (for X-ray/EM) and validation. It also displays information on resolution (if applicable), any bound ligands etc., to give context to help in choosing a suitable structure.
>
> As you mention, for your example these are all NMR entries containing similarly sized fragments of the full-length UniProt sequence, so the ordering is predominantly using validation data to sort these. Unfortunately, in your case the most recent entry was before mandatory deposition of chemical shifts, so you do not have the option of experimental validation which is now available for recently deposited NMR entries. Therefore these are all ordered based on geometric validation.
>
> So, although there is no concrete answer to your question, the above process should help in filtering the options.
>
> Kind Regards,
> David

> On 3 May 2023, at 13:51, Randy John Read wrote:
>
> Hi Harry,
>
> My advice would be to use one of those new-fangled predicted models. You can find a model in the AlphaFold database at the EBI (https://alphafold.ebi.ac.uk/entry/P01132). If you look at it, there are parts (likely corresponding to the constructs that were crystallised) that look confidently predicted, connected by poorly-predicted loops. If you take the PDB file and the PAE matrix, you can run process_predicted_model either from Phenix or CCP4, which will give you individual files for the confident parts of the full prediction that are likely to have the correct relative orientations. (If you want to use the models for molecular replacement, you’ll find that the least-confident parts are downweighted by being assigned high B-factors, which is much better than having the best parts of the models, with pLDDT near 100, downweighted the most by interpreting pLDDT as a B-factor.) I would bet that these models will be more accurate than typical NMR models.
>
> Best wishes,
>
> Randy

> Hi Harry
>
> First off I would look at quality metrics. For the structures from the same paper you will need to check what the differences are in the paper and choose what’s closest to what you need (experimental conditions etc.), assuming they all have similar quality.
>
> As a secondary priority you will want to look at the number of restraints (especially in the region of interest).
>
> Note I would expect the more modern structures from CYANA and ARIA to be better due to improved methodologies.
>
> Regards
> Gary

> Hi, here is my small contribution.
>
> The answer to your question depends quite dramatically on the intended use. If you want the "best" structure you might want to see how many restraints per residue were used and whether high-resolution restraints such as RDCs were used. Recall that NMR structures are built using monomer libraries that are seriously different from the X-ray ones.
>
> Best,
>
> E.

>> On 3 May 2023, at 11:45, Harry Powell <193323b1e616-dmarc-requ...@jiscmail.ac.uk> wrote:
>>
>> Hi folks
>>
>> I was wondering.
>>
>> If there is a UniProt entry (for example, P01132, but there are plenty of
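To make the P38398 puzzle above concrete, here is a minimal sketch of the kind of naive ranking a scraping script might apply: sort by resolution, then by Rfree. The only data are the two entries and values quoted in Harry's message; the `Entry` type and `naive_rank` function are illustrative names, not part of any PDBe tool.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    pdb_id: str
    resolution: float  # in Angstrom, lower is better
    r_work: float
    r_free: float


# The two P38398 entries and values quoted in the message above.
entries = [
    Entry("4y2g", 2.5, 0.215, 0.252),
    Entry("1t15", 1.85, 0.206, 0.222),
]


def naive_rank(items):
    """Rank by resolution first, then Rfree (lower is better for both)."""
    return sorted(items, key=lambda e: (e.resolution, e.r_free))


ranked = naive_rank(entries)
for e in ranked:
    # The Rfree - Rwork gap is a common quick sanity check for overfitting.
    print(e.pdb_id, e.resolution, round(e.r_free - e.r_work, 3))
```

On these two numbers alone, 1t15 sorts ahead of 4y2g, so the PDBe-KB order must be driven by something this key ignores (David's reply mentions UniProt coverage and validation). That is the "engage brain" step a whole-database script cannot easily replace.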
Re: [ccp4bb] choosing an NMR structure from PDB
Hi Harry,

A useful starting point when looking for the 'best' [insert criteria here] structure in the PDB for a specific protein is to visit the PDBe-KB aggregated views pages (https://pdbekb.org). These pages group PDB data based upon UniProt accession and the 'structures' tab on these pages shows all the available PDB entries containing this protein, as well as other resources that provide structural data for this protein (including some 'new-fangled predicted models'). For your UniProt ID, the relevant page is https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/P01132/structures.

The list of PDB entries is sorted to have the 'best' structure at the top - in this case, weighted by a combination of UniProt coverage, resolution (for X-ray/EM) and validation. It also displays information on resolution (if applicable), any bound ligands etc., to give context to help in choosing a suitable structure.

As you mention, for your example these are all NMR entries containing similarly sized fragments of the full-length UniProt sequence, so the ordering is predominantly using validation data to sort these. Unfortunately, in your case the most recent entry was before mandatory deposition of chemical shifts, so you do not have the option of experimental validation which is now available for recently deposited NMR entries. Therefore these are all ordered based on geometric validation.

So, although there is no concrete answer to your question, the above process should help in filtering the options.

Kind Regards,
David

On 03/05/2023 11:45, Harry Powell wrote:
> Hi folks
>
> I was wondering.
>
> If there is a UniProt entry (for example, P01132, but there are plenty of others) for which I want the “best” (whatever that might mean) representative _experimental_ structure (i.e. not one of these new-fangled predicted models that some folk say have removed the need for actually doing experiments), but there are only NMR models - how do I choose?
>
> I don’t mean “which model from the ensemble do I choose” - that’s a different question.
>
> For P01132, for example, I could choose (from the PDB) 1A3P, 1EGF, 1EPG, 1EPH, 1EPI, 1EPJ, 1GK5 or 3EGF. Note that some of these are from the same paper, so may be in different conditions (e.g. pH). All except the first (1A3P) cover the same bit of sequence.
>
> Specifically, what should I look for in the downloadable files (mmCIF, for example) from the PDB?
>
> Thoughts?
>
> Harry

--
David Armstrong
Outreach and Training Lead
PDBe
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD UK

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/
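The ordering David describes (coverage, then resolution where it applies, then validation) can be sketched as a sort key. This is only an illustration under stated assumptions: PDBe-KB's real weighting is not given in the thread, so a simple lexicographic key stands in for it, the `validation` score is a hypothetical 0-1 number, and the entry IDs and values are placeholders rather than real data. Only the URL pattern comes from the message.

```python
from typing import NamedTuple, Optional


def pdbe_kb_structures_url(accession: str) -> str:
    """Build the PDBe-KB 'structures' page URL quoted in the message above."""
    return f"https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/{accession}/structures"


class Candidate(NamedTuple):
    pdb_id: str
    coverage: float              # fraction of the UniProt sequence covered, 0..1
    resolution: Optional[float]  # Angstrom; None for NMR entries
    validation: float            # hypothetical score, 0..1, higher is better


def sort_key(c: Candidate):
    # Assumption: a lexicographic stand-in for the real (unpublished here) weighting.
    # NMR entries have no resolution, so they tie on that term and fall through
    # to the validation score.
    res = c.resolution if c.resolution is not None else float("inf")
    return (-c.coverage, res, -c.validation)


# Placeholder NMR-only candidates (not real PDB entries or real scores).
candidates = [
    Candidate("AAAA", 0.05, None, 0.6),
    Candidate("BBBB", 0.05, None, 0.8),
    Candidate("CCCC", 0.03, None, 0.9),
]

order = [c.pdb_id for c in sorted(candidates, key=sort_key)]
print(pdbe_kb_structures_url("P01132"))
print(order)
```

With equal coverage and no resolution, the order is decided by validation alone, which matches David's point that an NMR-only list ends up sorted predominantly on validation data.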
Re: [ccp4bb] choosing an NMR structure from PDB
Hi Harry,

My advice would be to use one of those new-fangled predicted models. You can find a model in the AlphaFold database at the EBI (https://alphafold.ebi.ac.uk/entry/P01132). If you look at it, there are parts (likely corresponding to the constructs that were crystallised) that look confidently predicted, connected by poorly-predicted loops. If you take the PDB file and the PAE matrix, you can run process_predicted_model either from Phenix or CCP4, which will give you individual files for the confident parts of the full prediction that are likely to have the correct relative orientations. (If you want to use the models for molecular replacement, you’ll find that the least-confident parts are downweighted by being assigned high B-factors, which is much better than having the best parts of the models, with pLDDT near 100, downweighted the most by interpreting pLDDT as a B-factor.) I would bet that these models will be more accurate than typical NMR models.

Best wishes,

Randy

-----
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research
The Keith Peters Building
Hills Road
Cambridge CB2 0XY, U.K.
Tel: +44 1223 336500
E-mail: rj...@cam.ac.uk
www-structmed.cimr.cam.ac.uk
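Randy's parenthetical about B-factors can be illustrated with a short sketch of a pLDDT-to-B-factor conversion. The constants below follow the empirical relation I understand the Phenix documentation for process_predicted_model to describe (an estimated RMSD of 1.5·exp(4·(0.7 − pLDDT/100)) Å, then B = 8π²/3 · RMSD²); treat them as an assumption and check the tool's own documentation before relying on them. The function name is illustrative only.

```python
import math


def plddt_to_bfactor(plddt: float) -> float:
    """Convert a pLDDT (0-100) into a pseudo B-factor.

    Assumed relation (see lead-in): rmsd = 1.5 * exp(4 * (0.7 - pLDDT/100)),
    then B = (8 * pi^2 / 3) * rmsd^2.
    """
    frac = plddt / 100.0
    rmsd = 1.5 * math.exp(4.0 * (0.7 - frac))
    return (8.0 * math.pi ** 2 / 3.0) * rmsd ** 2


for p in (95, 70, 40):
    print(p, round(plddt_to_bfactor(p), 1))
```

The point of the conversion is the direction: confident residues (high pLDDT) get low B-factors and so carry the most weight in molecular replacement, whereas reading the pLDDT column directly as a B-factor would do the opposite.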
Re: [ccp4bb] choosing an NMR structure from PDB
Hi Harry

First off I would look at quality metrics. For the structures from the same paper you will need to check what the differences are in the paper and choose what’s closest to what you need (experimental conditions etc.), assuming they all have similar quality.

As a secondary priority you will want to look at the number of restraints (especially in the region of interest).

Note I would expect the more modern structures from CYANA and ARIA to be better due to improved methodologies.

Regards
Gary

Dr Gary S Thompson
NMR Facility Manager
CCPN CoI & Working Group Member
Wellcome Trust Biomolecular NMR Facility
School of Biosciences, Faculty of Natural Sciences
University of Kent, Canterbury, Kent, England, CT2 7NZ
tel: 01227 82 7117
e-mail: g.s.thomp...@kent.ac.uk
orcid: orcid.org/-0001-9399-7636
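Gary's suggestion to compare the number of restraints, especially in the region of interest, can be sketched as a small count. This assumes you have already extracted the restraints from a deposited restraint file as pairs of residue numbers; the pairs below are made up purely for illustration, and both function names are hypothetical.

```python
from collections import Counter


def restraints_per_residue(restraints, n_residues):
    """Count how many restraints involve each residue (numbered from 1)."""
    counts = Counter()
    for i, j in restraints:
        counts[i] += 1
        if j != i:  # count an intra-residue restraint only once
            counts[j] += 1
    return [counts.get(r, 0) for r in range(1, n_residues + 1)]


def mean_in_region(per_residue, start, end):
    """Mean restraints per residue over residues start..end inclusive."""
    region = per_residue[start - 1:end]
    return sum(region) / len(region)


# Made-up restraint list for a 5-residue toy peptide (not real data).
toy_restraints = [(1, 2), (2, 3), (2, 5), (4, 4)]
per_res = restraints_per_residue(toy_restraints, 5)
print(per_res)
print(mean_in_region(per_res, 2, 3))
```

Comparing the regional mean between candidate entries gives a rough idea of which one is better determined where it matters, provided the entries have similar overall quality as Gary notes.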
[ccp4bb] choosing an NMR structure from PDB
Hi folks

I was wondering.

If there is a UniProt entry (for example, P01132, but there are plenty of others) for which I want the “best” (whatever that might mean) representative _experimental_ structure (i.e. not one of these new-fangled predicted models that some folk say have removed the need for actually doing experiments), but there are only NMR models - how do I choose?

I don’t mean “which model from the ensemble do I choose” - that’s a different question.

For P01132, for example, I could choose (from the PDB) 1A3P, 1EGF, 1EPG, 1EPH, 1EPI, 1EPJ, 1GK5 or 3EGF. Note that some of these are from the same paper, so may be in different conditions (e.g. pH). All except the first (1A3P) cover the same bit of sequence.

Specifically, what should I look for in the downloadable files (mmCIF, for example) from the PDB?

Thoughts?

Harry
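As one possible starting point for the mmCIF question, the sketch below pulls a few items that bear on the points raised in this thread (experimental method, ensemble size, sample conditions such as pH). The item names are from the PDBx/mmCIF dictionary as far as I know, but the snippet and its values are invented for illustration and do not come from any of the entries listed. The parser only handles simple one-line `_name value` items; in real files these categories are often written as `loop_` tables, so a proper CIF parser would be needed there.

```python
import shlex

# Invented mmCIF fragment for illustration only (not taken from a real entry).
SAMPLE = """\
data_XXXX
_exptl.method 'SOLUTION NMR'
_pdbx_nmr_ensemble.conformers_submitted_total_number 20
_pdbx_nmr_exptl_sample_conditions.pH 3.0
_pdbx_nmr_exptl_sample_conditions.temperature 298
"""


def read_simple_items(text):
    """Collect one-line '_category.item value' pairs; ignores loop_ tables."""
    items = {}
    for line in text.splitlines():
        if not line.startswith("_"):
            continue
        parts = shlex.split(line)  # handles quoted values such as 'SOLUTION NMR'
        if len(parts) == 2:
            items[parts[0]] = parts[1]
    return items


items = read_simple_items(SAMPLE)
for name in ("_exptl.method", "_pdbx_nmr_exptl_sample_conditions.pH"):
    print(name, "=", items.get(name))
```

Collecting these items side by side for entries from the same paper is a quick way to see whether they differ in conditions (e.g. pH), as suggested in the question.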