Re: [ccp4bb] Correcting 3-letter codes based on protonation states in a PDB file

Jared Sampson Thu, 29 Jun 2017 14:36:26 -0700

Thanks to Dave, Tristan and David for the suggestions.  

The code Tristan posted is similar to what I had been thinking about doing, and 
probably will do in the future, although not being familiar with ChimeraX, I'll 
likely end up using Biopython.


For this particular round, for expediency (and since it's been a while since I 
really used Biopython), I just went through the file residue-by-residue, in 
brute-force fashion using grep and Vim.  It wasn't as bad as I initially 
thought.

First I got a list of the side chain hydrogens in question:

    grep -E "HD1 HIS|HE2 HIS" my_structure.pdb > his.txt

Then for each His residue, I checked the text file to see whether HD1 or HE2 or 
both were present, and modified the residue code with a substitution in Vim:

    :%s /HIS A 251/HIP A 251/
    etc.

It only took me about 15 minutes to do this for His/Asp/Glu for 2 PDBs, which 
is somewhat faster than I could have re-learned how to do it in one of the 
(clearly more appropriate) scripting languages.  Not exactly elegant, but it 
got the job done.  Of course, if it looks like I'll have to do this more 
frequently in the future, I'll likely go the scripting route next time.
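
For anyone who would rather script the grep/Vim steps above, here is a minimal sketch in plain Python (standard library only, no Biopython or ChimeraX). It assumes a standard fixed-column PDB file, standard atom names (HD1, HE2), and that a (chain ID, residue number + insertion code) pair identifies a residue uniquely. The function name `rename_histidines` is just an illustration. SEQRES records are left untouched, and a HIS with neither hydrogen is left as HIS.

```python
from collections import defaultdict


def rename_histidines(pdb_lines):
    """Return a copy of pdb_lines with HIS renamed to HID/HIE/HIP.

    The choice is based on which ring hydrogens (HD1, HE2) are present.
    Assumes fixed-column PDB records; SEQRES lines are not modified.
    """
    # First pass: collect atom names for every HIS residue,
    # keyed by (chain ID, residue number + insertion code).
    atoms = defaultdict(set)
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and line[17:20] == "HIS":
            atoms[(line[21], line[22:27])].add(line[12:16].strip())

    # Decide the new residue name from the hydrogens that were found.
    new_name = {}
    for key, names in atoms.items():
        hd1, he2 = "HD1" in names, "HE2" in names
        if hd1 and he2:
            new_name[key] = "HIP"   # both protons: positively charged
        elif hd1:
            new_name[key] = "HID"   # neutral, proton on ND1
        elif he2:
            new_name[key] = "HIE"   # neutral, proton on NE2
        # neither hydrogen present: leave the residue as HIS

    # Second pass: overwrite the residue name field (columns 18-20).
    out = []
    for line in pdb_lines:
        if (line.startswith(("ATOM", "HETATM", "ANISOU", "TER"))
                and line[17:20] == "HIS"):
            key = (line[21], line[22:27])
            if key in new_name:
                line = line[:17] + new_name[key] + line[20:]
        out.append(line)
    return out
```

To use it, read my_structure.pdb into a list with readlines(), pass the list to rename_histidines(), and write the returned lines to a new file with writelines(). The same pattern should extend to Asp/Glu by checking for the relevant side chain hydrogen, but the target residue names depend on the downstream program, so I have left that out.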

Thanks everyone!

Cheers,
Jared

 

> On Jun 29, 2017, at 4:46 AM, Tristan Croll <ti...@cam.ac.uk> wrote:
> 
> This can be done in a few lines of script in any structural biology package 
> that provides a Python (or other) shell. Here's how I'd do it in ChimeraX, 
> for example:
> 
> Assuming your model is the only one loaded and atom names are all standard:
> 
> m = session.models.list()[0]
> histidines = m.residues.filter(m.residues.names == 'HIS')
> for his in histidines:
>    names = his.atoms.names
>    he2 = 'HE2' in names
>    hd1 = 'HD1' in names
>    if hd1 and not he2:
>        his.name = 'HID'
>    elif he2 and not hd1:
>        his.name = 'HIE'
>    elif hd1 and he2:
>        his.name = 'HIP'
>    else:
>        raise RuntimeError('HIS {}:{} is missing both hydrogens!'.format(
>            his.chain_id, his.number))
> 
> Cheers,
> 
> Tristan
> 
> On 2017-06-29 07:09, Briggs, David C wrote:
>> I believe the ProPka or Pdb2pqr webservers can do this.
>> ProPka.org <http://propka.org/>
>> http://nbcr-222.ucsd.edu/pdb2pqr_2.0.0/ [1]
>> HTH,
>> Dave
>> --
>> Dr David C Briggs
>> Hohenester Lab
>> Department of Life Sciences
>> Imperial College London
>> UK
>> http://about.me/david_briggs <http://about.me/david_briggs> [2]
>> -------------------------
>> FROM: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of
>> Sampson, Jared <jms2...@cumc.columbia.edu>
>> SENT: Wednesday, June 28, 2017 11:34:05 PM
>> TO: CCP4BB@JISCMAIL.AC.UK
>> SUBJECT: [ccp4bb] Correcting 3-letter codes based on protonation
>> states in a PDB file
>> Dear all -
>> I'm working with a PDB file with explicit hydrogens where many of the
>> histidines are in protonated form due to crystallization at low pH.
>> Unfortunately, although the additional protons are present in the
>> model for the positively charged histidines, the residues in question
>> are indicated in both the SEQRES and the ATOM records as 3-letter code
>> `HIS` regardless of protonation state (i.e. instead of `HIP` for
>> positively charged, and `HID` or `HIE` for the neutral tautomers).
>> Are there existing tools available to determine the proper 3-letter
>> residue code for titratable amino acid residues based on which
>> hydrogens are present, and output a corrected PDB file?
>> Thank you in advance for your suggestions.
>> Cheers,
>> Jared Sampson
>> Ph.D. Candidate
>> Columbia University
>> Links:
>> ------
>> [1] http://nbcr-222.ucsd.edu/pdb2pqr_2.0.0/ 
>> <http://nbcr-222.ucsd.edu/pdb2pqr_2.0.0/>
>> [2] http://about.me/david_briggs <http://about.me/david_briggs>
