Hi everyone,

This issue was solved with Greg off-list.

It turns out that the receptor contains 5 amino acids with alternate locations 
(AltLoc). 4 of these were cleaned up during preparation of the receptor for 
docking; the 5th was missed, and that turned out to be the culprit.
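
For anyone hitting the same problem, here is a minimal sketch of how leftover 
AltLoc records could be spotted and filtered with plain Python. It assumes 
standard fixed-column PDB formatting (AltLoc indicator in column 17). The 
function names find_altloc_residues and keep_altloc are made up for 
illustration, and this is not necessarily how the receptor was actually 
cleaned up.

```python
from collections import defaultdict


def find_altloc_residues(pdb_lines):
    """Map (chain, resname, resseq, icode) -> sorted AltLoc labels found.

    Hypothetical helper: only ATOM/HETATM records are inspected, and the
    fixed PDB columns are assumed (AltLoc is column 17, i.e. index 16).
    """
    found = defaultdict(set)
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        altloc = line[16:17]
        if altloc.strip():
            key = (
                line[21:22],          # chain identifier
                line[17:20].strip(),  # residue name
                line[22:26].strip(),  # residue sequence number
                line[26:27].strip(),  # insertion code
            )
            found[key].add(altloc)
    return {key: sorted(labels) for key, labels in found.items()}


def keep_altloc(pdb_lines, keep="A"):
    """Return the lines with only one AltLoc label kept.

    Hypothetical helper: atoms labelled with a different AltLoc are dropped,
    and the label of the kept atoms is blanked. Other record types (for
    example ANISOU) are passed through unchanged.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            altloc = line[16:17]
            if altloc.strip():
                if altloc != keep:
                    continue
                line = line[:16] + " " + line[17:]
        kept.append(line)
    return kept
```

Calling find_altloc_residues on the lines of the receptor file lists the 
residues that still carry AltLoc labels. Note that keep_altloc drops every 
atom labelled with a different AltLoc, so a residue that only has labels other 
than the one kept would lose those atoms; check the result before docking.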

Cheers
Markus

From: Greg Landrum <greg.land...@gmail.com>
Sent: Thursday, June 6, 2019 8:18 PM
To: Mateo Vacacela <mvacac...@cdrd.ca>
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Sanitization Error: Explicit valence greater than 
permitted for normal protein

Hi Mateo,

On Thu, Jun 6, 2019 at 6:29 PM Mateo Vacacela <mvacac...@cdrd.ca> wrote:

I’m getting the following error when trying to sanitize a protein from a 
published pdb file (1E66):

ValueError: Sanitization error: Explicit valence for atom # 1254 C, 5, is 
greater than permitted

The error message is telling you what the problem is: there's a carbon atom in 
the system that has a valence (=number of bonds - charges) of 5. That's illegal 
for carbon. This type of error typically indicates a problem with the input 
file.
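
To find out which residue the offending atom belongs to, something along these 
lines may help. It is only a sketch: parse_valence_error and 
describe_problem_atom are hypothetical helper names, the regular expression 
assumes the message wording shown above (which may differ between RDKit 
versions), and the second function needs RDKit installed.

```python
import re


def parse_valence_error(message):
    """Extract (atom_index, element, valence) from a valence error message.

    Hypothetical helper; returns None if the message has another wording.
    """
    match = re.search(
        r"Explicit valence for atom # (\d+) ([A-Za-z]+),\s*(\d+)", message
    )
    if match is None:
        return None
    return int(match.group(1)), match.group(2), int(match.group(3))


def describe_problem_atom(pdb_path):
    """Return PDB residue details for the atom that fails sanitization.

    Hypothetical helper; returns None if sanitization succeeds.
    """
    # Imported here so that parse_valence_error works without RDKit.
    from rdkit import Chem

    mol = Chem.MolFromPDBFile(pdb_path, sanitize=False, removeHs=False)
    if mol is None:
        raise ValueError("could not read " + pdb_path)
    try:
        Chem.SanitizeMol(mol)  # modifies mol in place
    except ValueError as err:
        parsed = parse_valence_error(str(err))
        if parsed is None:
            raise
        idx, element, valence = parsed
        info = mol.GetAtomWithIdx(idx).GetPDBResidueInfo()
        return {
            "atom_index": idx,
            "element": element,
            "valence": valence,
            "atom_name": info.GetName().strip(),
            "residue": info.GetResidueName(),
            "residue_number": info.GetResidueNumber(),
            "chain": info.GetChainId(),
            "altloc": info.GetAltLoc(),
        }
    return None
```

The atom index in the message is RDKit's zero-based index, so looking the atom 
up on the unsanitized molecule is more reliable than counting lines in the PDB 
file by hand.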

I will note that I downloaded the PDB file for 1E66 from the PDB website and it 
worked fine for me, so something may have happened to the file you are using?

In [1]: from rdkit import Chem

In [6]: import requests

In [9]: d = requests.get('https://files.rcsb.org/download/1E66.pdb')

In [10]: d.content[:5]
Out[10]: b'HEADE'

In [12]: with open('1e66.pdb','wb+') as outf:
    ...:     outf.write(d.content)
    ...:

In [13]: m = Chem.MolFromPDBFile('1e66.pdb',sanitize=False,removeHs=False)

In [14]: nm = Chem.SanitizeMol(m)

In [15]:



Here is the script I’m running to recreate the error. I’ve replicated it based 
off of a script from the deepchem library:

This script is very strange.

######## Script Starts ########
import tempfile
import os
from rdkit import Chem
from rdkit.Chem import rdmolops

protein_pdb = 'receptor.pdb'

with open(protein_pdb) as protein_file:
    protein_pdb_lines = protein_file.readlines()

tempdir = tempfile.mkdtemp()

protein_pdb_file = os.path.join(tempdir, "protein.pdb")
with open(protein_pdb_file, "w") as protein_f:
    protein_f.writelines(protein_pdb_lines)

This first bit seems to just be copying the file; I'm not sure why you would 
want to do that.

molecule_file = protein_pdb_file
my_mol = Chem.MolFromPDBFile(str(molecule_file), sanitize=False, removeHs=False)
mol = Chem.SanitizeMol(my_mol)  # Error occurs here

This doesn't make sense. It's shorter (and produces the same result) to just do:

mol = Chem.MolFromPDBFile(str(molecule_file), removeHs=False)

That will sanitize the structure but leave the Hs.

-greg


