CHEMBL1185048 triggers an interesting RDKit behavior.

If I strip salts from it, generate a SMILES string, and try to re-parse the 
SMILES string, I get the error message "Can't kekulize mol".

That is, RDKit won't read its own output SMILES.

Here's a reproducible example. I'll start with the parts that work.


>>> from rdkit import Chem
>>> from rdkit.Chem import SaltRemover
>>> remover = SaltRemover.SaltRemover()
>>> 
>>> import urllib2
>>> chembl1185048 = urllib2.urlopen("https://www.ebi.ac.uk/chembldb/download_helper/getmol/658998")
>>> for mol in Chem.ForwardSDMolSupplier(chembl1185048):
...   break
... 
>>> Chem.MolToSmiles(mol)
'COCCOCCOCCOC(C1(C(OCCOCCOCCOC)=O)C23C11C4=c5c6c7c8c5c2c2c5c3c3c9c%10c%11c%12c%13c9c9c%14c%15c(c1c93)C4C1c3c4c9c%16c%17c(c7c7c%18c%19c%20c(c%21c%12C%12%22CN(CCOCCOCCN)CC%12(c9c(c3-%15)C%14C%13%22)C%16C%21c%20c7%17)c%11c(c%19c2c8%18)c%105)c4c61)=O'
>>> Chem.MolFromSmiles(Chem.MolToSmiles(mol))
<rdkit.Chem.rdchem.Mol object at 0x101da3d70>
>>> 

Now I'll strip salts and generate the SMILES. Again, this works as expected:

>>> mol2 = remover.StripMol(mol)
>>> Chem.MolToSmiles(mol2)
'COCCOCCOCCOC(C1(C(OCCOCCOCCOC)=O)C23C11C4=c5c6c7c8c5c2c2c5c3c3c9c%10c%11c%12c%13c%14c%15c%11c(c%11c%16c(c8c2%11)c2c7c7c8c2c(c%15%16)C%14C2c8c8c%11c%14c%15c%16c8C22CN(CCOCCOCCN)CC%132C2C%16c8c(c3c1c(c8=%15)C4C%14c6c7%11)c9c%122)c%105)=O'


However, this SMILES string isn't parseable, which I didn't expect.

>>> Chem.MolFromSmiles(Chem.MolToSmiles(mol2))
[16:42:13] Can't kekulize mol 
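
For anyone who wants to test other records for the same problem, here is a minimal sketch of a round-trip check. The helper names roundtrip_ok and parse_unsanitized are mine, not RDKit functions. The sketch assumes only that Chem.MolFromSmiles returns None when parsing or sanitization fails, and that Chem.SanitizeMol with catchErrors=True returns the failing operation instead of raising. The SMILES "c1cccc1" is a small stand-in that should fail kekulization; it is not the ChEMBL structure.

```python
from rdkit import Chem
from rdkit import RDLogger

# Silence RDKit's error log so failures show up only as return values.
RDLogger.DisableLog("rdApp.error")


def roundtrip_ok(mol):
    """Return True if the SMILES written for `mol` can be parsed back.

    Hypothetical helper name, not part of RDKit.
    """
    smiles = Chem.MolToSmiles(mol)
    return Chem.MolFromSmiles(smiles) is not None


def parse_unsanitized(smiles):
    """Parse without sanitization, then report which sanitization step fails.

    Returns (mol, failed_op). failed_op is SANITIZE_NONE when sanitization
    succeeds. Returns (None, None) if the SMILES syntax itself is invalid.
    Hypothetical helper name, not part of RDKit.
    """
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return None, None
    failed_op = Chem.SanitizeMol(mol, catchErrors=True)
    return mol, failed_op


if __name__ == "__main__":
    # Benzene round-trips; the five-membered all-carbon aromatic ring does not parse.
    print(roundtrip_ok(Chem.MolFromSmiles("c1ccccc1")))
    print(Chem.MolFromSmiles("c1cccc1") is None)
    print(parse_unsanitized("c1cccc1")[1])
```

Running parse_unsanitized on the SMILES generated from mol2 should show whether kekulization is the only step that fails, which might help narrow down where the writer and the parser disagree.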


Yet another thing to stack on Greg's desk while he's on holiday. :)


                                Andrew
                                [email protected]



_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss