Dear Hongbin,

Kekulizing the SMILES does indeed work in this case; that's a good tip.

Thanks a lot.

KR Axel

On 03.09.19 11:26, Hongbin Yang wrote:

Hi Axel,

A format like "c1cccc1" defines the bonds between atoms only
implicitly, and the output of the "canonical" SMILES has redundant ring
labels (1 and 2), which I think confused the parser and caused the
problem.

I don't know whether this is a bug, or whether that is the true reason.

But try this to solve your problem.

smi = Chem.MolToSmiles(mol, kekuleSmiles=True)

You will get explicit bonds in the SMILES, which can then be read by
`MolFromSmiles`:

COC1:C:C:C2:C:C:1OC1:C:C:C(:C:C:1)/C=C\\C1:C:C(:C(OC):C:C:1CCN(C)C)OC1:C(:C(CCN(C)C):C(OC):C(OC):C:1OC)/C=C\\2
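The suggestion above can be sketched as a small fallback helper: try the default canonical SMILES first and, if it does not parse back, retry with `kekuleSmiles=True`. The function name `roundtrip` is hypothetical and not part of RDKit, and whether the Kekulé form rescues a given molecule may depend on the RDKit version, so treat this as a minimal sketch rather than a guaranteed fix.

```python
from rdkit import Chem


def roundtrip(mol):
    """Return (smiles, mol2) for the first SMILES variant that parses back.

    Hypothetical helper: tries the default (aromatic) SMILES first, then the
    kekuleSmiles=True variant. Returns (None, None) if neither parses.
    """
    for kekule in (False, True):
        smi = Chem.MolToSmiles(mol, kekuleSmiles=kekule)
        mol2 = Chem.MolFromSmiles(smi)  # MolFromSmiles returns None on failure
        if mol2 is not None:
            return smi, mol2
    return None, None


# Simple demonstration on benzene, which round-trips without the fallback.
benzene = Chem.MolFromSmiles("c1ccccc1")
smi, mol2 = roundtrip(benzene)
print(smi, mol2.GetNumAtoms())
```

For a molecule like the macrocycle in this thread, the first attempt would be expected to fail and the second to be used; checking the returned molecule against the original (for example by atom count or molecular weight) is a sensible extra guard.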

Best,

Hongbin

*From: *Axel Pahl <mailto:[email protected]>
*Sent: *3 September 2019, 16:45
*To: *RDKit Discuss <mailto:[email protected]>
*Subject: *[Rdkit-discuss] Non Round-trippable Molecule

Hi,

I know that the RDKit makes no guarantees about being able to
round-trip (Smiles -> Mol -> Smiles -> Mol) every molecule, but I
would like to know if there are any recommendations on how to handle
such cases.

In my current case the problem seems to lie in different aromatic
models for a large ring containing oxygen.

This is the code to reproduce the issue and there is also a more
illustrative Notebook
(https://gist.github.com/apahl/06e55f5965cb82bc43d2aafd8ee0d532):



from rdkit.Chem import AllChem as Chem
from rdkit.Chem import Draw
from rdkit.Chem import Descriptors as Desc
from rdkit.Chem.Draw import IPythonConsole

# RDKit can parse the original Smiles into a valid molecule.
# A raw string keeps the backslashes in the SMILES from being read as escapes.
mol = Chem.MolFromSmiles(
    r"c12c(\C=C/c(ccc3OC)cc3Oc4ccc(cc4)\C=C/c(cc5O1)c(CCN(C)C)cc5OC)c(CCN(C)C)c(OC)c(OC)c2OC"
)
print(Desc.MolWt(mol))

# And parse it back into a Smiles.
smi = Chem.MolToSmiles(mol)
print(smi)
# -> COc1ccc2cc1-o-c1ccc(cc1)/c=c\c1cc(c(OC)cc1CCN(C)C)-o-c1c(c(CCN(C)C)c(OC)c(OC)c1OC)/c=c\2

# But the Smiles generated by RDKit cannot be parsed back into a
# valid molecule.
tmp = Chem.MolFromSmiles(smi)
# -> RDKit ERROR: [10:39:41] Can't kekulize mol. Unkekulized atoms: 2
#    3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 23 24 31 32 33 39 42 45 48 49



BTW, the JS Molecule Editor by Peter is also not able to round-trip
this molecule.

Many thanks in advance,
Axel

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss