Hi,

I know that the RDKit makes no guarantees about being able to round-trip (Smiles -> Mol -> Smiles -> Mol) every molecule, but I would like to know if there are any recommendations on how to handle such cases.
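
Such cases can at least be detected automatically, so that the affected molecules can be set aside and handled separately. The following is only a minimal sketch, not an official RDKit recipe: check_round_trip and rdkit_round_trip are hypothetical helper names, and the sketch relies only on Chem.MolFromSmiles returning None for a Smiles it cannot parse.

```python
from typing import Callable, Optional, Tuple


def check_round_trip(
    smiles: str,
    parse: Callable[[str], Optional[object]],
    write: Callable[[object], str],
) -> Tuple[bool, Optional[str]]:
    """Return (ok, regenerated_smiles) for Smiles -> Mol -> Smiles -> Mol.

    `parse` must return None when it cannot build a molecule
    (as Chem.MolFromSmiles does); `write` turns a molecule into a Smiles.
    """
    mol = parse(smiles)
    if mol is None:
        # The input itself is not parseable.
        return False, None
    try:
        regenerated = write(mol)
    except Exception:
        # The writer itself raised on this molecule.
        return False, None
    # The round trip succeeds only if the regenerated Smiles parses again.
    return parse(regenerated) is not None, regenerated


def rdkit_round_trip(smiles: str) -> Tuple[bool, Optional[str]]:
    """Run the check with RDKit's default Smiles parser and writer."""
    # Imported lazily so the generic helper above stays usable without RDKit.
    from rdkit import Chem

    return check_round_trip(smiles, Chem.MolFromSmiles, Chem.MolToSmiles)
```

Calling rdkit_round_trip(smi) for every input and collecting the entries where the first element is False gives a list of problem molecules. Passing a different writer to check_round_trip (for example one that calls Chem.MolToSmiles with kekuleSmiles=True) is one way to experiment with alternative output forms; whether that helps for a given molecule is not something this sketch can promise.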
In my current case the problem seems to lie in different aromaticity models for a large ring containing oxygen. This is the code to reproduce the issue, and there is also a more illustrative Notebook (https://gist.github.com/apahl/06e55f5965cb82bc43d2aafd8ee0d532):

    from rdkit.Chem import AllChem as Chem
    from rdkit.Chem import Draw
    from rdkit.Chem import Descriptors as Desc
    from rdkit.Chem.Draw import IPythonConsole

    # RDKit can parse the original Smiles into a valid molecule.
    # (Raw string, so the backslashes in the Smiles are kept literally.)
    mol = Chem.MolFromSmiles(r"c12c(\C=C/c(ccc3OC)cc3Oc4ccc(cc4)\C=C/c(cc5O1)c(CCN(C)C)cc5OC)c(CCN(C)C)c(OC)c(OC)c2OC")
    print(Desc.MolWt(mol))

    # And write it back out as a Smiles.
    smi = Chem.MolToSmiles(mol)
    print(smi)
    # -> COc1ccc2cc1-o-c1ccc(cc1)/c=c\c1cc(c(OC)cc1CCN(C)C)-o-c1c(c(CCN(C)C)c(OC)c(OC)c1OC)/c=c\2

    # But the Smiles generated by RDKit cannot be parsed back into a valid molecule.
    tmp = Chem.MolFromSmiles(smi)
    # -> RDKit ERROR: [10:39:41] Can't kekulize mol. Unkekulized atoms: 2 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 23 24 31 32 33 39 42 45 48 49

BTW, the JS Molecule Editor by Peter is also not able to round-trip this molecule.

Many thanks in advance,
Axel
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss