On Nov 6, 2019, at 16:32, Ivan Tubert-Brohman <ivan.tubert-broh...@schrodinger.com> wrote:
> For reasons too complicated to get into here, I ended up with a molecule
> containing a =CH2 in which one of the hydrogens was explicit and had E/Z
> stereo info. For example, consider [H]/C=C/F.

FWIW, I just ran into the same issue. In my case, I'm using one of my favorite techniques - SMILES manipulation - to replace terminal atoms with a hydrogen. [1] I thought I could replace F/C=C/Cl with [H]/C=C/Cl to delete the F, and was surprised to see the '[H]' still in place after re-canonicalization. I would like some way to get rid of it.

I ended up using a manual version of the transform that Ivan recommended.

                                Andrew
                                da...@dalkescientific.com

[1] Why? I find it hard to remove an atom correctly in the RDKit and preserve stereochemistry. Here I'll take a SMILES string and re-create it so the 6th atom of the input SMILES is the first atom of the output SMILES (the input SMILES is canonical):

  >>> from rdkit import Chem
  >>> mol = Chem.MolFromSmiles(r"CC(=O)/C=C(\O)c1cccnc1")
  >>> Chem.MolToSmiles(mol, rootedAtAtom=5)
  'O/C(=C\\C(C)=O)c1cccnc1'
  >>> Chem.CanonSmiles('O/C(=C\\C(C)=O)c1cccnc1')
  'CC(=O)/C=C(\\O)c1cccnc1'

If I replace the first atom term, "O", with an "[H]", I keep the double bond stereochemistry in the re-canonicalized output:

  >>> Chem.CanonSmiles('[H]/C(=C\\C(C)=O)c1cccnc1')
  'CC(=O)/C=C/c1cccnc1'

However, if I use graph edit methods to remove that same atom, I no longer have double bond stereochemistry:

  >>> rwmol = Chem.RWMol(mol)
  >>> rwmol.RemoveAtom(5)
  >>> Chem.MolToSmiles(rwmol)
  'CC(=O)C=Cc1cccnc1'

The issue is likely that the stereochemistry information isn't specified on all of the internal bonds.

Consider that I can add a "/" to get the same SMILES:

  >>> Chem.CanonSmiles(r"CC(=O)/C=C(\O)c1cccnc1")  # start
  'CC(=O)/C=C(\\O)c1cccnc1'
  >>> Chem.CanonSmiles(r"CC(=O)/C=C(\O)/c1cccnc1")  # add an extra "/" to "c1cccnc1"
  'CC(=O)/C=C(\\O)c1cccnc1'
  >>> Chem.CanonSmiles(r"CC(=O)/C=C(O)/c1cccnc1")  # remove the "\" from "\O"
  'CC(=O)/C=C(\\O)c1cccnc1'
  >>> Chem.CanonSmiles(r"CC(=O)/C=C([H])/c1cccnc1")  # change the O to [H] to get what I expected
  'CC(=O)/C=C/c1cccnc1'

More specifically, it matches what I get from graph operations:

  >>> mol2 = Chem.MolFromSmiles(r"CC(=O)/C=C(O)/c1cccnc1")
  >>> rwmol2 = Chem.RWMol(mol2)
  >>> rwmol2.RemoveAtom(5)
  >>> Chem.MolToSmiles(rwmol2)
  'CC(=O)/C=C/c1cccnc1'

I don't know how to do this programmatically with the RDKit API.
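
For what it's worth, the SMILES-level replacement from footnote [1] can at least be scripted. The following is a minimal sketch, not an RDKit feature: the helper names replace_leading_atom_with_h and delete_terminal_atom are made up for illustration, and the regular expression assumes the rooted SMILES starts with a bracket atom or an organic-subset symbol of a single-fragment molecule. The only RDKit calls are the ones already used in the sessions above, plus Atom.GetDegree() to check that the atom is terminal.

```python
import re

# First atom term of a SMILES string: a bracket atom such as [O-],
# or an organic-subset symbol (two-letter symbols are tried first).
_FIRST_ATOM = re.compile(r"\[[^\]]*\]|Br|Cl|[BCNOPSFI]|[bcnops]")


def replace_leading_atom_with_h(rooted_smiles):
    """Swap the first atom term of a SMILES string for an explicit [H]."""
    match = _FIRST_ATOM.match(rooted_smiles)
    if match is None:
        raise ValueError(f"no leading atom term in {rooted_smiles!r}")
    return "[H]" + rooted_smiles[match.end():]


def delete_terminal_atom(smiles, atom_idx):
    """Hypothetical helper: delete a terminal atom via SMILES manipulation.

    Mirrors the manual steps in footnote [1]: root the SMILES at the atom,
    replace that atom term with [H], then re-canonicalize.
    """
    from rdkit import Chem  # imported here so the string helper works without RDKit

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse {smiles!r}")
    if mol.GetAtomWithIdx(atom_idx).GetDegree() != 1:
        raise ValueError("atom is not terminal")
    rooted = Chem.MolToSmiles(mol, rootedAtAtom=atom_idx)
    return Chem.CanonSmiles(replace_leading_atom_with_h(rooted))
```

Calling delete_terminal_atom(r"CC(=O)/C=C(\O)c1cccnc1", 5) walks through the same steps as the footnote session, which gave 'CC(=O)/C=C/c1cccnc1' there. It does not solve the original problem: for a =CH2 case such as [H]/C=C/Cl the explicit '[H]' is still left in place after re-canonicalization.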