On Nov 6, 2019, at 16:32, Ivan Tubert-Brohman 
<ivan.tubert-broh...@schrodinger.com> wrote:
> For reasons too complicated to get into here, I ended up with a molecule 
> containing a =CH2 in which one of the hydrogens was explicit and had E/Z 
> stereo info. For example, consider [H]/C=C/F.

FWIW, I just ran into the same issue.

In my case, I'm using one of my favorite techniques - SMILES manipulation - to 
replace terminal atoms with a hydrogen. [1]

I thought I could replace F/C=C/Cl with [H]/C=C/Cl to delete the F, and was 
surprised to see the '[H]' in place after re-canonicalization.

I would like some way to get rid of it. I ended up using a manual version of 
the transform that Ivan recommended.

                                Andrew
                                da...@dalkescientific.com

[1] Why? I find it hard to remove an atom in the RDKit while preserving 
stereochemistry. Here I'll take a SMILES string and re-create it so the 6th 
atom of the input SMILES (index 5, counting from 0) is the first atom of the 
output SMILES (the input SMILES is canonical):

>>> from rdkit import Chem
>>> mol = Chem.MolFromSmiles(r"CC(=O)/C=C(\O)c1cccnc1")
>>> Chem.MolToSmiles(mol, rootedAtAtom=5)
'O/C(=C\\C(C)=O)c1cccnc1'
>>> Chem.CanonSmiles('O/C(=C\\C(C)=O)c1cccnc1')
'CC(=O)/C=C(\\O)c1cccnc1'

If I replace the first atom term, "O", with an "[H]", I have double bond 
stereochemistry in the re-canonicalized output:

>>> Chem.CanonSmiles('[H]/C(=C\\C(C)=O)c1cccnc1')
'CC(=O)/C=C/c1cccnc1'
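
The token replacement itself is plain string handling. Here is a minimal 
sketch of it, assuming the rooted SMILES starts with the atom to be replaced 
and that this atom is terminal. The helper name replace_first_atom_with_h is 
hypothetical (it is not part of the RDKit), and the pattern only covers 
bracket atoms and the organic subset:

```python
import re

# Matches the first atom token of a SMILES string: a bracket atom such as
# "[NH3+]", a two-letter organic-subset atom ("Cl", "Br"), or a one-letter
# aliphatic or aromatic organic-subset atom.
_FIRST_ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def replace_first_atom_with_h(rooted_smiles):
    """Replace the leading atom token with an explicit "[H]".

    Hypothetical helper. Assumes the SMILES was written with the atom to
    delete as its first (root) atom, and that this atom is terminal
    (no ring closures or branches hang off it).
    """
    m = _FIRST_ATOM.match(rooted_smiles)
    if m is None:
        raise ValueError("SMILES does not start with an atom: %r" % (rooted_smiles,))
    # Keep everything after the first atom token, including any "/" or "\".
    return "[H]" + rooted_smiles[m.end():]
```

Applied to 'O/C(=C\\C(C)=O)c1cccnc1' this returns the 
'[H]/C(=C\\C(C)=O)c1cccnc1' string that I passed to Chem.CanonSmiles above.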

However, if I use graph edit methods to remove that same atom, I no longer have 
double bond stereochemistry:

>>> rwmol = Chem.RWMol(mol)
>>> rwmol.RemoveAtom(5)
>>> Chem.MolToSmiles(rwmol)
'CC(=O)C=Cc1cccnc1'

This is likely because the stereochemistry information isn't specified on 
all of the internal bonds. Consider that I can add a "/" to get the same SMILES:

>>> Chem.CanonSmiles(r"CC(=O)/C=C(\O)c1cccnc1")  # start
'CC(=O)/C=C(\\O)c1cccnc1'
>>> Chem.CanonSmiles(r"CC(=O)/C=C(\O)/c1cccnc1")  # add an extra "/" to "c1cccnc1"
'CC(=O)/C=C(\\O)c1cccnc1'
>>> Chem.CanonSmiles(r"CC(=O)/C=C(O)/c1cccnc1")   # remove the "\" from "\O"
'CC(=O)/C=C(\\O)c1cccnc1'
>>> Chem.CanonSmiles(r"CC(=O)/C=C([H])/c1cccnc1") # change the O to [H] to get what I expected
'CC(=O)/C=C/c1cccnc1'

More specifically, it matches what I get from graph operations:

>>> mol2 = Chem.MolFromSmiles(r"CC(=O)/C=C(O)/c1cccnc1")
>>> rwmol2 = Chem.RWMol(mol2)
>>> rwmol2.RemoveAtom(5)
>>> Chem.MolToSmiles(rwmol2)
'CC(=O)/C=C/c1cccnc1'

I don't know how to do this programmatically with the RDKit API.
