Hi,

There appears to be an issue with the DeleteSubstructs method when deleting groups from an aromatic N. The H-count on the nitrogen is not reset properly, which leads to a kekulise error on sanitization. The workaround is to kekulise the molecule first. Of course, this would then require more extensive SMARTS-based substructure queries written against the Kekule form.
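A possible alternative to kekulising first is to fix the H-count by hand. The sketch below is only an illustration, not part of RDKit: the helper name `delete_with_h_fix` is hypothetical, and it assumes every deleted group was attached by a single bond, so each aromatic heteroatom that loses a neighbour needs exactly one extra hydrogen. It uses an `RWMol` to remove the matched atoms itself instead of calling DeleteSubstructs.

```python
from rdkit import Chem


def delete_with_h_fix(mol, pattern):
    """Hypothetical helper: delete atoms matching `pattern` and give each
    aromatic heteroatom that loses a neighbour one explicit H.

    Assumes the deleted groups were attached through single bonds.
    """
    rw = Chem.RWMol(mol)
    # Collect every atom index that is part of any match.
    matched = {idx for match in mol.GetSubstructMatches(pattern) for idx in match}

    for idx in matched:
        for nbr in rw.GetAtomWithIdx(idx).GetNeighbors():
            if nbr.GetIdx() in matched:
                continue
            # Aromatic carbons get their implicit Hs recomputed on
            # sanitization; aromatic heteroatoms such as N do not.
            if nbr.GetIsAromatic() and nbr.GetAtomicNum() != 6:
                nbr.SetNumExplicitHs(nbr.GetNumExplicitHs() + 1)
                nbr.SetNoImplicit(True)  # same state as a parsed [nH]

    # Remove from the highest index down so lower indices stay valid.
    for idx in sorted(matched, reverse=True):
        rw.RemoveAtom(idx)

    out = rw.GetMol()
    Chem.SanitizeMol(out)
    return out


mol = Chem.MolFromSmiles('c1cccn1C')
sub = Chem.MolFromSmarts('[CH3]')
fixed = delete_with_h_fix(mol, sub)
print(Chem.MolToSmiles(fixed))
```

For the N-methylpyrrole case this should sanitize cleanly and keep the aromatic SMARTS usable, at the cost of hard-coding the one-H-per-deleted-bond assumption.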
Here is an example that reproduces the problem:

    import rdkit
    from rdkit import Chem

    print(rdkit.__version__)

    smiles = 'c1cccn1C'

    # Case 1 (workaround): kekulise first, then delete the methyl group
    mol = Chem.MolFromSmiles(smiles)
    Chem.Kekulize(mol, clearAromaticFlags=True)
    sub = Chem.MolFromSmarts('[CH3]')
    newmol = Chem.rdmolops.DeleteSubstructs(mol, sub)
    Chem.SanitizeMol(newmol)
    print("1: {}".format(Chem.MolToSmiles(newmol)))

    # Cases 2 and 3: delete from the aromatic form, then sanitize
    mol = Chem.MolFromSmiles(smiles)
    sub = Chem.MolFromSmarts('[CH3]')
    newmol = Chem.rdmolops.DeleteSubstructs(mol, sub)
    print("2: {}".format(Chem.MolToSmiles(newmol)))
    Chem.SanitizeMol(newmol)
    print("3: {}".format(Chem.MolToSmiles(newmol)))

With output:

    2021.03.2
    1: c1cc[nH]c1
    2: c1ccnc1
    [09:50:41] Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4
    Traceback (most recent call last):
      File "test.py", line 21, in <module>
        Chem.SanitizeMol(newmol)
    rdkit.Chem.rdchem.KekulizeException: Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4

Thanks,
Stephen