Hi
There appears to be an issue with the DeleteSubstructs method when deleting
groups from an aromatic N. The H-count on the nitrogen is not updated after the
deletion, which leads to a kekulisation error when the result is sanitised.
The workaround is to kekulise the molecule first. Of course, any SMARTS-based
substructure queries would then have to be written against the Kekule form,
which makes them more extensive.
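To illustrate the cost of that workaround, here is a minimal sketch showing that an aromatic SMARTS atom no longer matches once the aromatic flags have been cleared. The variable names are only for illustration.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('c1cccn1C')
# Convert to the Kekule form and drop the aromatic flags on atoms and bonds.
Chem.Kekulize(mol, clearAromaticFlags=True)

aromatic_n = Chem.MolFromSmarts('n')     # aromatic nitrogen only
generic_n = Chem.MolFromSmarts('[#7]')   # any nitrogen, aromatic or not

# The aromatic query finds nothing on the kekulised molecule;
# the element-based query still matches.
print(mol.HasSubstructMatch(aromatic_n))
print(mol.HasSubstructMatch(generic_n))
```

So any pattern written with lowercase aromatic atoms would need rewriting, for example in terms of atomic numbers, before it can be used on the kekulised molecule.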
Here is an example that reproduces the problem:
import rdkit
from rdkit import Chem

print(rdkit.__version__)

smiles = 'c1cccn1C'

# Case 1 (workaround): kekulise first, then delete the methyl and sanitise.
mol = Chem.MolFromSmiles(smiles)
Chem.Kekulize(mol, clearAromaticFlags=True)
sub = Chem.MolFromSmarts('[CH3]')
newmol = Chem.rdmolops.DeleteSubstructs(mol, sub)
Chem.SanitizeMol(newmol)
print("1: {}".format(Chem.MolToSmiles(newmol)))

# Case 2 (problem): delete from the aromatic form; sanitisation then fails.
mol = Chem.MolFromSmiles(smiles)
sub = Chem.MolFromSmarts('[CH3]')
newmol = Chem.rdmolops.DeleteSubstructs(mol, sub)
print("2: {}".format(Chem.MolToSmiles(newmol)))
Chem.SanitizeMol(newmol)
print("3: {}".format(Chem.MolToSmiles(newmol)))
With output:
2021.03.2
1: c1cc[nH]c1
2: c1ccnc1
[09:50:41] Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4
Traceback (most recent call last):
  File "test.py", line 21, in <module>
    Chem.SanitizeMol(newmol)
rdkit.Chem.rdchem.KekulizeException: Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4
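A possible alternative to kekulising first is to add the missing H by hand after the deletion. The following is only a sketch under stated assumptions, not a fix in RDKit itself. The helper name delete_substructs_fix_aromatic_n and the property name lost_substituent are hypothetical. It assumes that every aromatic N that loses a neighbour should end up with exactly one H, and it removes the atoms through RWMol rather than calling DeleteSubstructs.

```python
from rdkit import Chem

FLAG = "lost_substituent"  # hypothetical property name used only by this sketch


def delete_substructs_fix_aromatic_n(mol, patt):
    """Delete all atoms matched by patt, then give each aromatic N that
    lost a neighbour one explicit H so that the result can be kekulised."""
    matched = {idx for match in mol.GetSubstructMatches(patt) for idx in match}
    rwmol = Chem.RWMol(mol)  # editable copy; atom indices match those of mol

    # Flag aromatic nitrogens bonded to an atom that is about to be deleted.
    for idx in matched:
        for nbr in rwmol.GetAtomWithIdx(idx).GetNeighbors():
            if (nbr.GetIdx() not in matched and nbr.GetIsAromatic()
                    and nbr.GetAtomicNum() == 7):
                nbr.SetBoolProp(FLAG, True)

    # Remove from the highest index down so the pending indices stay valid.
    for idx in sorted(matched, reverse=True):
        rwmol.RemoveAtom(idx)

    # Assumption: a flagged N without an explicit H needs exactly one.
    for atom in rwmol.GetAtoms():
        if atom.HasProp(FLAG):
            atom.ClearProp(FLAG)
            if atom.GetNumExplicitHs() == 0:
                atom.SetNumExplicitHs(1)
                atom.SetNoImplicit(True)

    result = rwmol.GetMol()
    Chem.SanitizeMol(result)
    return result


mol = Chem.MolFromSmiles('c1cccn1C')
sub = Chem.MolFromSmarts('[CH3]')
fixed = delete_substructs_fix_aromatic_n(mol, sub)
print(Chem.MolToSmiles(fixed))
```

On the example above this is intended to give the same pyrrole as case 1 while leaving the aromatic flags in place, so aromatic SMARTS patterns can still be used. Atoms are flagged with a property rather than tracked by index because indices change as atoms are removed.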
Thanks
Stephen
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss