Hi,
I am trying to use RDKit to replace matched SMARTS patterns in a molecule with
a wildcard (*) and return a SMARTS string of which the original molecule is
still an instance.
I tried the following:
########
from rdkit import Chem

def generate_modified_smarts(smiles, smarts_patterns, num_patterns_to_replace):
    molecule = Chem.MolFromSmiles(smiles)
    patterns_replaced = 0

    for smarts in smarts_patterns:
        if patterns_replaced >= num_patterns_to_replace:
            break

        pattern = Chem.MolFromSmarts(smarts)
        while (molecule.HasSubstructMatch(pattern)
               and patterns_replaced < num_patterns_to_replace):
            match_indices = molecule.GetSubstructMatch(pattern)

            # Extract segments before and after the match
            before_match, after_match = "", ""
            if match_indices[0] > 0:
                before_match = Chem.MolFragmentToSmarts(
                    molecule, atomsToUse=list(range(match_indices[0])))
            if match_indices[-1] < molecule.GetNumAtoms() - 1:
                after_match = Chem.MolFragmentToSmarts(
                    molecule,
                    atomsToUse=list(range(match_indices[-1] + 1,
                                          molecule.GetNumAtoms())))

            # Combine parts with a wildcard
            modified_smarts = before_match + '*' + after_match
            molecule = Chem.MolFromSmarts(modified_smarts)
            patterns_replaced += 1

    return Chem.MolToSmarts(molecule)

example_smiles = "CCOC1=C(C=C2C(=C1)N=CC(=C2NC3=CC(=C(C=C3)OCC4=CC=CC=N4)Cl)C#N)NC(=O)C=CCN(C)C"
smarts_patterns = ["C=O", "C#N"]
num_patterns_to_replace = 2
modified_smarts = generate_modified_smarts(example_smiles, smarts_patterns,
                                           num_patterns_to_replace)
print(f"Modified molecule SMARTS pattern: {modified_smarts}")
#######
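One thing that may be worth checking in the function above: it treats atoms
with indices below match_indices[0] as "before" and atoms with indices above
match_indices[-1] as "after", but atom indices follow the order of the input
SMILES, so a branch or ring closure can bond a "before" atom directly to an
"after" atom. The sketch below counts such bonds for a few patterns;
bonds_bypassing_match is a hypothetical helper name used only for this
illustration, not an RDKit function. A nonzero count suggests that joining the
two fragment strings around '*' cannot reproduce the original connectivity.

```python
from rdkit import Chem


def bonds_bypassing_match(mol, match):
    """Count bonds joining an atom before the match to an atom after it.

    'Before' and 'after' use the same index ranges as the function above:
    indices below match[0] and indices above match[-1].
    """
    first, last = match[0], match[-1]
    count = 0
    for bond in mol.GetBonds():
        a, b = sorted((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))
        if a < first and b > last:
            count += 1
    return count


smiles = ("CCOC1=C(C=C2C(=C1)N=CC(=C2NC3=CC(=C(C=C3)OCC4=CC=CC=N4)Cl)C#N)"
          "NC(=O)C=CCN(C)C")
mol = Chem.MolFromSmiles(smiles)

bypass_counts = {}
for smarts in ["C=O", "C#N", "Cl", "CCO"]:
    match = mol.GetSubstructMatch(Chem.MolFromSmarts(smarts))
    bypass_counts[smarts] = bonds_bypassing_match(mol, match)
    print(smarts, match, bypass_counts[smarts])
```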
While it seems to work for C=O, it does not work for C#N: the connectivity is
messed up for C#N even if I use it alone, i.e. without the carbonyl. The
matched patterns could be anywhere in the molecule and could be more complex
than these, but I tried some simple cases first to see how robust this
approach is. It worked for "CCO", but did not work when I tried "Cl".
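One direction that might avoid the string splicing altogether is to edit the
molecular graph directly. The following is only a sketch under a strong
assumption: each matched group must be bonded to the rest of the molecule
through exactly one of its atoms (true for the C=O, C#N, Cl and CCO cases
here); matches that break this assumption are skipped. That atom is swapped
for a '*' query atom and the other matched atoms are deleted, so every
remaining atom and bond still corresponds to one in the original molecule.
generalize_to_smarts is a hypothetical name chosen for this illustration.

```python
from rdkit import Chem


def generalize_to_smarts(smiles, smarts_patterns, max_replacements):
    """Collapse up to max_replacements pattern matches to '*', return SMARTS.

    Assumption: a match is only collapsed when exactly one of its atoms is
    bonded to atoms outside the match; other matches are skipped.
    """
    mol = Chem.MolFromSmiles(smiles)
    wildcard = Chem.MolFromSmarts("*")  # source of a '*' query atom
    to_wildcard, to_delete = set(), set()
    replaced = 0

    for smarts in smarts_patterns:
        pattern = Chem.MolFromSmarts(smarts)
        # Matches are located on the unmodified molecule, so indices stay valid.
        for match in mol.GetSubstructMatches(pattern):
            if replaced >= max_replacements:
                break
            matched = set(match)
            if matched & (to_wildcard | to_delete):
                continue  # overlaps a group that is already being replaced
            attachment = [
                idx for idx in match
                if any(nbr.GetIdx() not in matched
                       for nbr in mol.GetAtomWithIdx(idx).GetNeighbors())
            ]
            if len(attachment) != 1:
                continue  # outside the single-attachment assumption
            to_wildcard.add(attachment[0])
            to_delete |= matched - {attachment[0]}
            replaced += 1

    rw = Chem.RWMol(mol)
    # Swap each attachment atom for the wildcard; its bonds stay in place.
    for idx in to_wildcard:
        rw.ReplaceAtom(idx, wildcard.GetAtomWithIdx(0))
    # Delete the remaining matched atoms, highest index first.
    for idx in sorted(to_delete, reverse=True):
        rw.RemoveAtom(idx)
    return Chem.MolToSmarts(rw.GetMol())


example_smiles = ("CCOC1=C(C=C2C(=C1)N=CC(=C2NC3=CC(=C(C=C3)OCC4=CC=CC=N4)Cl)"
                  "C#N)NC(=O)C=CCN(C)C")
original = Chem.MolFromSmiles(example_smiles)
for patterns in (["C=O", "C#N"], ["Cl"], ["CCO"]):
    smarts = generalize_to_smarts(example_smiles, patterns, 2)
    still_matches = original.HasSubstructMatch(Chem.MolFromSmarts(smarts))
    print(patterns, still_matches, smarts)
```

Because '*' matches exactly one atom, this cannot represent a group that is
attached through two or more atoms (for example a pattern matched in the
middle of a chain); such cases would need a different kind of query.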
I am wondering if this is something you can help with.
Marawan
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss