On Jul 23, 2021, at 06:42, Andrew Dalke <[email protected]> wrote:
>
> No, there's no way to do that.
>
> The best I can suggest is to go back to the original Python implementation
> and change the code leading up to
Alternatively, since your template is small, you can brute-force enumerate all
possible matching SMARTS patterns, and test them from largest to smallest.
I believe the following patterns are correct for your template. These are
ordered from largest to smallest by number of bonds, then number of atoms,
then ASCII-betically. (Note: these may contain duplicates because
Chem.MolToSmarts doesn't produce canonical SMARTS.)
[n,c,o]1(-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:1
[n,c,o](-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o]1(-S(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:1
[n,c,o](-S(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O):[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]):[n,c,o]
[n,c,o](-S(=O)=O):[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]):[n,c,o]
[n,c,o](-S(-*)(=O)=O):[n,c,o]
[n,c,o]-S(-*)(=O)=O
[n,c,o](-S(=O)=O):[n,c,o]
[n,c,o]-S(=O)=O
S(-*)(=O)=O
S(=O)=O
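To use the list, try each pattern in order and stop at the first hit. Here is
a minimal sketch of that loop. It uses only a few of the patterns above to
keep it short, and the helper name largest_match and the example SMILES are my
own choices for illustration, not anything from your data.

```python
from rdkit import Chem

# A few patterns from the list above, already ordered largest first.
# In practice you would paste in the full list.
PATTERNS = [
    "[n,c,o]1(-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:1",
    "[n,c,o]-S(-*)(=O)=O",
    "S(-*)(=O)=O",
    "S(=O)=O",
]

def largest_match(mol, patterns=PATTERNS):
    """Return (smarts, atom_indices) for the first matching pattern, else None."""
    for smarts in patterns:
        query = Chem.MolFromSmarts(smarts)
        match = mol.GetSubstructMatch(query)  # empty tuple if no match
        if match:
            return smarts, match
    return None

# Example molecules (hypothetical inputs, chosen only for illustration).
for smiles in ("NS(=O)(=O)c1ccccc1", "CS(N)(=O)=O", "CCO"):
    mol = Chem.MolFromSmiles(smiles)
    print(smiles, largest_match(mol))
```

Because the list is sorted largest first, the first hit is also the largest
pattern in the list that matches the molecule.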
I generated the list with the following script:
===
from rdkit import Chem
import itertools

# Atoms of the required core must be marked with an atom map
# (the atom map value is ignored).
template = (
    '[n,c,o]1(-[S:1](-*)(=[O:1])=[O:1])'
    ':[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:1'
)
mol = Chem.MolFromSmarts(template)

# Figure out which bonds may be deleted. A bond between two mapped
# atoms is part of the core and is always kept.
bond_atom_indices = []
for bond in mol.GetBonds():
    if all(atom.HasProp("molAtomMapNumber")
           for atom in (bond.GetBeginAtom(), bond.GetEndAtom())):
        continue
    bond_atom_indices.append((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))

# Remove the atom maps
for atom in mol.GetAtoms():
    if atom.HasProp("molAtomMapNumber"):
        atom.ClearProp("molAtomMapNumber")

seen = set()
# Enumerate all possible bonds to delete (should be 2**n)
for r in range(0, len(bond_atom_indices)+1):
    for delete_indices in itertools.combinations(bond_atom_indices, r):
        tmp_mol = Chem.RWMol(mol)
        # Remove the selected bonds
        for atom1_idx, atom2_idx in delete_indices:
            tmp_mol.RemoveBond(atom1_idx, atom2_idx)
        # Remove any singletons. Start from the end so the indices are stable.
        for atom in list(tmp_mol.GetAtoms())[::-1]:
            if not atom.GetBonds():
                tmp_mol.RemoveAtom(atom.GetIdx())
        # Get the corresponding SMARTS
        tmp_smarts = Chem.MolToSmarts(tmp_mol)
        # Ensure it's singly connected
        if "." in tmp_smarts:
            continue
        # Ensure it's unique; track the number of bonds and atoms for later sorting
        key = (tmp_mol.GetNumBonds(), tmp_mol.GetNumAtoms(), tmp_smarts)
        seen.add(key)

# Largest patterns first
for num_bonds, num_atoms, smarts in sorted(seen, reverse=True):
    print(smarts)
===
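As a sanity check on the size of the search, here is a toy model of the same
enumeration that needs only the standard library. The bond list is my own
hand-written copy of the template's topology (the atom indices are arbitrary
labels), so treat it as an illustration of the counting, not as a replacement
for the RDKit script.

```python
import itertools

# Toy copy of the template's bonds. Illustrative labels:
# 0 = substituted ring atom, 1 = S, 2 = *, 3 and 4 = O, 5-9 = other ring atoms.
FIXED_BONDS = [(1, 3), (1, 4)]  # the two S=O bonds, always kept
DELETABLE_BONDS = [(0, 1), (1, 2), (0, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 0)]

def is_connected(bonds):
    """Return True if the atoms touched by `bonds` form a single component."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    start = next(iter(neighbors))
    visited = {start}
    stack = [start]
    while stack:
        for nbr in neighbors[stack.pop()]:
            if nbr not in visited:
                visited.add(nbr)
                stack.append(nbr)
    return len(visited) == len(neighbors)

def count_subsets():
    """Count all bond-deletion subsets and those leaving one connected piece."""
    total = connected = 0
    for r in range(len(DELETABLE_BONDS) + 1):
        for deleted in itertools.combinations(DELETABLE_BONDS, r):
            total += 1
            kept = FIXED_BONDS + [b for b in DELETABLE_BONDS if b not in deleted]
            if is_connected(kept):
                connected += 1
    return total, connected

print(count_subsets())  # (256, 46)
```

With 8 deletable bonds there are 2**8 = 256 subsets, of which 46 stay
connected. The list above has 36 lines, presumably because mirror-image
subsets give identical SMARTS strings and collapse in the `seen` set.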
Andrew
[email protected]
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss