We've been looking at something similar. The following code prints a canonical
SMILES string for the environment around each atom, at every radius from 1 to
maxradius:

    from rdkit import Chem

    def atomenvironments(mol, atno, maxradius=6):
        # atno is passed through unchanged as the first field of each line
        for a in mol.GetAtoms():
            idx = a.GetIdx()
            # radius 0: just the atom itself
            print(atno, idx, 0, a.GetSmarts())
            for iradius in range(1, maxradius + 1):
                env = Chem.FindAtomEnvironmentOfRadiusN(mol, iradius, idx)
                amap = {}
                submol = Chem.PathToSubmol(mol, env, atomMap=amap)
                # amap stays empty when no environment of this radius exists
                if amap.get(idx) is not None:
                    print(atno, idx, iradius,
                          Chem.MolToSmiles(submol, rootedAtAtom=amap[idx],
                                           canonical=True))

You can then load the output into a DB for searching. You might be able to
tweak this to suit your purposes?
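
As a rough illustration of the DB step, here is a minimal sketch that loads
lines in the format printed above (label, atom index, radius, SMILES) into an
in-memory SQLite table and looks up exact environment matches. The table and
column names, the two helper functions and the sample rows are all made up for
illustration; the rows are not real RDKit output.

```python
import sqlite3

# Hypothetical sample in the whitespace-separated format described above:
# label, atom index, radius, environment SMILES.
# These rows are invented for illustration, not real RDKit output.
SAMPLE_OUTPUT = """\
1 2 0 N
1 2 1 N(C)C
1 2 2 N(CC)CC
2 0 0 N
2 0 1 N(C)C
2 0 2 N(CC)CC(C)C
3 5 0 N
3 5 1 NC
"""


def load_environments(conn, text):
    """Create the table and insert one row per well-formed output line."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS env "
        "(molno INTEGER, atomidx INTEGER, radius INTEGER, smiles TEXT)"
    )
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        molno, atomidx, radius, smiles = parts
        rows.append((int(molno), int(atomidx), int(radius), smiles))
    conn.executemany("INSERT INTO env VALUES (?, ?, ?, ?)", rows)
    # Index the columns used by the lookup below.
    conn.execute("CREATE INDEX IF NOT EXISTS env_lookup ON env (radius, smiles)")
    conn.commit()


def find_matches(conn, radius, smiles):
    """Return (molno, atomidx) pairs with an identical environment at this radius."""
    cur = conn.execute(
        "SELECT molno, atomidx FROM env WHERE radius = ? AND smiles = ? "
        "ORDER BY molno, atomidx",
        (radius, smiles),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_environments(conn, SAMPLE_OUTPUT)
print(find_matches(conn, 1, "N(C)C"))
```

Because the environment strings are canonical, an exact string comparison is
enough for this kind of lookup. You would query with the largest radius first
and fall back to smaller ones when nothing matches.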
best wishes
Richard
-----Original Message-----
From: Chris Swain [mailto:[email protected]]
Sent: 20 November 2016 18:44
To: [email protected]
Subject: [Rdkit-discuss] Atom Environments
Hi,
I have a project where I would like to find similar atom environments to a
specified atom in a selected molecule.
For example, suppose I have the query molecule C1CNCC(C1)c1ccccc1 and the
selected atom is the nitrogen.
I also have a file containing SMILES strings and IDs for a list of reference
molecules.
I would like to identify the molecule within the reference molecules that
contains a nitrogen most similar to the selected atom in the query molecule,
even if the rest of the molecule is very different.
My feeling is to start with, say, a 3-atom radius and, if no atom is found
above a set similarity, to repeat the search using a 2-atom radius. To be
honest, I suspect it will require a bit of trial and error to find the
optimum radius.
I'd then want to return the ID of the most similar molecule.
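
One way to sketch the radius fallback described above, assuming atom-centred
Morgan fingerprints with Tanimoto similarity as the similarity measure. That
choice is an assumption made here for illustration and is not specified
anywhere in this thread. The function names, the 0.6 threshold and the small
reference list are likewise hypothetical.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator


def atom_fingerprint(mol, atom_idx, radius):
    """Morgan count fingerprint built only from environments centred on one atom."""
    gen = rdFingerprintGenerator.GetMorganGenerator(radius=radius)
    return gen.GetSparseCountFingerprint(mol, fromAtoms=[atom_idx])


def most_similar_atom(query, query_idx, references, radii=(3, 2), threshold=0.6):
    """Return (similarity, ref_id, atom_idx, radius), or None if nothing qualifies.

    references is a list of (ref_id, mol) pairs. Radii are tried in order, and
    the first radius whose best hit reaches the threshold wins.
    """
    element = query.GetAtomWithIdx(query_idx).GetAtomicNum()
    for radius in radii:
        query_fp = atom_fingerprint(query, query_idx, radius)
        best = None
        for ref_id, ref in references:
            for atom in ref.GetAtoms():
                if atom.GetAtomicNum() != element:
                    continue  # only compare atoms of the same element
                fp = atom_fingerprint(ref, atom.GetIdx(), radius)
                sim = DataStructs.TanimotoSimilarity(query_fp, fp)
                if best is None or sim > best[0]:
                    best = (sim, ref_id, atom.GetIdx(), radius)
        if best is not None and best[0] >= threshold:
            return best
    return None


query = Chem.MolFromSmiles("C1CNCC(C1)c1ccccc1")
# Hypothetical reference set; in practice read the SMILES and IDs from the file.
references = [
    ("ref1", Chem.MolFromSmiles("CCN(CC)CC")),
    ("ref2", Chem.MolFromSmiles("C1CCNCC1")),
]
hit = most_similar_atom(query, 2, references)  # atom 2 is the nitrogen
print(hit)
```

The threshold and the list of radii are the knobs to experiment with. The
returned tuple carries the reference ID along with the radius at which the
match was found.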
I’ve had a look through the examples but have not found anything that close.
Cheers
Chris
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss