Hello,

In the past, I've had very good experience with rooted fingerprints.
They were introduced by Vulpetti et al. as a description of the local
environment of fluorine (LEF) atoms. Later on, we used them to compare
ionization sites in a MoKa retraining study (Gedeck et al.).

The LEF code in the RDKit Contrib directory contains the original code.
Here is a slightly more generic version:

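from rdkit.Chem.AtomPairs import Torsions

def getAtomEnvironmentFP(mol, atom, maxPathLength=7):
  """ Return the atom environment fingerprint around atom (an atom index) """
  # Paths of length maxPathLength that start at the atom
  fp = Torsions.GetHashedTopologicalTorsionFingerprint(mol, nBits=9192,
                                                       targetSize=maxPathLength,
                                                       fromAtoms=[atom])
  # Add the counts for the shorter paths (lengths 2 to maxPathLength - 1)
  for i in range(2, maxPathLength):
    nfp = Torsions.GetHashedTopologicalTorsionFingerprint(mol, nBits=9192,
                                                          targetSize=i,
                                                          fromAtoms=[atom])
    # .items() works with both Python 2 and Python 3
    for bit, v in nfp.GetNonzeroElements().items():
      fp[bit] = fp[bit] + v
  return fp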

You can modify the number of bits used in hashing and the maximum path
length; the values here worked well for the pKa study. The LEF code
used maxPathLength=8 and the same number of bits. To compare
environments, use DataStructs.BulkDiceSimilarity or
DataStructs.DiceSimilarity.
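
To make the comparison step a bit more concrete, here is a minimal
pure-Python sketch of a Dice similarity on count fingerprints, followed by
picking the best-scoring reference. It is only an illustration: the plain
{bit: count} dictionaries stand in for what GetNonzeroElements() returns,
the function names dice_similarity and most_similar, the IDs "ref-A" and
"ref-B", and all counts are made up, and the formula is the usual
count-based Dice definition as I understand it. For real work, use the
DataStructs functions mentioned above.

```python
def dice_similarity(fp_a, fp_b):
    """Dice similarity of two count fingerprints given as {bit: count} dicts."""
    total = sum(fp_a.values()) + sum(fp_b.values())
    if total == 0:
        # Assumption of this sketch: two empty fingerprints score 0.0
        return 0.0
    # For every bit set in both fingerprints, the smaller count is shared
    shared = sum(min(count, fp_b[bit])
                 for bit, count in fp_a.items() if bit in fp_b)
    return 2.0 * shared / total


def most_similar(query_fp, reference_fps):
    """Return (id, similarity) of the best-scoring reference environment."""
    best_id = max(reference_fps,
                  key=lambda ref_id: dice_similarity(query_fp,
                                                     reference_fps[ref_id]))
    return best_id, dice_similarity(query_fp, reference_fps[best_id])


# Hypothetical fingerprints, for illustration only
query = {3: 2, 17: 1, 42: 1}
references = {
    "ref-A": {3: 2, 17: 1, 99: 1},
    "ref-B": {5: 1, 42: 1},
}

print(most_similar(query, references))  # ('ref-A', 0.75)
```

With one fingerprint per candidate atom, keyed by molecule ID, the same
loop returns the ID of the molecule that has the most similar environment.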

Best,

Peter



Vulpetti, A.; Hommel, U.; Landrum, G.; Lewis, R.; Dalvit, C. Design and
NMR-Based Screening of LEF, a Library of Chemical Fragments with Different
Local Environment of Fluorine. J. Am. Chem. Soc. 2009, 131 (36),
12949−12959.

Gedeck, P.; Lu, Y.; Skolnik, S.; Rodde, S.; Dollinger, G.; Jia, W.;
Berellini, G.; Faller, B.; Lombardo, F. The benefit of retraining pKa
studied using internally measured data. J. Chem. Inf. Model. 2015, 55,
1449-1459. DOI: http://dx.doi.org/10.1021/acs.jcim.5b00172

On Mon, Nov 21, 2016 at 12:33 PM Chris Swain <sw...@mac.com> wrote:

> Hi,
>
> Thanks for this, it gives me a start.
>
> Cheers,
>
> Chris
> > On 21 Nov 2016, at 08:59, Richard Hall <richard.h...@astx.com> wrote:
> >
> > We've been looking at something similar - the following code spits out a
> > canonical SMILES string for each atom at a radius of 1...maxradius.
> >
> > from rdkit import Chem
> >
> > def atomenvironments(mol, atno, maxradius=6):
> >        for a in mol.GetAtoms():
> >                idx = a.GetIdx()
> >                # radius 0: the atom itself
> >                print(atno, idx, 0, a.GetSmarts())
> >                for iradius in range(0, maxradius+1):
> >                        # bond indices within iradius bonds of atom idx
> >                        env = Chem.FindAtomEnvironmentOfRadiusN(mol, iradius, idx)
> >                        amap = {}
> >                        submol = Chem.PathToSubmol(mol, env, atomMap=amap)
> >                        # idx is missing from amap when no environment was found
> >                        if amap.get(idx) is not None:
> >                                print(atno, idx, iradius,
> >                                      Chem.MolToSmiles(submol, rootedAtAtom=amap[idx], canonical=True))
> >
> > You can then load the output into a DB for searching.  You might be able
> > to tweak this to suit your purposes?
> >
> > best wishes
> > Richard
> >
> > -----Original Message-----
> > From: Chris Swain [mailto:sw...@mac.com]
> > Sent: 20 November 2016 18:44
> > To: rdkit-discuss@lists.sourceforge.net
> > Subject: [Rdkit-discuss] Atom Environments
> >
> > Hi,
> >
> > I have a project where I would like to find atom environments similar to
> > a specified atom in a selected molecule.
> >
> > For example
> >
> > Suppose I have this query molecule C1CNCC(C1)c1ccccc1, and the selected
> > atom is the nitrogen.
> >
> > I also have a file containing SMILES strings and IDs for a list of
> > reference molecules.
> >
> > I would like to identify the molecule within the reference molecules
> > that contains a nitrogen most similar to the selected atom in the query
> > molecule even if the rest of the molecule is very different.
> >
> > My feeling is to start with, say, a 3 atom radius and, if no similar atom
> > is found above a set similarity, to repeat the search using a 2 atom radius,
> > but to be honest I suspect it will require a bit of trial and error to see
> > what the optimum radius is.
> >
> > I'd then want to return the ID of the most similar molecule.
> >
> > I’ve had a look through the examples but not found anything that close.
> >
> > Cheers
> >
> > Chris
> >
> >
> >
> >
> ------------------------------------------------------------------------------
> > _______________________________________________
> > Rdkit-discuss mailing list
> > Rdkit-discuss@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss