Hi Susan,

That's an interesting one, and it happens with simpler molecules too:

>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> # toluene and methylcyclohexane, with explicit Hs for embedding
>>> m1 = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1C'))
>>> m2 = Chem.AddHs(Chem.MolFromSmiles('C1CCCCC1C'))
>>> # generate a 3D conformer for each molecule
>>> AllChem.EmbedMolecule(m1)
0
>>> AllChem.EmbedMolecule(m2)
0
>>> m1p = Chem.RemoveHs(m1)
>>> m2p = Chem.RemoveHs(m2)
>>> # align m1p onto m2p using the six ring atoms
>>> AllChem.AlignMol(m1p,m2p,atomMap=((0,0),(1,1),(2,2),(3,3),(4,4),(5,5)))
0.28968736283385105
>>> from rdkit.Chem import rdShapeHelpers
>>> rdShapeHelpers.ShapeTanimotoDist(m1p,m2p)
0.14472168905950095
>>> rdShapeHelpers.ShapeTanimotoDist(m2p,m1p)
0.1488871834228703


This is a numerical effect and, as you observed, the difference is quite
small.

Here's (at least part of) what's going on:
The RDKit's ShapeTanimoto calculation works by encoding the molecular
shapes on grids and then calculating the Tanimoto distance between the
grids. To make the grid generation as efficient as possible, the
conformations are standardized before the shapes are encoded. This consists
of finding the transformation that aligns the first conformation with the
Cartesian axes (i.e. moving it into its principal axis frame) and then
applying that same transformation to the second conformation. Because the
frame is always taken from the first argument, swapping the arguments
changes how both shapes sit on the grid; this is the source of at least
part of the order dependence.
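
To make the effect concrete, here is a small pure-Python sketch (not the
RDKit implementation). It assumes a deliberately crude shape encoding: a grid
cell counts as occupied if its centre lies within a fixed radius of any atom.
The names voxelize, tanimoto_dist and rotate_z, the toy coordinates, and the
radius and spacing values are all made up for illustration. On a fixed grid
the Tanimoto distance itself is symmetric; what changes the number is the
frame in which the pair is placed on the grid.

```python
import math
from itertools import product


def voxelize(coords, radius=1.5, spacing=0.5):
    """Return the set of grid-cell indices whose centres lie within
    `radius` of any atom (a crude stand-in for a shape grid)."""
    occupied = set()
    reach = int(math.ceil(radius / spacing)) + 1
    for x, y, z in coords:
        # index of the grid cell closest to the atom centre
        ci, cj, ck = round(x / spacing), round(y / spacing), round(z / spacing)
        for di, dj, dk in product(range(-reach, reach + 1), repeat=3):
            i, j, k = ci + di, cj + dj, ck + dk
            px, py, pz = i * spacing, j * spacing, k * spacing
            if (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 <= radius ** 2:
                occupied.add((i, j, k))
    return occupied


def tanimoto_dist(grid_a, grid_b):
    """1 - |A & B| / |A | B| for two sets of occupied cells."""
    union = len(grid_a | grid_b)
    return 1.0 - len(grid_a & grid_b) / union if union else 0.0


def rotate_z(coords, angle):
    """Rigidly rotate a list of (x, y, z) points about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


# Two toy "molecules" (hypothetical coordinates, already aligned).
mol_a = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (2.8, 0.3, 0.0)]
mol_b = [(0.0, 0.0, 0.0), (1.4, 0.2, 0.0), (2.6, 1.0, 0.0)]

# Same pair of shapes, same relative alignment, two different frames.
d_frame1 = tanimoto_dist(voxelize(mol_a), voxelize(mol_b))
d_frame2 = tanimoto_dist(voxelize(rotate_z(mol_a, 0.4)),
                         voxelize(rotate_z(mol_b, 0.4)))
print(d_frame1, d_frame2)
```

The two printed values will typically differ slightly even though the
relative alignment of the two shapes is unchanged, which is the same kind of
discretization effect as taking the standard frame from m1 in one call and
from m2 in the other.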

This implementation exists for historical reasons (it's how we used to do
things at my first employer) and could almost certainly be replaced with a
direct comparison based on the calculated overlap between Gaussians placed
on the individual atoms, if someone wanted to do that work.
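
As a rough idea of what such a Gaussian-based comparison could look like,
here is a minimal sketch. It is not RDKit code, and it makes simplifying
assumptions: every atom gets the same spherical Gaussian exp(-alpha*r^2) with
unit amplitude, the value of alpha is arbitrary, and only pairwise
(first-order) overlaps are summed. The function names are hypothetical. The
overlap of two such Gaussians a distance d apart has the closed form
(pi/(2*alpha))^(3/2) * exp(-alpha*d^2/2), so no grid is needed.

```python
import math


def gaussian_overlap(coords_a, coords_b, alpha=0.8):
    """Sum of pairwise overlap integrals between identical spherical
    Gaussians exp(-alpha * r^2) centred on the atoms of A and B."""
    prefactor = (math.pi / (2.0 * alpha)) ** 1.5
    total = 0.0
    for ax, ay, az in coords_a:
        for bx, by, bz in coords_b:
            d2 = (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
            total += prefactor * math.exp(-0.5 * alpha * d2)
    return total


def gaussian_tanimoto_dist(coords_a, coords_b, alpha=0.8):
    """1 - V_AB / (V_AA + V_BB - V_AB), using first-order overlaps only."""
    v_ab = gaussian_overlap(coords_a, coords_b, alpha)
    v_aa = gaussian_overlap(coords_a, coords_a, alpha)
    v_bb = gaussian_overlap(coords_b, coords_b, alpha)
    return 1.0 - v_ab / (v_aa + v_bb - v_ab)


def rotate_z(coords, angle):
    """Rigidly rotate a list of (x, y, z) points about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


# Two toy "molecules" (hypothetical coordinates, already aligned).
mol_a = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (2.8, 0.3, 0.0)]
mol_b = [(0.0, 0.0, 0.0), (1.4, 0.2, 0.0), (2.6, 1.0, 0.0)]

d_ab = gaussian_tanimoto_dist(mol_a, mol_b)
d_ba = gaussian_tanimoto_dist(mol_b, mol_a)
d_rot = gaussian_tanimoto_dist(rotate_z(mol_a, 0.4), rotate_z(mol_b, 0.4))
print(d_ab, d_ba, d_rot)
```

Because the overlaps depend only on interatomic distances, this measure is
the same in either argument order and in any common frame, up to
floating-point rounding.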

Hope this helps,
-greg


On Mon, May 13, 2019 at 2:00 PM Susan Leung <susan.le...@st-hildas.ox.ac.uk>
wrote:

> Hello!
>
>
> I am trying to calculate the shape Tanimoto distance between two molecules
> (m1 and m2) but I am finding that I get different values depending on the
> order in which I input m1 and m2 into the ShapeTanimotoDist function.
> Should they not be the same? Here is my code and I attach the two ligand
> sdfs:
>
>
> In [1]: from rdkit import Chem
>
> In [2]: m1 = Chem.MolFromMolFile("5qgn_aligned_lig1.sdf")
>
> In [3]: m2 = Chem.MolFromMolFile("5qgi_aligned_lig.sdf")
>
> In [4]: from rdkit.Chem import rdShapeHelpers
>
> In [5]: rdShapeHelpers.ShapeTanimotoDist(m1, m2)
> Out[5]: 0.8132543103448275
>
> In [6]: rdShapeHelpers.ShapeTanimotoDist(m2, m1)
> Out[6]: 0.8145196036191297
>
> Thanks,
>
>
> Susan
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>