On Thu, Nov 9, 2017 at 6:32 AM, Brian Cole <col...@gmail.com> wrote:

> Hi Cheminformaticians,
>
> This is an extreme subtlety in the interpretation of SMILES atom
> stereochemistry and I think a bug in RDKit. Specifically, I think the
> following SMILES should be the same molecule:
>
> >>> rdkit.__version__
> '2017.09.1'
> >>> Chem.CanonSmiles('F[C@@]1(C)CCO1')
> 'C[C@]1(F)CCO1'
> >>> Chem.CanonSmiles('[C@@](F)1(C)CCO1')
> 'C[C@@]1(F)CCO1'
>

As was discussed in the comments of
https://github.com/rdkit/rdkit/issues/786, I think it's pretty gross that
the second syntax is even legal. But that's a side point.

> Since there is no hydrogen inside the stereo carbon atom block, the bond
> being 'looked down' should be the one to the first atom encountered. In
> both cases above, that should be the fluorine; therefore the molecules
> should be equivalent.
>

Agreed, and this is a view that's further supported by this behavior:

In [2]: Chem.CanonSmiles('F[C@@]1(C)CCO1')
Out[2]: 'C[C@]1(F)CCO1'

In [3]: Chem.CanonSmiles('F[C@@](C)1CCO1')
Out[3]: 'C[C@@]1(F)CCO1'
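
The neighbor-ordering argument can be sketched in plain Python, without
RDKit. This is only an illustration under an assumption: that the neighbors of
the stereocenter are taken in the order they appear in the SMILES string
(preceding atom, then ring-closure digits and branches as written). The
neighbor lists below are written out by hand, and the function names are made
up for this example; none of this is RDKit's parser.

```python
def permutation_parity(reference, other):
    """Return 0 if `other` is an even permutation of `reference`, 1 if odd."""
    if len(set(reference)) != len(reference) or sorted(reference) != sorted(other):
        raise ValueError("both orders must list the same distinct neighbors")
    index = {label: i for i, label in enumerate(reference)}
    perm = [index[label] for label in other]
    swaps = 0
    # Sort the permutation in place, counting the swaps needed.
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            swaps += 1
    return swaps % 2


def same_chirality(tag_a, order_a, tag_b, order_b):
    """Two tetrahedral centers match if (same tag, even permutation)
    or (different tag, odd permutation)."""
    return (tag_a == tag_b) == (permutation_parity(order_a, order_b) == 0)


# Hand-written neighbor orders (assumption: order of appearance in the string).
# "O" is the ring-closure partner, "CH3" the methyl, "CH2" the ring carbon.
normal = ["F", "O", "CH3", "CH2"]        # F[C@@]1(C)CCO1
chiral_first = ["F", "O", "CH3", "CH2"]  # [C@@](F)1(C)CCO1
closure_late = ["F", "CH3", "O", "CH2"]  # F[C@@](C)1CCO1

print(same_chirality("@@", normal, "@@", chiral_first))
print(same_chirality("@@", normal, "@@", closure_late))
```

Under that reading, the first pair has identical neighbor orders and the same
tag, so the two strings describe the same stereocenter. The third string lists
the ring closure after the methyl, which is a single swap relative to the
first, so with the same tag it comes out as the opposite configuration.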

Would you mind filing a bug for this and I'll try to track it down/fix it?

Thanks,
-greg



>
> Though it could be argued the 2nd one is not strict SMILES as Andrew
> describes here: https://github.com/rdkit/rdkit/issues/786
>
> It is useful when recombining fragments with ring closure digits for these
> to be equivalent:
> [*][C@]1(C)CCO1
> [C@]([*])1(C)CCO1
>
> Also, every other tool I can get my hands on agrees they're the same:
> OEChem, OpenBabel, indigo, and ChemAxon. (CDK lacks a simple enough
> canonicalization example for me to work from.)
>
> Sure wish there was a SMILES validation test suite we could all run
> against. So I'm attaching the examples I used to verify the above, so that
> whatever poor soul is assigned that task later can find this on Google.
> (I'm hopeful :-)
>
> Thanks,
> Brian
>
> PS: the current output from the script:
>
> $ python stereo_handling_first_atom.py
> RDKit = 2017.09.1
> OEChem = 2.1.2
> OpenBabel = 2.4.1
> indigo = 1.2.3.r0-g98188eb mac10.7
> RDKit failed to recognize these as the same:
> [*:1][C@]1([*:2])CC1(Cl)Cl -> ClC1(Cl)C[C@]1([*:1])[*:2]
> [C@]([*:1])1([*:2])CC1(Cl)Cl -> ClC1(Cl)C[C@@]1([*:1])[*:2]
> OpenBabel failed to recognize these as the same:
> Cl[S@](C)=O -> C[S@](=O)Cl
> [S@](Cl)(C)=O -> C[S@@](=O)Cl
> Indigo failed to recognize these as the same:
> Cl[S@](C)=O -> C[S@](=O)Cl
> [S@](Cl)(C)=O -> C[S@@](=O)Cl
> OpenBabel failed to recognize these as the same:
> Cl[S@](C)=CCCC -> CCCC=[S@](Cl)C
> [S@](Cl)(C)=CCCC -> CCCC=[S@@](Cl)C
> Indigo failed to recognize these as the same:
> Cl[S@](C)=CCCC -> CCCC=[S@@](C)Cl
> [S@](Cl)(C)=CCCC -> CCCC=[S@](C)Cl
> RDKit failed to recognize these as the same:
> Cl[C@](F)1CC[C@H](F)CC1 -> F[C@H]1CC[C@](F)(Cl)CC1
> [C@](Cl)(F)1CC[C@H](F)CC1 -> F[C@H]1CC[C@@](F)(Cl)CC1
> RDKit failed to recognize these as the same:
> Cl[C@]1(c2ccccc2)NCCCS1 -> Cl[C@]1(c2ccccc2)NCCCS1
> [C@](Cl)1(c2ccccc2)NCCCS1 -> Cl[C@@]1(c2ccccc2)NCCCS1
> RDKit failed to recognize these as the same:
> Cl3.[C@]31(c2ccccc2)NCCCS1 -> Cl[C@]1(c2ccccc2)NCCCS1
> [C@](Cl)1(c2ccccc2)NCCCS1 -> Cl[C@@]1(c2ccccc2)NCCCS1
> RDKit failed to recognize these as the same:
> Cl[C@](F)1C2C(C1)CNC2 -> F[C@@]1(Cl)CC2CNCC21
> [C@](Cl)(F)1C2C(C1)CNC2 -> F[C@]1(Cl)CC2CNCC21
> RDKit failed to recognize these as the same:
> [*][C@@H]1CO1 -> [*][C@@H]1CO1
> [C@H]([*])1CO1 -> [*][C@H]1CO1
> RDKit failed to recognize these as the same:
> [*][C@@]1(C)CCO1 -> [*][C@@]1(C)CCO1
> [C@@]([*])1(C)CCO1 -> [*][C@]1(C)CCO1
> RDKit failed to recognize these as the same:
> F[C@@]1(C)CCO1 -> C[C@]1(F)CCO1
> [C@@](F)1(C)CCO1 -> C[C@@]1(F)CCO1
> RDKit failed to recognize these as the same:
> Cl[C@@H]1[C@@H](Cl)C(Cl)CCN1 -> ClC1CCN[C@H](Cl)[C@H]1Cl
> [C@H](Cl)1[C@@H](Cl)C(Cl)CCN1 -> ClC1CCN[C@@H](Cl)[C@H]1Cl
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>