Hi Emanuel,

The chirality bit doesn't have anything to do with double-bond
stereochemistry,[1] so that's not what's going on here.

The RDKit can pass a mol block directly to the InChI code without
interpreting it. I believe that the ChEMBL team is using that to generate
InChIs. In any case, when I use that to pass the mol block downloaded from
ChEMBL (https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL6223.sdf) to
the InChI code, I get the same InChI that is found in ChEMBL.
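
A minimal sketch of that check, assuming an RDKit build with InChI support.
`MolBlockToInchi` is the RDKit function that hands the mol block to the InChI
code; the helper names `fetch_molblock`, `inchi_from_molblock` and
`differing_layers` are made up for illustration, and trimming the SDF record
at "M  END" is my assumption about what should be passed on.

```python
import urllib.request
from itertools import zip_longest


def fetch_molblock(url: str) -> str:
    """Download an SDF record and keep only the mol block (up to 'M  END')."""
    with urllib.request.urlopen(url) as response:
        sdf = response.read().decode("utf-8")
    head, sep, _ = sdf.partition("M  END")
    return head + sep + "\n"


def inchi_from_molblock(molblock: str) -> str:
    """Pass the mol block text to the InChI code without building an RDKit mol."""
    # Imported lazily so the pure-Python helper below works without RDKit.
    from rdkit.Chem import inchi
    return inchi.MolBlockToInchi(molblock)


def differing_layers(inchi_a: str, inchi_b: str) -> list:
    """Return (layer_a, layer_b) pairs for the '/'-separated layers that differ."""
    pairs = zip_longest(inchi_a.split("/"), inchi_b.split("/"))
    return [(a, b) for a, b in pairs if a != b]
```

Calling `inchi_from_molblock(fetch_molblock(url))` with the ChEMBL URL above
and comparing the result to the stored InChI with `differing_layers` should
return an empty list if the two agree. Applied to the two strings from the
original message, it reports only the double-bond layer.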

In this particular case I believe the bug may be in the NAOMI code.

-greg
[1] According to the documentation, it tells you whether a molfile with
specified atomic stereochemistry represents a single stereoisomer (the one
drawn), or whether only the relative configurations of the specified
stereocenters are known, in which case the structure is either a single
diastereomer or a mixture of the two stereoisomers.
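
To see how the flag is set in a given V2000 mol block, here is a minimal
sketch. The helper name `chiral_flag` is my own, not an RDKit function, and
it assumes the fixed-width V2000 counts line (fourth line of the mol block)
with the chiral flag in columns 13-15.

```python
def chiral_flag(molblock: str) -> int:
    """Return the chiral flag (0 or 1) from the counts line of a V2000 mol block."""
    lines = molblock.splitlines()
    if len(lines) < 4:
        raise ValueError("mol block is too short to contain a counts line")
    counts = lines[3]  # three header lines come first
    if "V2000" not in counts:
        raise ValueError("not a V2000 counts line")
    # Fixed-width fields of three characters each: atoms, bonds, atom lists,
    # an obsolete field, then the chiral flag.
    field = counts[12:15].strip()
    return int(field) if field else 0
```

A return value of 1 corresponds to the "single stereoisomer as drawn" case
described above, and 0 to the case where the flag is not set.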

On Wed, Jul 29, 2020 at 11:17 AM Emanuel Ehmki <emanuel.eh...@gmail.com>
wrote:

> Dear All,
>
> I am currently working with the RDKit generated SDF String that is stored
> in the ChEMBL COMPOUND_STRUCTURES table in the ChEMBL database release 26.
> My workflow is:
>
>    - pull SDF (V2000) from SQL table
>    - generate internal molecule representation (NAOMI ChemBio tool-kit if
>    that means anything to you)
>    - generate InChI string and key from molecule
>    - compare with InChI string and key that are stored in the ChEMBL
>    database
>
> When comparing the InChI string for the molecule with the id CHEMBL6223, I
> get two differing strings due to different stereochemistry (last characters)
>
> ChEMBL InChI:
> InChI=1S/C16H13IO2/c17-10-12-8-9-15(16(18)19-12)14-7-3-5-11-4-1-2-6-13(11)14/h1-7,10,15H,8-9H2/b12-10+
>
> NAOMI InChI:
> InChI=1S/C16H13IO2/c17-10-12-8-9-15(16(18)19-12)14-7-3-5-11-4-1-2-6-13(11)14/h1-7,10,15H,8-9H2/b12-10-
>
> While researching why that happens I realized that the SDF string doesn't
> make use of the chirality bit that can be set in the counts line.
> When digging deeper I found the disabled block in the MolFileWriter.cpp ->
> MolToMolBlock function
>
> https://github.com/rdkit/rdkit/blob/f14f8a60de0ecf4bf5294d73b177d19055e0096d/Code/GraphMol/FileParsers/MolFileWriter.cpp#L1395
>
> Do I understand correctly that RDKit does not store any information about
> chirality in V2000 and includes chiral information only in V3000 SDF format?
>
> Does anyone know when ChEMBL might switch to that version?
>
> Kind regards,
> Emanuel
> _______________________________________________
> Rdkit-devel mailing list
> Rdkit-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-devel
>
