Hi Wandré,

Your problem is the opposite. It is quite unlikely, actually impossible,
that different molecules yield the same InChI or SMILES. Your bigger
problem is that what you regard as the same chemical is treated as
different ones by SMILES or InChI. This danger is quite big for SMILES.
It gets better with canonical SMILES (but in my opinion, not by much);
your best friend is InChI or Standard InChI.

Also, if two different molecules did yield the same InChI or SMILES,
in all likelihood all your descriptors would be very similar too, because
SMILES, InChI etc. are just representations of the connection table, and
the descriptor algorithms work only on that connection table (so the
molecules also look the same to any of these algorithms).
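
To illustrate the connection-table point, here is a minimal sketch in plain
Python (no RDKit). The function toy_descriptors and the tuple layout
(atom symbols, bond index pairs) are hypothetical names made up for this
example, hydrogens are left implicit, and the ring count assumes a single
connected fragment. It only shows that descriptors derived from the
connection table do not change when the same molecule is written in a
different atom order.

```python
from collections import Counter


def toy_descriptors(atoms, bonds):
    """Compute a few toy descriptors from a connection table only.

    atoms: list of element symbols (heavy atoms, hydrogens implicit)
    bonds: list of (i, j) index pairs into atoms
    """
    degree = Counter()
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    return {
        "num_atoms": len(atoms),
        "num_hetero": sum(1 for a in atoms if a != "C"),
        # Cyclomatic number; assumes one connected fragment.
        "ring_count": len(bonds) - len(atoms) + 1,
        # Element plus number of heavy-atom neighbours, order-independent.
        "element_degrees": sorted((a, degree[k]) for k, a in enumerate(atoms)),
    }


# Ethanol written in two atom orders (think "CCO" versus "OCC").
ethanol_cco = (["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_occ = (["O", "C", "C"], [(0, 1), (1, 2)])

desc_cco = toy_descriptors(*ethanol_cco)
desc_occ = toy_descriptors(*ethanol_occ)
print(desc_cco == desc_occ)
```

Both inputs describe the same graph, so the descriptors agree even though
the atom lists differ.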

Calculating a Tanimoto-type coefficient doesn't help with this problem
either: a Tanimoto coefficient of 1 doesn't mean two molecules are
identical (they are very similar, but not necessarily identical).
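
A small sketch of why a Tanimoto coefficient of 1 does not prove identity,
again in plain Python. The fingerprint here is a deliberately crude,
hypothetical one (toy_fingerprint records only which elements and which
bonded element pairs are present); it is not the Atompair fingerprint or
any RDKit function. The same effect can occur with any fingerprint that
records presence of features rather than the full structure.

```python
def toy_fingerprint(atoms, bonds):
    """Set of features present: element symbols and bonded element pairs."""
    features = set(atoms)
    for i, j in bonds:
        features.add("-".join(sorted((atoms[i], atoms[j]))))
    return features


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared features divided by all features."""
    union = fp_a | fp_b
    if not union:
        return 1.0
    return len(fp_a & fp_b) / len(union)


# Heavy-atom connection tables (atoms, bonds); hydrogens implicit.
butane = (["C"] * 4, [(0, 1), (1, 2), (2, 3)])
hexane = (["C"] * 6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)])
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])

sim_alkanes = tanimoto(toy_fingerprint(*butane), toy_fingerprint(*hexane))
sim_mixed = tanimoto(toy_fingerprint(*butane), toy_fingerprint(*ethanol))
print(sim_alkanes)  # 1.0, although butane and hexane are different molecules
print(sim_mixed)    # 0.5
```

Butane and hexane end up with the same feature set, so the coefficient is
1 even though the molecules differ.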

Markus

On Wed, Sep 13, 2017 at 8:43 PM, Wandré <wandrevel...@gmail.com> wrote:

> Thanks for all the answers.
>
> Reading all the answers, I thought of something different... If SMILES
> (Chem.MolToSmiles(mol,isomericSmiles=True)) and InChI
> (Chem.MolToInchi(mol)) can generate the same value for different molecules,
> I will generate other descriptors (NumHDonors, NumHAcceptors,
> RingCount, GetNumAtoms, TPSA, pyLabuteASA, MolWt, CalcNumRotatableBonds
> and MolLogP) to compare all the molecules whose SMILES and InChI are the
> same.
> If all these data are the same, I will generate a fingerprint (Atompair,
> for example) and use the Tanimoto coefficient; if this value is 1 when I
> compare two molecules, the molecules are the same.
>
> Where is my mistake (I think there are one or more mistakes)?
>
> Thanks!
>
> --
> Wandré Nunes de Pinho Veloso
> Assistant Professor - Unifei - Campus Avançado de Itabira-MG
> PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais - UFMG
> Researcher at INSILICO - Grupo Interdisciplinar em Simulação e
> Inteligência Computacional - UNIFEI
> Member of the research group Assinaturas Biológicas, FIOCRUZ
> Member of the research group Bioinformática Estrutural, UFMG
> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>
> 2017-09-13 14:19 GMT-03:00 Dimitri Maziuk <dmaz...@bmrb.wisc.edu>:
>
>> On 09/13/2017 11:46 AM, Markus Sitzmann wrote:
>> > The case that you have 3D information available for a molecule dataset
>> > is rare, if you want it trustworthy it gets even worse than that. And what
>> > is the point then to generate the configuration of a molecule first if you
>> > can not trust that either?
>>
>> Veering further off topic, do you even care in the first place? E.g. if
>> your molecule always exists as a mixture of isomers, except in some
>> megabuck-per-microgram painstakingly created reference samples, a
>> 3D-based system will represent it as two distinct molecules. Whereas you
>> want it represented as one.
>>
>> Last I looked PDB Ligand Expo had two different benzenes. Their software
>> doesn't (didn't?) do the circle version so they don't have the third one.
>>
>> --
>> Dimitri Maziuk
>> Programmer/sysadmin
>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>
>>
>> ------------------------------------------------------------
>> ------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>