I think fingerprint-based comparison will still not cope with equivalent structures, and I doubt that fingerprints are guaranteed to be ‘collision’ free.
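A minimal sketch of the collision concern, assuming a reasonably recent RDKit that ships `rdFingerprintGenerator`. The two alkanes and the Morgan settings (radius 2, 2048 bits) are arbitrary choices for illustration and are not taken from anyone's actual data:

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

# Two clearly different molecules: decane (C10) and dodecane (C12).
decane = Chem.MolFromSmiles("CCCCCCCCCC")
dodecane = Chem.MolFromSmiles("CCCCCCCCCCCC")

# Morgan bit-vector fingerprints; radius and size are illustrative only.
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fp_decane = generator.GetFingerprint(decane)
fp_dodecane = generator.GetFingerprint(dodecane)

# A bit vector records which atom environments occur, not how often,
# so chains built from the same environments can look identical.
similarity = DataStructs.TanimotoSimilarity(fp_decane, fp_dodecane)

# Canonical SMILES, by contrast, still tells the two apart.
same_smiles = Chem.MolToSmiles(decane) == Chem.MolToSmiles(dodecane)

print("Tanimoto:", similarity, "| same canonical SMILES:", same_smiles)
```

Both chains contain the same set of local atom environments, so I would expect the similarity to come out as 1.0 even though the molecules differ. In other words, a similarity of 1.0 is not proof of identity.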
Cheers,
Steve.

From: Christos Kannas [mailto:chriskan...@gmail.com]
Sent: 28 November 2016 17:32
To: Stephen O'hagan <soha...@manchester.ac.uk>
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] comparing two or more tables of molecules

Hi Steve,

I think it would be better to use a similarity metric based on fingerprints.

Regards,
Christos

Christos Kannas
Researcher
Ph.D Student

On 28 November 2016 at 18:25, Stephen O'hagan <soha...@manchester.ac.uk> wrote:

Has anyone come up with a fool-proof way of matching structurally equivalent molecules?

Unique SMILES or InChI string comparisons don’t appear to work, presumably because there are different but equivalent structures, e.g. explicit vs non-explicit H’s, Kekulé vs aromatic, isomeric vs non-isomeric forms, tautomers etc.

I also expect that comparing InChI strings might need something more than a simple string comparison, such as masking off stereo information when you don’t care about stereoisomers.

I assume there are suitable tools within RDKit that can do this?

N.B. I need to collate tables from several sources that have a mix of SMILES / InChI / SDF molecular representations.

I usually use RDKit via Python and/or KNIME.

Cheers,
Steve.
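One possible way to approach the matching question above is to reduce every input, whatever its source format, to a single normalised key and compare the keys. The sketch below is only an illustration under stated assumptions: `match_key` and `mol_from_text` are hypothetical helper names, the RDKit build is assumed to include InChI support and `rdMolStandardize`, and tautomer canonicalisation is left optional because it will not merge every tautomer pair.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

# Hypothetical helpers for illustration; not an official RDKit recipe.
_tautomer_enumerator = rdMolStandardize.TautomerEnumerator()


def mol_from_text(text):
    """Parse an InChI or SMILES string; returns None if parsing fails."""
    if text.startswith("InChI="):
        return Chem.MolFromInchi(text)
    return Chem.MolFromSmiles(text)


def match_key(mol, ignore_stereo=True, canonical_tautomer=False):
    """Build a canonical SMILES key for equality comparison."""
    # Drop explicit hydrogens so explicit and implicit H forms agree.
    mol = Chem.RemoveHs(mol)
    if canonical_tautomer:
        # Best effort only: maps the tautomers RDKit can enumerate
        # onto one representative.
        mol = _tautomer_enumerator.Canonicalize(mol)
    # isomericSmiles=False omits stereo (and isotope) labels.
    return Chem.MolToSmiles(mol, isomericSmiles=not ignore_stereo)


# Kekule vs aromatic, and explicit vs implicit hydrogens.
phenol_kekule = match_key(mol_from_text("C1=CC=CC=C1O"))
phenol_aromatic = match_key(mol_from_text("[H]Oc1ccccc1"))

# Same compound from SMILES and from InChI.
ethanol_smiles = match_key(mol_from_text("OCC"))
ethanol_inchi = match_key(mol_from_text("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"))

# Two enantiomers, with and without stereo masked off.
ala_l = mol_from_text("C[C@H](N)C(=O)O")
ala_d = mol_from_text("C[C@@H](N)C(=O)O")
stereo_ignored = match_key(ala_l) == match_key(ala_d)
stereo_kept = match_key(ala_l, ignore_stereo=False) == match_key(
    ala_d, ignore_stereo=False
)
```

Records from an SD file could be fed through the same `match_key` function after reading them with `Chem.SDMolSupplier`. Salts, charge states and the harder tautomer cases would still need extra standardisation before the keys can be trusted.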
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss