On 02/23/2015 06:12 AM, Markus Sitzmann wrote:

> And I am not saying it is perfect, it just provides another
> implementation to double-check things in question. It has the CACTVS
> chemoinformatic toolkit as chemistry backend which I think is
> well-tested.

> On Mon, Feb 23, 2015 at 10:54 AM, JP <jeanpaul.ebe...@inhibox.com> wrote:

>> 3,257 molecules (of 6,940,083) gave me different InChIs between the
>> current production version and the development (github) one.

Just as another data point, out of 14 metabolites that have gone through
my scripts so far:

1. cis-vaccenic acid, PubChem CID 5282761: the InChI produced by RDKit
2014.09.2 differs from the one produced by OpenBabel. The InChI from
PubChem's SDF agrees with the latter. (RDKit's ends with "/b8-7+", OB's
and PubChem's with "7-".)

InChI code does spit out "undefined stereochemistry" warnings for OB's
"we don't need no C.I.P., it's sooo last century" stereo -- and then
includes the stereo layer in the output anyway. (Though in this case
PubChem's presumably OpenEye stereo seems to agree with OB and not RDKit.)

2. 5,10,15,20-Tetraphenyl-21H,23H-porphine zinc, PubChem CID 3580039:
the InChI in the SDF ends at the "/q" layer, while both RDKit and
OpenBabel add a "/b" layer. They all agree otherwise, though.

14 data points are nowhere near enough for meaningful conclusions, but
still... 2 of 14 (about 14%) won't match the plain string comparison
that most searches do, and 1 of 14 (about 7%) won't match the "clever
InChI-aware comparison" search, assuming it's implemented anywhere.
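
For what it's worth, here is a rough sketch of what I mean by a "clever
InChI-aware comparison": split each string into layers and only require
agreement on the layers both sides actually have. This is one possible
reading, not something any toolkit does that I know of. The function
names are made up, and the 2-butene strings are stand-ins for
illustration, not the metabolites above.

```python
from collections import defaultdict


def split_layers(inchi):
    """Return (version, [(prefix, content), ...]).

    The formula layer has no prefix letter and is stored under ''.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string: %r" % (inchi,))
    version, *rest = inchi[len("InChI="):].split("/")
    layers = []
    for part in rest:
        if part and part[0].islower():
            layers.append((part[0], part[1:]))
        else:
            layers.append(("", part))
    return version, layers


def by_prefix(layers):
    """Group layer contents by prefix; a prefix may occur more than once."""
    grouped = defaultdict(list)
    for prefix, content in layers:
        grouped[prefix].append(content)
    return dict(grouped)


def differing_layers(a, b):
    """Sorted prefixes whose content differs or exists on one side only."""
    da = by_prefix(split_layers(a)[1])
    db = by_prefix(split_layers(b)[1])
    return sorted(p for p in set(da) | set(db) if da.get(p) != db.get(p))


def layer_compatible(a, b):
    """True if versions match and every layer present in BOTH agrees.

    A layer present on one side only (e.g. an extra "/b") is tolerated.
    """
    va, la = split_layers(a)
    vb, lb = split_layers(b)
    da, db = by_prefix(la), by_prefix(lb)
    return va == vb and all(da[p] == db[p] for p in set(da) & set(db))


# Stand-in strings: 2-butene with "+", "-", and no double-bond layer.
E_BUTENE = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3+"
Z_BUTENE = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3-"
NO_STEREO = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3"

if __name__ == "__main__":
    print(differing_layers(E_BUTENE, Z_BUTENE))
    print(layer_compatible(E_BUTENE, Z_BUTENE))
    print(layer_compatible(NO_STEREO, E_BUTENE))
```

With this reading, a pair like case 2 (one side merely lacks "/b")
passes, while a pair like case 1 ("+" versus "-" in "/b") fails both
this and the plain string comparison.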

And then you get about 0.05% (3,257 of 6,940,083) differing between
different versions of the same software...

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
