Otis,

Yes, StdInChI is a good way of determining whether two molecules from different sources have the same connectivity/stereochemistry, and indeed this is part of the reason for InChI's creation.*

The process of creating a standard InChI discards tautomer and resonance form information, so different sources with different representations should still give the same InChI. Representation can still be a problem for some inorganic compounds. InChI breaks all bonds to metals, but molecules like ferrocene still end up with two possible InChIs depending on whether the iron is considered neutral or +2 charged. InChI also doesn't handle some cases such as ring-chain tautomerism.

A comparison with canonical SMILES will have all these issues and will also be tautomer/resonance form specific (although of course one could apply "business rules" to normalize the representation prior to creating the canonical SMILES).

Daniel
* You can also use the layered nature of InChI to check, when two molecules differ, whether that difference is due to different constitution/connectivity/hydrogen placement or to stereochemistry.
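
To make the footnote concrete, here is a minimal Python sketch of a layer-by-layer comparison. It is not part of the InChI toolkit: split_layers, differing_layers and difference_kinds are hypothetical helper names, and the parsing is deliberately naive (it assumes a formula layer is present and keeps only the first occurrence of each layer prefix, so repeated prefixes inside an isotopic or fixed-hydrogen section are ignored). The two alanine strings are the standard InChIs of L- and D-alanine.

```python
# Hypothetical helpers for illustration; not an InChI toolkit API.
L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
D_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

# Rough mapping from layer prefix to the kind of difference it signals.
CATEGORY = {
    "formula": "constitution",
    "c": "connectivity",
    "h": "hydrogen placement",
    "q": "charge",
    "p": "charge",
    "b": "stereochemistry",
    "t": "stereochemistry",
    "m": "stereochemistry",
    "s": "stereochemistry",
    "i": "isotopes",
}


def split_layers(inchi: str) -> dict:
    """Map layer prefix -> layer text (first occurrence of each prefix only)."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string: %r" % inchi)
    version, _, rest = inchi[len("InChI="):].partition("/")
    layers = {"version": version}
    parts = rest.split("/") if rest else []
    if parts:
        # Assumption: the first layer after the version is the formula.
        layers["formula"] = parts[0]
    for part in parts[1:]:
        if part:
            layers.setdefault(part[0], part[1:])
    return layers


def differing_layers(a: str, b: str) -> list:
    """Return the sorted layer prefixes whose contents differ."""
    la, lb = split_layers(a), split_layers(b)
    return sorted(k for k in la.keys() | lb.keys() if la.get(k) != lb.get(k))


def difference_kinds(a: str, b: str) -> list:
    """Summarise the differing layers as broad categories."""
    return sorted({CATEGORY.get(k, "other") for k in differing_layers(a, b)})


print(differing_layers(L_ALANINE, D_ALANINE))   # ['m']
print(difference_kinds(L_ALANINE, D_ALANINE))   # ['stereochemistry']
```

For anything beyond a quick check, parsing with the InChI software itself is the safer route, since it knows the full layer grammar.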

On 10/05/2016 17:52, Otis Rothenberger wrote:
Daniel,

Thanks for taking the time to jump in here!

As long as we have your attention, I’ll ask a question about InChI string comparison:

Is a simple string comparison of two **standard** InChIs, perhaps from different sources, an acceptable method for determining compound identity? By identity, I mean the same kind of connective identity that could be established with a stereo SMILES comparison.

If there is uncertainty in the above InChI comparison, would pulling the **standard** InChI from a single source like Resolver improve the comparison?

Otis

--
Otis Rothenberger
o...@chemagic.org <mailto:o...@chemagic.org>
http://chemagic.org

On May 10, 2016, at 12:20 PM, Daniel Lowe <dan...@nextmovesoftware.com <mailto:dan...@nextmovesoftware.com>> wrote:

Dear Pierluigi and Otis,

OPSIN as a Java library and via its JSON api (e.g. http://opsin.ch.cam.ac.uk/opsin/benzene.json) can return standard or non-standard InChI.
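
As a rough illustration of the JSON route, the Python sketch below builds the URL in the same form as the benzene example and pulls the InChI-related fields out of the response. The field names "inchi", "stdinchi" and "stdinchikey" are an assumption about the response layout, not something confirmed here, so check them against an actual response; opsin_url, fetch_opsin and pick_identifiers are hypothetical helper names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

OPSIN_BASE = "http://opsin.ch.cam.ac.uk/opsin/"

# Assumed field names in the JSON response; verify against a real reply.
ID_FIELDS = ("inchi", "stdinchi", "stdinchikey")


def opsin_url(name: str) -> str:
    """Build the JSON URL for a chemical name, percent-encoding it."""
    return OPSIN_BASE + quote(name, safe="") + ".json"


def fetch_opsin(name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON document for a name (needs network access)."""
    with urlopen(opsin_url(name), timeout=timeout) as response:
        return json.load(response)


def pick_identifiers(payload: dict) -> dict:
    """Keep only the InChI-related fields; missing fields become None."""
    return {field: payload.get(field) for field in ID_FIELDS}


if __name__ == "__main__":
    print(opsin_url("benzene"))
    # Uncomment to query the service:
    # print(pick_identifiers(fetch_opsin("benzene")))
```

Using .get for each field means an unexpected response layout shows up as None values rather than an exception.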

The non-standard InChI OPSIN produces differs from StdInChI by including the fixed hydrogen layer (i.e. tautomer information). The reasoning for this is that IUPAC names are typically tautomer specific so at the time I initially added InChI support (quite a few years ago now) I thought it would be useful to retain this information. As InChI is hierarchical one can get an InChI with identical layers as StdInChI by stripping the fixed hydrogen layer (using the InChI toolkit or otherwise). Obviously the same is not true in reverse as StdInChI (intentionally) lacks tautomer information.
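
The "or otherwise" route can be sketched in a few lines of Python. This is a simplification rather than what the InChI toolkit does: it drops everything from the "/f" marker onwards, which assumes there is no reconnected ("/r") layer after the fixed hydrogen section and that the InChI was otherwise generated with standard options. strip_fixed_h and same_layers_as_std are hypothetical names, and the 2-pyridone / 2-hydroxypyridine strings are illustrative examples written by hand, not output captured from OPSIN.

```python
# Illustrative strings (written by hand, not generated output):
# the two tautomers differ only in the fixed hydrogen layer.
PYRIDONE_FIXED_H = "InChI=1/C5H5NO/c7-5-3-1-2-4-6-5/h1-4H,(H,6,7)/f/h6H"
HYDROXYPYRIDINE_FIXED_H = "InChI=1/C5H5NO/c7-5-3-1-2-4-6-5/h1-4H,(H,6,7)/f/h7H"
STD_INCHI = "InChI=1S/C5H5NO/c7-5-3-1-2-4-6-5/h1-4H,(H,6,7)"


def strip_fixed_h(inchi: str) -> str:
    """Drop the fixed hydrogen section, i.e. everything from '/f' onwards.

    Assumption: no reconnected ('/r') layer follows the fixed-H section.
    """
    head, _, _ = inchi.partition("/f")
    return head


def same_layers_as_std(nonstd: str, std: str) -> bool:
    """Compare layers after the version prefix ('1' versus '1S')."""
    return strip_fixed_h(nonstd).partition("/")[2] == std.partition("/")[2]


print(strip_fixed_h(PYRIDONE_FIXED_H))
print(same_layers_as_std(PYRIDONE_FIXED_H, STD_INCHI))
print(same_layers_as_std(HYDROXYPYRIDINE_FIXED_H, STD_INCHI))
```

Note that the stripped string still carries the "1" version prefix rather than "1S"; the sketch only compares layers and does not relabel the result as a standard InChI.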

In practice, utilisation of the layered nature of InChI is rare and virtually all sites with InChI now use StdInChI, so perhaps I should change opsin.ch.cam.ac.uk to display StdInChI in the GUI.

Standard InChIKey is used as this is the format of InChI key used in various internet databases hence allowing one to Google for a specific structure.

As pointed out earlier, the NCI's chemical identifier resolver uses a copy of OPSIN internally, albeit I think the version may now be quite out of date.

My main goal while developing OPSIN was to use it in automated text-mining. As a result, while I aim to keep to an absolute minimum cases where the resultant structure is not in agreement with the name, I do tolerate minor errors in the input, e.g. missing/extra spaces. As names like "5-chlorohexane" only have one plausible interpretation, trying to flag these as "incorrect" was a low priority. I'm not actually aware of any existing name to structure algorithm that would reject that name. I think to solve the general case (e.g. 2-propylbutane ["incorrect"] vs 3-methylhexane [correct]) you'd have to have written a fair chunk of a chemical structure to name algorithm to know that it was incorrect!

Daniel

"For reasons that I do not fully understand, OPSIN returns InChI - not standard 
InChI. It does, however, return standard InChIKey! In general, I’d be very nervous about 
comparing InChI strings created by different resources."
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
Jmol-users mailing list
Jmol-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jmol-users





--
Daniel Lowe
Senior Software Engineer
NextMove Software Limited
Registered in England No. 07588305
Registered Office: Innovation Centre (Unit 23), Cambridge Science Park, 
Cambridge CB4 0EY
