Otis,
Yes, StdInChI is a good way of determining whether two molecules from
different sources have the same connectivity/stereochemistry, and indeed
this is part of the reason for InChI's creation.*
The process of creating a standard InChI discards tautomer and resonance
form information, so different sources with different representations
should still give the same InChI.
Representation can sometimes still be a problem for some inorganic
compounds. InChI breaks all bonds to metals, but molecules like
ferrocene still end up with two possible InChIs depending on whether the
iron is considered neutral or +2 charged. InChI also doesn't handle some
issues like ring-chain tautomerism.
A comparison with canonical SMILES will have all these issues and will
also be tautomer/resonance form specific (although of course one could
apply "business rules" to normalize representation prior to creating the
canonical SMILES).
Daniel
* If two molecules are different, you can also use the layered nature of
InChI to check whether that difference is due to different
constitution/connectivity/hydrogen placement or to stereochemistry.
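To illustrate the footnote, here is a minimal Python sketch that groups
the layers of two standard InChIs and reports which groups differ. The
function names and the grouping labels are my own, not part of any InChI
toolkit, and the sketch assumes well-formed standard InChIs (prefix
InChI=1S/) as input.

```python
# Sketch only: group the layers of a standard InChI by their one-letter
# prefix so two strings can be compared group by group.
# The group names below are illustrative labels, not official InChI terms.
LAYER_GROUPS = {
    "c": "connectivity",
    "h": "hydrogens",
    "q": "charge",
    "p": "charge",
    "b": "stereochemistry",
    "t": "stereochemistry",
    "m": "stereochemistry",
    "s": "stereochemistry",
}
GROUP_ORDER = ["formula", "connectivity", "hydrogens", "charge",
               "stereochemistry", "isotopic", "other"]


def group_layers(std_inchi):
    """Return a dict mapping a group name to the layer text in that group."""
    prefix = "InChI=1S/"
    if not std_inchi.startswith(prefix):
        raise ValueError("expected a standard InChI (InChI=1S/...)")
    segments = std_inchi[len(prefix):].split("/")
    groups = {"formula": segments[0]}
    in_isotopic = False
    for segment in segments[1:]:
        # Everything from the /i layer onwards is treated as isotopic,
        # because layers such as /t or /m can repeat there.
        if segment.startswith("i"):
            in_isotopic = True
        name = "isotopic" if in_isotopic else LAYER_GROUPS.get(segment[:1], "other")
        groups[name] = groups.get(name, "") + "/" + segment
    return groups


def differing_groups(inchi_a, inchi_b):
    """List the layer groups in which the two standard InChIs differ."""
    a, b = group_layers(inchi_a), group_layers(inchi_b)
    return [name for name in GROUP_ORDER if a.get(name) != b.get(name)]


l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
d_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"
ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
dimethyl_ether = "InChI=1S/C2H6O/c1-3-2/h1-2H3"

print(differing_groups(l_alanine, d_alanine))    # stereo layers only
print(differing_groups(ethanol, dimethyl_ether))  # connectivity and hydrogens
```

An empty list means the two strings are identical, which is the simple
string comparison asked about; a non-empty list tells you where they
diverge.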
On 10/05/2016 17:52, Otis Rothenberger wrote:
Daniel,
Thanks for taking the time to jump in here!
As long as we have your attention, I’ll ask a question about InChI
string comparison:
Is a simple string comparison of two **standard** InChIs, perhaps
from different sources, an acceptable method for determining compound
identity? By identity, I mean the same kind of connective identity
that could be established with a stereo SMILES comparison.
If there is uncertainty in the above InChI comparison, would pulling
the **standard** InChI from a single source like Resolver improve the
comparison?
Otis
--
Otis Rothenberger
o...@chemagic.org
http://chemagic.org
On May 10, 2016, at 12:20 PM, Daniel Lowe
<dan...@nextmovesoftware.com> wrote:
Dear Pierluigi and Otis,
OPSIN as a Java library and via its JSON api (e.g.
http://opsin.ch.cam.ac.uk/opsin/benzene.json) can return standard or
non-standard InChI.
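As a rough sketch of calling the JSON api from Python: the URL pattern
below is taken from the benzene example above, while the helper names,
the percent-encoding of the chemical name and the timeout are my own
assumptions. The fields of the returned JSON are not listed here, so it
is safer to inspect the result than to assume particular keys.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the benzene example; everything else is illustrative.
OPSIN_BASE = "http://opsin.ch.cam.ac.uk/opsin/"


def opsin_json_url(name):
    """Build the JSON URL for a chemical name (percent-encoding is an assumption)."""
    return OPSIN_BASE + quote(name, safe="") + ".json"


def fetch_opsin(name, timeout=10):
    """Fetch and decode the JSON document for a name. Requires network access."""
    with urlopen(opsin_json_url(name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(opsin_json_url("benzene"))
```

Calling fetch_opsin("benzene") and printing sorted(result.keys()) is a
quick way to see which representations the service actually returns.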
The non-standard InChI OPSIN produces differs from StdInChI by
including the fixed hydrogen layer (i.e. tautomer information). The
reasoning for this is that IUPAC names are typically tautomer
specific so at the time I initially added InChI support (quite a few
years ago now) I thought it would be useful to retain this
information. As InChI is hierarchical one can get an InChI with
identical layers as StdInChI by stripping the fixed hydrogen layer
(using the InChI toolkit or otherwise). Obviously the same is not
true in reverse as StdInChI (intentionally) lacks tautomer information.
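For the "or otherwise" route, a minimal string-based sketch of stripping
the fixed hydrogen layer might look like the following. It assumes the
input has no reconnected (/r) layer, the function names are hypothetical,
and the 2-pyridone strings are illustrative inputs that were not
regenerated with the InChI software here. Note that the version prefix
stays "1" rather than "1S", so only the layers are compared.

```python
def strip_fixed_h(inchi):
    """Drop the fixed hydrogen layer (/f and everything after it).

    Assumption: the InChI has no reconnected (/r) layer, so all layers
    following /f belong to the fixed hydrogen layer.
    """
    if "/r" in inchi:
        raise ValueError("reconnected layer present; not handled by this sketch")
    head, _, _ = inchi.partition("/f")
    return head


def layers_only(inchi):
    """Return the layers without the 'InChI=1' or 'InChI=1S' version prefix."""
    return inchi.split("/", 1)[1]


def same_layers_as_standard(fixed_h_inchi, std_inchi):
    """Compare a fixed-H InChI with a standard InChI, ignoring the /f layer."""
    return layers_only(strip_fixed_h(fixed_h_inchi)) == layers_only(std_inchi)


# Illustrative strings for 2-pyridone (fixed-H form and standard form).
fixed_h = "InChI=1/C5H5NO/c7-5-3-1-2-4-6-5/h1-4H,(H,6,7)/f/h6H"
standard = "InChI=1S/C5H5NO/c7-5-3-1-2-4-6-5/h1-4H,(H,6,7)"

print(strip_fixed_h(fixed_h))
print(same_layers_as_standard(fixed_h, standard))
```

Going through the InChI toolkit is the more robust option; this is only
meant to show why the operation works in one direction and not the other.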
In practice utilisation of the layered nature of InChI is rare and
virtually all sites with InChI now use StdInChI, so perhaps I should
change opsin.ch.cam.ac.uk to display StdInChI in the GUI.
Standard InChIKey is used as this is the format of InChIKey used in
various internet databases, hence allowing one to Google for a
specific structure.
As pointed out earlier, the NCI's chemical identifier resolver uses a
copy of OPSIN internally, albeit I think the version may now be quite
out of date.
My main goal while developing OPSIN was to use it in automated
text-mining. As a result while I aim to keep to an absolute minimum
cases where the resultant structure is not in agreement with the
name, I do tolerate minor errors in the input e.g. missing/extra
spaces. As names like "5-chlorohexane" only have one plausible
interpretation trying to flag these as "incorrect" was a low
priority. I'm not actually aware of any existing name to structure
algorithm that would reject that name. I think to solve the general
case (e.g. 2-propylbutane ["incorrect"] vs 3-methylhexane [correct])
you'd have to have written a fair chunk of a chemical structure to name
algorithm to know that it was incorrect!
Daniel
"For reasons that I do not fully understand, OPSIN returns InChI - not standard
InChI. It does, however, return standard InChIKey! In general, I’d be very nervous about
comparing InChI strings created by different resources."
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data
untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
Jmol-users mailing list
Jmol-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jmol-users
--
Daniel Lowe
Senior Software Engineer
NextMove Software Limited
Registered in England No. 07588305
Registered Office: Innovation Centre (Unit 23), Cambridge Science Park,
Cambridge CB4 0EY