Here’s the documentation that served as the reading for the fall 2015 OLCC
course on cheminformatics, from
http://olcc.ccce.divched.org/2015OLCCModule5P1#3.2
Because the layered structure of InChI allows one to represent a chemical
structure at a desired level of detail, InChI software may generate
different InChI strings for the same molecule. This flexibility may be
regarded as an obstacle to standardization and interoperability. In response
to this concern, the standard InChI was introduced. It is generated with the
standard option settings in the InChI software, so it always contains the same
level of structural detail and follows the same conventions for drawing
perception. Standard InChI strings begin with “InChI=1S/”, while non-standard
InChI strings begin with “InChI=1/”. The digit “1” following “InChI=” is the
current InChI version number.
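As a minimal sketch of the prefix convention described above, the following
JavaScript classifies a string by how it begins. The function name
classifyInChI is hypothetical, introduced only for illustration. The ethanol
string is a commonly cited standard InChI; the second string is the same text
with the "S" removed purely to exercise the prefix check, and is not claimed
to be what the InChI software would emit with non-standard options.

```javascript
// Hypothetical helper: classify an InChI string by its prefix only.
// It does not validate the layers that follow the prefix.
function classifyInChI(inchi) {
  if (inchi.startsWith("InChI=1S/")) return "standard";
  if (inchi.startsWith("InChI=1/")) return "non-standard";
  return "not a version 1 InChI";
}

const ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3";
// Same text without the "S", used only to show the other branch.
const prefixOnlyExample = "InChI=1/C2H6O/c1-2-3/h3H,2H2,1H3";

console.log(classifyInChI(ethanol));
console.log(classifyInChI(prefixOnlyExample));
console.log(classifyInChI("CCO"));
```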
InChIKey
The length of an InChI string increases with the size of the corresponding
chemical structure, and molecules with more than 100 atoms very commonly
produce InChI strings that are too long to use in internet search engines
(such as Google, Yahoo, and Bing). In addition, these search engines do not
respect letter case or the special characters used in InChI. To address these
issues, the InChIKey was introduced for Internet and database searching and
indexing. It is a 27-character string derived from InChI using a hashing
algorithm. Hashing is a one-way mathematical transformation typically used to
calculate a compact, fixed-length digital representation of a much longer
string of arbitrary length.
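The practical consequence of the fixed length can be sketched in JavaScript.
The check below assumes the usual InChIKey layout of 14, 10, and 1 uppercase
letters separated by hyphens (27 characters in total), a detail that is not
spelled out in the passage above. The function name looksLikeInChIKey is
hypothetical, and the check tests only the shape of the string, not whether
it was really derived from an InChI.

```javascript
// Assumed layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter = 27 characters.
const INCHIKEY_SHAPE = /^[A-Z]{14}-[A-Z]{10}-[A-Z]$/;

// Hypothetical helper: true when the text has the shape of an InChIKey.
function looksLikeInChIKey(text) {
  return INCHIKEY_SHAPE.test(text);
}

const ethanolKey = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"; // InChIKey for ethanol
const ethanolInChI = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3";

console.log(ethanolKey.length, looksLikeInChIKey(ethanolKey));
console.log(ethanolInChI.length, looksLikeInChIKey(ethanolInChI));
```

Because a key of this shape contains only uppercase letters and hyphens, it
is not affected by a search engine that ignores case and special characters.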
I guess I paid more attention to the phrase about “obstacle to standardization
and interoperability” than I did to the part saying that it is “molecules with
more than 100 atoms” that have very long InChI strings.
Jennifer
Jennifer Muzyka
H.W. Stodghill Jr. and Adele H. Stodghill Professor of Chemistry
Centre College
600 West Walnut Street
Danville, KY 40422
jennifer.muz...@centre.edu
http://web.centre.edu/muzyka
http://organicers.org
859-238-5413
fax 859-236-7925
On Jul 22, 2016, at 10:10 AM, Robert Hanson <hans...@stolaf.edu> wrote:
On Fri, Jul 22, 2016 at 8:42 AM, Jennifer L. Muzyka
<jennifer.muz...@centre.edu> wrote:
InChI is very messy because there’s more than one version of the program that
generates it. So depending on what version you use, you get a different InChI.
That information about which version of the InChI rules you are using is an
early part of the string. The other problem with InChI is that the strings can
be REALLY LONG, as in so long that it’s not possible to use them when you
search Google. That was another take-away from the course.
You do have to define the rules so that you get the sections right. Beyond
that, I suppose it may be true, but not for simple molecules. Any changes by
version would be for esoteric species (I am guessing). What you are saying
violates the whole premise of InChI. Do you have examples, Jennifer?
2. Quick Facts
2.1. What is an InChI?
InChI is an acronym for IUPAC International Chemical Identifier. It is a string
of characters capable of uniquely representing a chemical substance and serving
as its unique digital ‘signature’. It is derived solely from a structural
representation of that substance in a way designed to be independent of the way
that the structure was drawn. A single compound will always produce the same
identifier.
[http://www.inchi-trust.org/technical-faq/]
You are using AJAX every time you use JSmol. All files are transmitted using
AJAX, and within JSmol you can do AJAX as simply as
x = load("http://......")
I don't see why you would need any server-side piece these days. It is
certainly not "elegant" in my opinion. Elegant is
if ("my SMILES string".find("your SMILES string", "SMILES")) prompt "You're good to go!"
without any concern for where the strings come from. That's what JSmol does. No
server required. Just make sure you are using the right options in JSME. See
http://chemapps.stolaf.edu/jmol/jsmol/jsmetest2.htm
var JMEInfo = {
use: "HTML5"
,visible: true
,divId: "jmediv"
,options : "autoez;nocanonize"
}
Here's a note I have on that from an earlier jmol-users post:
JSME and 2D/3D - It turns out that JSME has two modes of delivery of SMILES --
"canonize" and "nocanonize"...The problem is that "canonize" delivers aromatic
symbols for rings that are not Hückel-aromatic -- all six carbons, for
example, in benzoquinone. Jmol does this, too, but Jmol adds double bond
indications as well, so ...c=cc=c.... The difference is significant -- Jmol's
SMILES representations are interpretable by the NCI Resolver; JSME's "canonize"
versions are not. So I have to have JSME in nocanonize mode in order to
convert 2D to 3D using the NCI Resolver.
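For the 2D-to-3D step, a request URL for the NCI Resolver can be sketched as
follows. This assumes the resolver's public URL pattern at cactus.nci.nih.gov
with the "file?format=sdf&get3d=true" representation, which is not stated in
the note above. The function name nciResolver3dUrl is hypothetical, and the
sketch only builds the URL rather than fetching it.

```javascript
// Hypothetical helper: build an NCI Resolver URL asking for a 3D SDF file.
// Assumes the URL pattern <base>/<identifier>/file?format=sdf&get3d=true.
function nciResolver3dUrl(smiles) {
  const base = "https://cactus.nci.nih.gov/chemical/structure/";
  // SMILES can contain characters such as "=", "#", and "/" that must be escaped.
  return base + encodeURIComponent(smiles) + "/file?format=sdf&get3d=true";
}

// Benzoquinone written with explicit double bonds (no aromatic symbols).
console.log(nciResolver3dUrl("O=C1C=CC(=O)C=C1"));
```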
Bob
Jmol-users mailing list
Jmol-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jmol-users