On 24/07/2014 17:06, Wallace Chan wrote:
> Tim,
>
> Thanks for your reply. Yes, we have the canonical SMILES strings stored
> as properties in our glass.sdf file. I tried to generate canonical SMILES
> as the result, and they are different than ours. Thus, ours were
> probably acquired using a different canonicalization.

It is possible to recover your canonical SMILES from glass.sdf and add 
it to the title of the results file:

obabel glass.sdf -ifs -O results.smi -sc1ccccc1 --append SMILESSTRING

where SMILESSTRING is the name of the sdf property. You could also 
construct a canonical SMILES file:

obabel glass.sdf -ifs -O results.smi -otxt -sc1ccccc1 --title "" --append "SMILESSTRING"

For each matching molecule, the output format txt gives just the title, 
which --title "" removes; the SMILES is then added. Other properties or 
descriptors could be added, e.g. --append "SMILESSTRING inchi"
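
If you want to check which property name holds your stored SMILES before
running the search, the SD data fields can be read directly. The following
is only a minimal sketch in plain Python, not part of Open Babel: the
function name iter_sdf_property is made up for illustration, SMILESSTRING
is the same placeholder as above, and the molfile block in the demo record
is a stub. It keeps only the first line of each property value.

```python
import io


def iter_sdf_property(handle, prop_name):
    """Yield (title, value) for each record in an SDF stream.

    value is None when a record does not carry the property.
    Hypothetical helper for illustration; only the first line
    of a multi-line property value is kept.
    """
    title = None
    value = None
    at_record_start = True   # the first line of a record is its title
    want_value = False       # True right after a matching "> <NAME>" header
    for raw in handle:
        line = raw.rstrip("\r\n")
        if at_record_start:
            title = line
            at_record_start = False
            continue
        if line.startswith("$$$$"):
            # "$$$$" terminates a record
            yield title, value
            title, value = None, None
            at_record_start = True
            want_value = False
        elif want_value:
            value = line
            want_value = False
        elif line.startswith(">") and "<%s>" % prop_name in line:
            want_value = True


# Stub record: title line, two blank header lines, an empty counts line,
# then one data field named SMILESSTRING (assumed property name).
demo = (
    "benzene\n"
    "\n"
    "\n"
    "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n"
    "> <SMILESSTRING>\n"
    "c1ccccc1\n"
    "\n"
    "$$$$\n"
)

records = list(iter_sdf_property(io.StringIO(demo), "SMILESSTRING"))
for title, smiles in records:
    print("%s\t%s" % (smiles, title))
```

On a real file you would pass open("glass.sdf") instead of the StringIO
object; a record that yields None for the value lacks the property.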

> This then leads to
> another question that has come to me. Does the input for substructure or
> similarity searching have to be in SDF format or can it be another
> format, such as a list of InChI ID's? In other words, does the fast
> search index have to come from an SDF file? Many thanks.

The datafile (and the output query results) can be in any format, 
including inchi.

Chris
>
> On Tue, Jul 22, 2014 at 7:44 PM, Tim Vandermeersch
> <tim.vandermeer...@gmail.com <mailto:tim.vandermeer...@gmail.com>> wrote:
>
>     Hi,
>
>     I assume you have canonical SMILES strings in glass.sdf stored as
>     titles or properties. Correct me if this is incorrect. If so, it
>     depends on what program was used to create these canonical SMILES
>     strings. If you used openbabel for this, you can convert the
>     molecules in result.smi to openbabel canonical SMILES (or write
>     canonical SMILES directly using the .can extension).
>
>     In the case where another program was used to generate the canonical
>     SMILES, it would not be possible to use openbabel to generate the
>     same canonical SMILES starting from result.smi. If you have access
>     to the other program you could use this to convert result.smi to
>     these canonical SMILES and use these to search glass.sdf.
>
>     The reason for this is that there is no universal SMILES
>     canonicalization algorithm. Different toolkits will result in
>     different canonical SMILES (which are canonical only when using the
>     same toolkit). InChI, on the other hand, has a single reference implementation.
>
>     Tim
>
>
>     On Wed, Jul 23, 2014 at 12:03 AM, Wallace Chan <walla...@umich.edu
>     <mailto:walla...@umich.edu>> wrote:
>
>         Dr. Hutchison,
>
>         Yes, this helps. I do have another question about substructure
>         searching. We are building a database with roughly 270,000
>         molecules and want users to be able to do a substructure and
>         similarity search. I've read the following documentation,
>         http://openbabel.org/docs/dev/Fingerprints/fingerprints.html,
>         and it helps in understanding how this process works. However, I
>         want to ask whether or not the output file from the query can
>         contain the exact same SMILES strings that were generated from
>         the fast search index. Currently, the SMILES strings generated
>         from the query in the result.smi file are not the canonical
>         SMILES that I used to create the fast search index. For example,
>         if I were to look for a benzene substructure with the following
>         command,
>
>         babel glass.fs -ifs -sc1ccccc1 result.smi
>
>         would I be able to retrieve the SMILES string from glass.sdf,
>         which was used to create glass.fs? Many thanks for your patience.
>

_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
