Hi,

FYI, Helium supports this by default, as the following Python
script illustrates:

from helium import *

SMILES = Smiles()

mol1 = Molecule()
if not SMILES.read('[NH3][CH2][C](=O)=O', mol1):
    print SMILES.error()
    exit()

mol2 = Molecule()
if not SMILES.read('[NH3][CH2]C(=O)O', mol2):
    print SMILES.error()
    exit()

# the flags specify what should be written (e.g. Order, Hydrogens, Charge, ...)
print SMILES.write(mol1, Smiles.Flags.None), '?=', \
      SMILES.write(mol2, Smiles.Flags.None)

The output is:

NCC(O)O ?= NCC(O)O
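
To make it clearer what that output contains, here is a rough text-level
sketch in plain Python that mimics the effect on these two strings. It is
not how Helium does it (Helium works on the parsed molecule), and the
function name strip_annotations is made up for this illustration. It only
drops bracket annotations and bond symbols; it does not canonicalize, so it
assumes both inputs list their atoms in the same order.

```python
import re

# A bracket atom such as [NH3], [C], [O-] or [13CH4]; group 1 is the element symbol.
BRACKET_ATOM = re.compile(r"\[\d*([A-Za-z][a-z]?)[^\]]*\]")
# Explicit bond symbols that can appear outside brackets.
BOND_SYMBOLS = re.compile(r"[-=#$:/\\]")


def strip_annotations(smiles):
    """Hypothetical helper: keep only element symbols, branches and ring digits."""
    # Replace each bracket atom by its bare element symbol first, so that
    # charges and atom maps inside brackets are not mistaken for bonds.
    bare_atoms = BRACKET_ATOM.sub(r"\1", smiles)
    # Then drop every explicit bond symbol.
    return BOND_SYMBOLS.sub("", bare_atoms)


a = strip_annotations('[NH3][CH2][C](=O)=O')
b = strip_annotations('[NH3][CH2]C(=O)O')
print(a, '?=', b)
```

Because this works on the text rather than on a molecular graph, two
SMILES strings for the same molecule written in a different atom order
would still come out different.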

However, it sounds like what you really need is a normalization procedure.
I have already implemented SMIRKS support for this, but haven't written
anything on top of it yet. For example, the script below could be used to
convert the carboxyl group:

from helium import *

SMILES = Smiles()

mol1 = Molecule()
if not SMILES.read('[NH3][CH2][C](=O)=O', mol1):
    print SMILES.error()
    exit()

mol2 = Molecule()
if not SMILES.read('[NH3][CH2]C(=O)O', mol2):
    print SMILES.error()
    exit()

SMIRKS = Smirks()

# normalize C(=O)=O
if not SMIRKS.init('[Cv5:1](=[O:2])=[O:3]', '[C:1](=[O:2])[O:3]'):
    print SMIRKS.error()
    exit()

if not SMIRKS.apply(mol1, RingSet(mol1)):
    print SMIRKS.error()
    exit()

print SMILES.write(mol1), '?=', SMILES.write(mol2)

Output:

[NH3]CC(=O)O ?= [NH3]CC(=O)O
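
If it helps to see what the SMIRKS rule above amounts to, here is a small
sketch in plain Python on a toy graph. This is not Helium code: the
element list, the bond dictionary and the function normalize_carboxyl are
all made up for this illustration, and valence is computed from explicit
bonds only (which is enough for the bare [C] atom in the example).

```python
def other_end(pair, idx):
    """Return the atom index at the other end of a bond."""
    return pair[0] if pair[1] == idx else pair[1]


def normalize_carboxyl(elements, bonds):
    """Hypothetical helper mimicking [Cv5:1](=[O:2])=[O:3] >> [C:1](=[O:2])[O:3].

    elements: list of element symbols, indexed by atom.
    bonds: dict mapping (i, j) index pairs to integer bond orders.
    Returns a new bond dict; the input is left unchanged.
    """
    fixed = dict(bonds)
    for idx, element in enumerate(elements):
        if element != "C":
            continue
        incident = [pair for pair in fixed if idx in pair]
        valence = sum(fixed[pair] for pair in incident)
        double_to_o = sorted(
            pair for pair in incident
            if fixed[pair] == 2 and elements[other_end(pair, idx)] == "O"
        )
        # A carbon with valence 5 and two C=O bonds: lower one of them to single.
        if valence == 5 and len(double_to_o) >= 2:
            fixed[double_to_o[-1]] = 1
    return fixed


# Heavy atoms of [NH3][CH2][C](=O)=O: N0 - C1 - C2 (=O3) =O4
elements = ["N", "C", "C", "O", "O"]
bonds = {(0, 1): 1, (1, 2): 1, (2, 3): 2, (2, 4): 2}
print(normalize_carboxyl(elements, bonds))
```

Only the second C=O bond is changed, so the result has the same bond
orders as the heavy-atom graph of [NH3][CH2]C(=O)O.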

Helium is still in development, but more info can be found here:
http://www.moldb.net/helium.html
The examples above use the development version from GitHub.

Tim


On Mon, Mar 31, 2014 at 5:29 PM, Craig James <cja...@emolecules.com> wrote:

>
>
>> It would be nice if I had the ability to write (canonical) SMILES strings
>> without any bond order information, as a "round trip" to another format
>> (say, .xyz with gen3D) may cause this information to be lost and I end up
>> with different strings for the same molecule.  For example,
>> [NH3][CH2][C](=O)=O is the same molecule as [NH3][CH2]C(=O)O .
>>
>
> I think what you're saying is, "I would LIKE these two molecules to be
> interpreted as the same molecule."  The rules of SMILES say that those are
> two different molecules.  SMILES is a formal language with specific rules
> that don't allow alternate interpretations.
>
>
>> If there was a way to delete bond order information in the OBMol object,
>> I'd be happy with that too, as it would allow me to write the string I
>> want.  Thanks!
>>
>
> That's not so easy. You'd probably need to write your own program using
> OpenBabel that erased certain types of information (bond order) and
> rationalized charge and H-count so that all atoms were neutral, then wrote
> out the SMILES.
>
> The problem is that the various parsers go to a lot of trouble to preserve
> the input molecules exactly as specified.
>
> Craig
>
>
> ------------------------------------------------------------------------------
>
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
>