A related issue, just to make sure it is on someone's (Greg's) radar.

The writer just writes a $$$$ delimiter for the failed molecules (instead
of writing nothing) -- so in the output file we get extra delimiters:

<end of good molecule>
$$$$
$$$$
<beginning of another good molecule>

This makes some other programs further down the cheminformatics workflow
choke. Fear not, it's the weekend and I can sed/grep my way to a pint down
the road.
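
For anyone else hitting this, here is a rough Python sketch of the same
cleanup. It is not part of RDKit; drop_empty_records is a name made up for
this example, and it assumes a failed molecule leaves nothing but a bare
$$$$ line behind, as in the output shown above.

```python
def drop_empty_records(lines):
    """Yield SD file lines, skipping $$$$ delimiters that close an empty record."""
    # Treat the start of the file as a record boundary, so that a $$$$
    # on the very first line (first molecule failed) is dropped as well.
    prev_was_delim = True
    for line in lines:
        is_delim = line.rstrip("\r\n") == "$$$$"
        if is_delim and prev_was_delim:
            # Nothing was written since the last boundary: empty record.
            continue
        prev_was_delim = is_delim
        yield line
```

With out.sdf open for reading as src and a new file open for writing as
dst, dst.writelines(drop_empty_records(src)) should give a file without the
doubled delimiters.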

See attached files for reproduction (out.sdf is the output file).

-
Jean-Paul Ebejer
Early Stage Researcher


On 27 January 2012 10:11, Greg Landrum <[email protected]> wrote:

> Dear Jean-Paul,
>
> On Fri, Jan 27, 2012 at 4:57 AM, JP <[email protected]> wrote:
> >
> > I can read an SDF file (attached: test.sdf) using ForwardSDMolSupplier
> > (with sanitization explicitly turned on) - but I cannot write the
> > molecule back to a second SD file using SDWriter.
> > Interestingly, upon writing it fails with a sanitization error
> > (ValueError: "Sanitization error: Can't kekulize mol").  Note that the
> > molecule is not null and I am not doing anything with the molecule.
>
> That's a bug, thanks for pointing it out.
>
> > My question is: how did the molecule pass the initial sanitization test
> > (when it is read) but not the second (when it is written) ?
>
> The CTAB format requires that bonds be written out as either single or
> double,[1] so aromatic rings must be converted into the kekule form
> before generating the CTAB. The RDKit is incapable of generating a
> kekule form for this molecule because, I think, something is wrong
> with the way it has done chemistry perception around the boron. You
> can see this in the SMILES:
> In [2]: ms = [x for x in Chem.SDMolSupplier('test.sdf')]
>
> In [3]: Chem.MolToSmiles(ms[0])
> Out[3]: 'Cc1n[n+](C)[bH2-](O)c2sccc12'
>
> I believe that boron atom shouldn't have any explicit Hs.
>
> -greg
> [1] The actual wording of the specification says that aromatic bonds
> should only be used for substructure queries
>
#!/usr/bin/env python

from rdkit import Chem

# make sure sanitization is explicitly on
suppl = Chem.ForwardSDMolSupplier('test_mols.sdf', sanitize=True)

wr = Chem.SDWriter('out.sdf')
for m in suppl:
    # we have a mol instance
    print("Here is your read mol:", m)
    # but we cannot write it
    wr.write(m)

wr.flush()
wr.close()
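
A possible stopgap until the writer is fixed is to build each record's text
first and only write it once that has succeeded, so a failed molecule leaves
no stray $$$$ behind. The sketch below assumes the failure surfaces as a
ValueError, as in the error quoted above. write_sd_records and its
to_molblock parameter are names made up for this example; in real use
to_molblock would presumably be Chem.MolToMolBlock, which, as far as I know,
does not carry over SD data fields, so this only suits files where those
can be dropped.

```python
def write_sd_records(mols, to_molblock, out):
    """Write one SD record per molecule; return (written, skipped) counts."""
    written = 0
    skipped = 0
    for mol in mols:
        if mol is None:
            # The supplier could not parse this entry at all.
            skipped += 1
            continue
        try:
            # Assumption: conversion problems (e.g. kekulization)
            # are raised as ValueError, as in the reported traceback.
            block = to_molblock(mol)
        except ValueError:
            skipped += 1
            continue
        out.write(block)
        if not block.endswith("\n"):
            out.write("\n")
        # Emit the delimiter only after the mol block was written.
        out.write("$$$$\n")
        written += 1
    return written, skipped
```

Called with the supplier from the script above, Chem.MolToMolBlock and an
open text file, it should skip the boron compound and report it in the
second count.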

Attachment: test_mols.sdf
Description: application/extension-sdf

Attachment: out.sdf
Description: application/extension-sdf

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
