A related issue, just to make sure it is on someone's (Greg's) radar. For molecules that fail, the writer still writes a $$$$ delimiter (instead of writing nothing), so the output file ends up with extra delimiters:

<end of good molecule>
$$$$
$$$$
<beginning of another good molecule>

This makes some other programs further down the cheminformatics workflow choke. Fear not, it's the weekend and I can sed/grep my way to a pint down the road.

See attached files for reproduction (out.sdf is the output file).

-
Jean-Paul Ebejer
Early Stage Researcher

On 27 January 2012 10:11, Greg Landrum <[email protected]> wrote:
> Dear Jean-Paul,
>
> On Fri, Jan 27, 2012 at 4:57 AM, JP <[email protected]> wrote:
> >
> > I can read an SDF file (attached: test.sdf) using ForwardSDMolSupplier (with
> > sanitization explicitly turned on) - but I cannot write the molecule back to
> > a second SD file using SDWriter.
> > Interestingly, upon writing it fails with a sanitization error (ValueError:
> > "Sanitization error: Can't kekulize mol"). Note that the molecule is not
> > null and I am not doing anything with the molecule.
>
> That's a bug, thanks for pointing it out.
>
> > My question is: how did the molecule pass the initial sanitization test
> > (when it is read) but not the second (when it is written) ?
>
> The CTAB format requires that bonds be written out as either single or
> double,[1] so aromatic rings must be converted into the kekule form
> before generating the CTAB. The RDKit is incapable of generating a
> kekule form for this molecule because, I think, something is wrong
> with the way it has done chemistry perception around the boron. You
> can see this in the SMILES:
> In [2]: ms = [x for x in Chem.SDMolSupplier('test.sdf')]
>
> In [3]: Chem.MolToSmiles(ms[0])
> Out[3]: 'Cc1n[n+](C)[bH2-](O)c2sccc12'
>
> I believe that boron atom shouldn't have any explicit Hs.
>
> -greg
> [1] The actual wording of the specification says that aromatic bonds
> should only be used for substructure queries
>
#!/usr/bin/env python
from rdkit import Chem

# let us make sure sanitization is explicitly on
suppl = Chem.ForwardSDMolSupplier('test_mols.sdf', sanitize=True)
wr = Chem.SDWriter('out.sdf')
for m in suppl:
    # we have a mol instance
    print("Here is your read mol: %s" % m)
    # but we cannot write it
    wr.write(m)
wr.flush()
wr.close()
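
The sed/grep cleanup mentioned above could also be done in plain Python. What follows is only a sketch: the function name drop_empty_sd_records is made up for illustration, and it assumes that a failed molecule shows up in the output as a $$$$ line with nothing but blank lines before it, as in the out.sdf described here. It does not use or depend on the RDKit.

```python
def drop_empty_sd_records(text):
    """Return SD-file text without records that hold no molecule data.

    Hypothetical helper: a record counts as empty when only blank
    lines (or nothing at all) precede its $$$$ delimiter.
    """
    kept = []
    record = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            # Keep the record and its delimiter only if it has content.
            if any(l.strip() for l in record):
                kept.extend(record)
                kept.append("$$$$")
            record = []
        else:
            record.append(line)
    # Preserve trailing content that is not closed by a delimiter.
    if any(l.strip() for l in record):
        kept.extend(record)
    return "\n".join(kept) + "\n" if kept else ""
```

To use it, read out.sdf into a string, pass it through the function, and write the result to a new file. Other programs further down the workflow should then see exactly one $$$$ per molecule.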
Attachments:
test_mols.sdf (application/extension-sdf)
out.sdf (application/extension-sdf)
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

