Dear Tao-wei,

On Fri, May 22, 2009 at 2:59 AM, Tao-wei Huang <[email protected]> wrote:
> Dear All,
> I encountered problems while reading compounds containing Nitro group (-NO2)
> or building molecules containing smarts like [N+][O-]. It seems that rdkit
> only allows the maximum valence of N to be 3 in these processes. One example
> is as follows (the sdf file is attached):
>>>> spl=cm.SDMolSupplier('pyrazoloacridine.sdf')
>>>> print spl[0]
> None
>>>> mm=cm.MolFromMolFile('pyrazoloacridine.sdf')
> Traceback (most recent call last):
>   File "<pyshell#20>", line 1, in <module>
>     mm=cm.MolFromMolFile('pyrazoloacridine.sdf')
> ValueError: Sanitization error: Explicit valence for atom # 13 N greater
> than permitted
> Could anyone provide some solution for this problem? I didn't notice this
> problem when I started building model. But now I cannot run through the
> whole dataset probably because of this problem. Thanks in advance.

The RDKit is, by design, very picky about valence and only tries to
repair bad valences in very limited situations. Nitro groups are, in
fact, one case where the system will clean things up. For example,
both of these work:

[2] >>> m = Chem.MolFromSmiles('C[N+](=O)([O-])')
[3] >>> m = Chem.MolFromSmiles('CN(=O)(=O)')

But this one fails with a valence error:

[4] >>> m = Chem.MolFromSmiles('CN(=O)(O)')
[06:11:51] Explicit valence for atom # 1 N greater than permitted

Your input file contains an N like the one in [4] above: neutral, but
with an explicit valence of four (three neighbors, one of them double
bonded). The fix is to either change the single N-O bond (the bond
between atoms 14 and 19) to a double or to set the appropriate charges
on atoms 14 and 19 in your SD file (+1 on the N, atom 14, and -1 on
the O, atom 19). [Note: just in case it causes confusion, the RDKit
numbers the atoms starting at zero, but in the SD file they are
numbered starting at 1, so when the RDKit complains about atom # 13,
it's numbered 14 in the SD file.]
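
If there are many such files, the first fix (making the N-O bond a
double) could be scripted rather than done by hand. The following is
only a rough sketch, assuming a V2000 mol block with the usual
fixed-width columns; set_bond_order and rdkit_to_sd_index are
hypothetical helper names, not RDKit functions, and the four-atom
BAD_NITRO block is a made-up example, not the attached file.

```python
def rdkit_to_sd_index(rdkit_idx):
    """Convert a zero-based RDKit atom index to the one-based SD file number."""
    return rdkit_idx + 1


def set_bond_order(mol_block, atom_a, atom_b, order):
    """Return a copy of a V2000 mol block with the atom_a/atom_b bond set to `order`.

    atom_a and atom_b use SD file numbering (starting at 1).
    Assumes the standard layout: 3 header lines, then the counts line,
    then the atom block, then the bond block.
    """
    lines = mol_block.split("\n")
    counts = lines[3]
    n_atoms = int(counts[0:3])   # columns 1-3 of the counts line
    n_bonds = int(counts[3:6])   # columns 4-6 of the counts line
    first_bond = 4 + n_atoms
    for i in range(first_bond, first_bond + n_bonds):
        line = lines[i]
        begin, end = int(line[0:3]), int(line[3:6])
        if {begin, end} == {atom_a, atom_b}:
            # columns 7-9 hold the bond type (1 = single, 2 = double)
            lines[i] = line[:6] + "%3d" % order + line[9:]
            return "\n".join(lines)
    raise ValueError("no bond between atoms %d and %d" % (atom_a, atom_b))


# Hypothetical four-atom example with the same problem: a neutral N
# (atom 2) with one double and two single bonds.
BAD_NITRO = "\n".join([
    "bad nitro",
    "  hypothetical example",
    "",
    "  4  3  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.5000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.2500    1.3000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.2500   -1.3000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  2  3  2  0",
    "  2  4  1  0",
    "M  END",
])

# The N is atom index 1 in RDKit terms, i.e. atom 2 in the file.
fixed = set_bond_order(BAD_NITRO, rdkit_to_sd_index(1), 4, 2)
```

For the attached file the corresponding call would use atoms 14 and
19. The edited block gives the N two double bonds, which is the form
in [3] above.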

Best Regards,
-greg
