Hi Chris,

There's an additional step performed during sanitization that recognizes
that the implicit H needs to be on the N. The steps of a normal full
molecular sanitization operation are documented here:
http://www.rdkit.org/docs/RDKit_Book.html#molecular-sanitization

The adjustHs() function is not exposed directly to Python, but you can take
care of aromaticity assignment and adjustHs in a single call with the
SanitizeMol() function:

In [21]: m = Chem.MolFromMolBlock(mb,sanitize=False)

In [22]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_
SETAROMATICITY|Chem.SANITIZE_ADJUSTHS)
Out[22]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [23]: Chem.MolToSmiles(m)
Out[23]: 'CC(C)c1cccc2c(Cl)cc(-c3ccc(-c4ccc(C(=O)O)cc4)[nH]3)nc12'

I highlighted the N in the heterocyle with the implicit H.

Hopefully this helps.

Best,
-greg
p.s. Note: while testing parts of this answer I uncovered a bug in
AdjustHs() that causes it to fail for molecules that include atoms with
"bad valences": https://github.com/rdkit/rdkit/issues/1605 This looks like
it should be easy to fix for the upcoming release.


On Thu, Oct 5, 2017 at 11:06 PM, Chris Murphy <chris.mur...@schrodinger.com>
wrote:

> Hi!
>
> I'm running at an issue with implicit hydrogens on aromatic heteroatoms. I
> am feeding the following sdf into a mol object:
>
>
>   Mrv16c5 10021719092D
>
>  28 31  0  0  0  0            999 V2000
>    -2.9000    0.2076    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.1871    0.6228    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
>     0.0063    0.2957    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.7486    0.6354    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.4657    0.2118    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.5599    0.9164    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.9000   -0.6228    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.1829   -1.0380    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.4657   -0.6228    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -3.6214    0.6228    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.6731    1.4574    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.1405    1.6335    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     3.8646    0.5976    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     1.3860    0.8367    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     3.0384    0.6731    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     4.3469    1.2687    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
>     1.7257    0.0818    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     1.8641    1.5161    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     2.6861    1.4322    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     2.5477   -0.0021    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.1829   -1.8683    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
>    -3.6297    1.4532    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     4.2085   -0.1656    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
>    -3.6214   -1.0463    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -4.3385    0.1992    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -4.3385   -0.6311    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -4.3469    1.8600    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.9042    1.8683    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>   2  1  1  0  0  0  0
>   3  4  1  0  0  0  0
>   4  5  1  0  0  0  0
>   5  2  2  0  0  0  0
>   6  3  1  0  0  0  0
>   7  1  1  0  0  0  0
>   8  7  1  0  0  0  0
>   9  8  2  0  0  0  0
>  10  1  2  0  0  0  0
>  11  4  2  0  0  0  0
>  12 11  1  0  0  0  0
>  13 15  1  0  0  0  0
>  14  6  1  0  0  0  0
>  15 19  1  0  0  0  0
>  16 13  2  0  0  0  0
>  17 14  2  0  0  0  0
>  18 14  1  0  0  0  0
>  19 18  2  0  0  0  0
>  20 17  1  0  0  0  0
>  21  8  1  0  0  0  0
>  22 10  1  0  0  0  0
>  23 13  1  0  0  0  0
>  24  7  2  0  0  0  0
>  25 10  1  0  0  0  0
>  26 25  2  0  0  0  0
>  27 22  1  0  0  0  0
>  28 22  1  0  0  0  0
>  24 26  1  0  0  0  0
>   9  5  1  0  0  0  0
>   6 12  2  0  0  0  0
>  20 15  2  0  0  0  0
> M  END
> $$$$
>
> The nitrogen in the heterocycle should have 1 implicit hydrogen on it, and
> when I look at it after initially creating the mol object, it does. I want
> to convert it to aromatic form, so I am calling rdmolops.SetAromatize(mol).
> Once I do this however, it seems that the implicit hydrogen on the nitrogen
> is removed, which then causes an error to be thrown if I ever try to
> convert it back to kekule form or do any kind of sanitization. Maybe my
> understanding of aromaticity is wrong, but shouldn't the hydrogen be on the
> nitrogen regardless of whether or not it is considered to be in an aromatic
> form?
>
> I could be misunderstanding rdkit's aromaticity models, but for my
> purposes, I want to be able to convert the mol to either kekule or aromatic
> form depending on some configuration settings. Is there a way to manipulate
> the hydogrens on an atom in this case?
>
> Thanks!
> Chris
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to