Hi Chris,
There's an additional step performed during sanitization that recognizes
that the implicit H needs to be on the N. The steps of a normal full
molecular sanitization operation are documented here:
http://www.rdkit.org/docs/RDKit_Book.html#molecular-sanitization
The adjustHs() function is not exposed directly to Python, but you can take
care of aromaticity assignment and adjustHs in a single call with the
SanitizeMol() function:
In [21]: m = Chem.MolFromMolBlock(mb,sanitize=False)
In [22]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_
SETAROMATICITY|Chem.SANITIZE_ADJUSTHS)
Out[22]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
In [23]: Chem.MolToSmiles(m)
Out[23]: 'CC(C)c1cccc2c(Cl)cc(-c3ccc(-c4ccc(C(=O)O)cc4)[nH]3)nc12'
I highlighted the N in the heterocyle with the implicit H.
Hopefully this helps.
Best,
-greg
p.s. Note: while testing parts of this answer I uncovered a bug in
AdjustHs() that causes it to fail for molecules that include atoms with
"bad valences": https://github.com/rdkit/rdkit/issues/1605 This looks like
it should be easy to fix for the upcoming release.
On Thu, Oct 5, 2017 at 11:06 PM, Chris Murphy <chris.mur...@schrodinger.com>
wrote:
> Hi!
>
> I'm running at an issue with implicit hydrogens on aromatic heteroatoms. I
> am feeding the following sdf into a mol object:
>
>
> Mrv16c5 10021719092D
>
> 28 31 0 0 0 0 999 V2000
> -2.9000 0.2076 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -2.1871 0.6228 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
> 0.0063 0.2957 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
> -0.7486 0.6354 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -1.4657 0.2118 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 0.5599 0.9164 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -2.9000 -0.6228 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -2.1829 -1.0380 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -1.4657 -0.6228 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -3.6214 0.6228 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -0.6731 1.4574 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 0.1405 1.6335 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 3.8646 0.5976 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 1.3860 0.8367 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 3.0384 0.6731 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 4.3469 1.2687 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
> 1.7257 0.0818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 1.8641 1.5161 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 2.6861 1.4322 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 2.5477 -0.0021 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -2.1829 -1.8683 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
> -3.6297 1.4532 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 4.2085 -0.1656 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
> -3.6214 -1.0463 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -4.3385 0.1992 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -4.3385 -0.6311 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -4.3469 1.8600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> -2.9042 1.8683 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
> 2 1 1 0 0 0 0
> 3 4 1 0 0 0 0
> 4 5 1 0 0 0 0
> 5 2 2 0 0 0 0
> 6 3 1 0 0 0 0
> 7 1 1 0 0 0 0
> 8 7 1 0 0 0 0
> 9 8 2 0 0 0 0
> 10 1 2 0 0 0 0
> 11 4 2 0 0 0 0
> 12 11 1 0 0 0 0
> 13 15 1 0 0 0 0
> 14 6 1 0 0 0 0
> 15 19 1 0 0 0 0
> 16 13 2 0 0 0 0
> 17 14 2 0 0 0 0
> 18 14 1 0 0 0 0
> 19 18 2 0 0 0 0
> 20 17 1 0 0 0 0
> 21 8 1 0 0 0 0
> 22 10 1 0 0 0 0
> 23 13 1 0 0 0 0
> 24 7 2 0 0 0 0
> 25 10 1 0 0 0 0
> 26 25 2 0 0 0 0
> 27 22 1 0 0 0 0
> 28 22 1 0 0 0 0
> 24 26 1 0 0 0 0
> 9 5 1 0 0 0 0
> 6 12 2 0 0 0 0
> 20 15 2 0 0 0 0
> M END
> $$$$
>
> The nitrogen in the heterocycle should have 1 implicit hydrogen on it, and
> when I look at it after initially creating the mol object, it does. I want
> to convert it to aromatic form, so I am calling rdmolops.SetAromatize(mol).
> Once I do this however, it seems that the implicit hydrogen on the nitrogen
> is removed, which then causes an error to be thrown if I ever try to
> convert it back to kekule form or do any kind of sanitization. Maybe my
> understanding of aromaticity is wrong, but shouldn't the hydrogen be on the
> nitrogen regardless of whether or not it is considered to be in an aromatic
> form?
>
> I could be misunderstanding rdkit's aromaticity models, but for my
> purposes, I want to be able to convert the mol to either kekule or aromatic
> form depending on some configuration settings. Is there a way to manipulate
> the hydogrens on an atom in this case?
>
> Thanks!
> Chris
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss