Thanks for this writeup, Rocco. You're right that there isn't an
easy-to-find, easy-to-understand collection of this information. That's one
of those gaps in the documentation that I should eventually address. This is
already a pretty good start, though.

-greg


On Thu, Sep 8, 2016 at 9:37 PM, Rocco Moretti <rmoretti...@gmail.com> wrote:

> Greg can correct me if I'm wrong(1), but in RDKit there are actually three
> "levels" of hydrogens:
>
> * "Physical" hydrogens, which are represented as actual, independent atoms
> in the atom graph. ("Physical hydrogens" is what I'm calling them - I don't
> know if RDKit has an official term for them.)
>
> * "Explicit" hydrogens, which are represented as a numeric annotation on
> their attached heavy atom. (And *not* as a separate atom object.)
>
> * "Implicit" hydrogens, which aren't actually represented anywhere, but
> are calculated from the standard valence of the heavy atom and how many
> of its valences are occupied by actual atoms and explicit hydrogens.
>
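A minimal sketch of the three levels described above, assuming a standard
RDKit Python install. The SMILES strings and variable names are illustrative
choices, not something taken from the thread.

```python
from rdkit import Chem

# Implicit: nothing is stored; the H count is derived from carbon's valence.
implicit_mol = Chem.MolFromSmiles("C")
implicit_atom = implicit_mol.GetAtomWithIdx(0)

# Explicit: the bracket atom stores its H count as a number on the carbon.
explicit_mol = Chem.MolFromSmiles("[CH4]")
explicit_atom = explicit_mol.GetAtomWithIdx(0)

# "Physical": AddHs() returns a copy where the hydrogens are real graph atoms.
physical_mol = Chem.AddHs(implicit_mol)

# atoms in graph, implicit H count, explicit H count
print(implicit_mol.GetNumAtoms(),
      implicit_atom.GetNumImplicitHs(),
      implicit_atom.GetNumExplicitHs())   # 1 4 0
print(explicit_mol.GetNumAtoms(),
      explicit_atom.GetNumImplicitHs(),
      explicit_atom.GetNumExplicitHs())   # 1 0 4
print(physical_mol.GetNumAtoms())         # 5
```

In all three cases the carbon has four hydrogens; only the place where that
information lives differs.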
> Generally, except for some coordinate calculations, RDKit seems to be
> built around working with molecules with explicit or implicit hydrogens.
> This is why when you read in a molecule, RDKit normally removes any
> physical hydrogens. (Note that for most file reading code there's a
> removeHs parameter you can set to False to change this behavior, and read
> explicitly listed hydrogens as physical hydrogens.)
>
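As a rough illustration of the removeHs parameter on the readers, the sketch
below builds a MOL block that lists its hydrogens and reads it back both
ways. Generating the block from SMILES (methanol, no coordinates) is just a
convenience for the example; a real MOL file would behave the same way.

```python
from rdkit import Chem

# Build a MOL block for methanol that lists all four hydrogens as atoms.
methanol_with_hs = Chem.AddHs(Chem.MolFromSmiles("CO"))
mol_block = Chem.MolToMolBlock(methanol_with_hs)

# Default reader behavior: the listed hydrogens are stripped.
stripped = Chem.MolFromMolBlock(mol_block)

# removeHs=False keeps the listed hydrogens as atoms in the graph.
kept = Chem.MolFromMolBlock(mol_block, removeHs=False)

print(stripped.GetNumAtoms())  # 2 (C and O only)
print(kept.GetNumAtoms())      # 6 (C, O and four H atoms)
```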
> By default "removing hydrogens" means turning them into implicit
> hydrogens(2), but the RemoveHs() function has an "updateExplicitCount"
> parameter which will cause the removed hydrogens to be turned into explicit
> hydrogens instead. The standard MOL file loading code doesn't use this
> option, though, so the hydrogens in the molecule are usually converted into
> implicit when you read things in.
>
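A small sketch for comparing the two RemoveHs() modes described above. It
prints the per-atom counts rather than asserting specific explicit/implicit
values, since those may depend on the RDKit version; ethane is an arbitrary
example molecule.

```python
from rdkit import Chem

# Start from ethane with all hydrogens present as atoms.
ethane_with_hs = Chem.AddHs(Chem.MolFromSmiles("CC"))

# Default: removed hydrogens are folded back into the heavy atoms.
default_removed = Chem.RemoveHs(ethane_with_hs)

# updateExplicitCount=True: removed hydrogens are recorded in the
# explicit H count of the heavy atom they were attached to.
counted_removed = Chem.RemoveHs(ethane_with_hs, updateExplicitCount=True)

for label, mol in (("default", default_removed),
                   ("updateExplicitCount", counted_removed)):
    atom = mol.GetAtomWithIdx(0)
    print(label,
          "atoms:", mol.GetNumAtoms(),
          "explicit Hs:", atom.GetNumExplicitHs(),
          "implicit Hs:", atom.GetNumImplicitHs(),
          "total Hs:", atom.GetTotalNumHs())
```

Either way each carbon should still report three hydrogens in total; the
difference is only in which count they are stored under.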
> AddHs(), of course, turns explicit and implicit hydrogens into physical
> hydrogens. (Though the "explicitOnly" parameter can be used to control
> this.) It does annotate whether these physical hydrogens came from either
> the implicit or explicit pool, so you can round trip things through AddHs()
> and RemoveHs() appropriately. (There's also an "implicitOnly" parameter on
> RemoveHs() which will only remove those hydrogens.)
>
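The AddHs()/RemoveHs() options mentioned above can be tried out with
something like the following sketch. The molecule "[CH3]O" is chosen only
because it mixes explicit hydrogens (on the bracketed carbon) with an
implicit one (on the oxygen). The result of the implicitOnly call is printed
without an expected value, since I have not verified it across versions.

```python
from rdkit import Chem

# Carbon carries 3 explicit Hs (bracket atom); oxygen has 1 implicit H.
mol = Chem.MolFromSmiles("[CH3]O")

# Default: both explicit and implicit hydrogens become atoms.
all_hs = Chem.AddHs(mol)

# explicitOnly=True: only the explicit hydrogens become atoms.
explicit_hs = Chem.AddHs(mol, explicitOnly=True)

print(all_hs.GetNumAtoms())       # 6 (C, O, 3 H on carbon, 1 H on oxygen)
print(explicit_hs.GetNumAtoms())  # 5 (C, O, 3 H on carbon)

# Round trip: removing everything returns to the heavy-atom graph.
round_trip = Chem.RemoveHs(all_hs)
print(round_trip.GetNumAtoms())   # 2

# implicitOnly=True should remove only hydrogens that AddHs() created
# from the implicit pool, leaving the formerly explicit ones in place.
partially_removed = Chem.RemoveHs(all_hs, implicitOnly=True)
print(partially_removed.GetNumAtoms())
```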
> Regards,
> -Rocco
>
> (1) I don't think the RDKit hydrogen model has ever been formalized in one
> place for user-facing documentation, so this is the understanding I've
> gotten from banging my head against various hydrogen-related issues.
>
> (2) There are special complications here: certain structures, such as
> imidazole, need a physical or explicit hydrogen on one of the nitrogens in
> order to Kekulize properly. If you're implicit only, the RDKit sanitizer
> will choke. Thus, there's special casing in various Add/RemoveHs functions
> to avoid implicit-izing these critical hydrogens.
>
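The imidazole case in footnote (2) can be seen directly from SMILES. This is
a sketch assuming the usual aromatic SMILES for imidazole; the logger call
only silences the expected parse error.

```python
from rdkit import Chem, RDLogger

# Silence the kekulization error that the second parse is expected to log.
RDLogger.DisableLog("rdApp.error")

# With the hydrogen stated on one nitrogen, the ring can be kekulized.
with_h = Chem.MolFromSmiles("c1c[nH]cn1")

# Without it, sanitization fails and MolFromSmiles returns None.
without_h = Chem.MolFromSmiles("c1cncn1")

print(with_h is not None)   # True
print(without_h is None)    # True

# Round-tripping through AddHs()/RemoveHs() keeps the critical N-H.
round_trip = Chem.RemoveHs(Chem.AddHs(with_h))
n_h_total = sum(atom.GetTotalNumHs()
                for atom in round_trip.GetAtoms()
                if atom.GetSymbol() == "N")
print(n_h_total)                      # 1
print(Chem.MolToSmiles(round_trip))
```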
> On Thu, Sep 8, 2016 at 1:46 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu>
> wrote:
>
>> On 09/08/2016 10:25 AM, Greg Landrum wrote:
>> ...
>> > Why do you want 2D drawings that include H atoms?
>>
>> On the subject of H atoms: when I read in the MOL file that has them, I
>> need to explicitly call AddHs() in order to have them drawn.
>>
>> Question: do they actually get stripped off by the reader and re-added
>> by AddHs()? Or are they there "hidden" somehow and AddHs() just
>> "unhides" them?
>>
>> TIA
>> --
>> Dimitri Maziuk
>> Programmer/sysadmin
>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>
>>
>> ------------------------------------------------------------
>> ------------------
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>