Hi Thomas,

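* in SMARTS just means "any atom", and that includes hydrogen atoms when
they are explicit in the molecule.
[!H], for historical reasons, means "an atom that does not have exactly one
hydrogen attached" (i.e. it matches CH2 and CH3, but not CH).
You want [!#1], that is, "not hydrogen" (hydrogen has atomic number 1).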
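
Here is a minimal sketch of the difference, assuming a standard RDKit install.
The helper name has_match is just something made up for this illustration, not
part of RDKit; it optionally adds explicit hydrogens before matching.

```python
from rdkit import Chem


def has_match(smiles: str, smarts: str, explicit_hs: bool) -> bool:
    """Return True if the SMARTS pattern matches the molecule.

    has_match is a hypothetical helper written for this example only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if explicit_hs:
        # Turn implicit hydrogens into real atoms in the molecular graph.
        mol = Chem.AddHs(mol)
    pattern = Chem.MolFromSmarts(smarts)
    return mol.HasSubstructMatch(pattern)


# Methane with implicit Hs: the carbon has no neighbour atom for * to match.
print(has_match("C", "*C", explicit_hs=False))

# Methane with explicit Hs: * is free to match one of the hydrogen atoms.
print(has_match("C", "*C", explicit_hs=True))

# [!#1] excludes hydrogen atoms, so methane no longer matches ...
print(has_match("C", "[!#1]C", explicit_hs=True))

# ... while ethane, which has a real heavy-atom neighbour, still does.
print(has_match("CC", "[!#1]C", explicit_hs=True))
```

The only thing that changes between the second and third calls is the
wildcard, so any difference in the result comes from [!#1] refusing to match
an explicit hydrogen.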

-greg


On Mon, Jan 30, 2023 at 5:40 PM Thomas <odioidenti...@gmail.com> wrote:

> I thought that the wildcard * would match any atom except hydrogen, but
> that's only true when hydrogens are not explicit in the molecule....
>
> I have some patterns in the form of SMILES with wildcards and implicit
> hydrogens. For example C* means "terminal carbons" only.
> (" * "  stands for any atom except hydrogen)
>
> I want to transform this SMILES into SMARTS. If I just write:
>
> smarts = rdkit.MolFromSmarts('*C')
>
> the smarts I get matches any C with AT LEAST one bond to a non-hydrogen
> atom (not EXACTLY one).
>
> If I add explicit hydrogens to the smarts (and to the molecules to be
> tested)
>
> smartsH = rdkit.AddHs(smarts)
> rdkit.MolToSmiles(smartsH)
> '*C([H])([H])[H]'
>
> I get this pattern where the wildcard matches ANY atom including hydrogen
> (it matches with the single carbon atom).
>
> Basically I am trying to get the SMARTS *C[H3] starting from the
> respective SMILES *C. Is there a way?
>
> I've already tried to replace the * with a [!H] (NOT hydrogen) with no
> luck.
> Thanks to anyone :)
> Thomas
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>