I thought that the wildcard * would match any atom except hydrogen, but
that's only true when hydrogens are not explicit in the molecule.

I have some patterns in the form of SMILES with wildcards and implicit
hydrogens. For example, C* means "terminal carbons" only
("*" stands for any atom except hydrogen).

I want to transform this SMILES into SMARTS. If I just write:

from rdkit import Chem as rdkit  # alias assumed; these functions live in rdkit.Chem
smarts = rdkit.MolFromSmarts('*C')

the SMARTS I get matches any C with AT LEAST one non-hydrogen neighbor (not
EXACTLY one).
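
A minimal sketch that reproduces this behaviour, assuming `Chem` is `rdkit.Chem` and using propane as an arbitrary test molecule (the variable names are only illustrative):

```python
from rdkit import Chem

pattern = Chem.MolFromSmarts('*C')
propane = Chem.MolFromSmiles('CCC')

# uniquify=False keeps both orientations of every match, so each carbon
# that can play the role of the 'C' query atom (query index 1) shows up.
matches = propane.GetSubstructMatches(pattern, uniquify=False)
carbons_matched = sorted({m[1] for m in matches})
print(carbons_matched)  # the central carbon (index 1) is included
```

The central carbon is matched as well as the two terminal ones, since nothing in '*C' limits the number of heavy neighbours.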

If I add explicit hydrogens to the SMARTS (and to the molecules to be
tested):

smartsH = rdkit.AddHs(smarts)
rdkit.MolToSmiles(smartsH)
'*C([H])([H])[H]'

I get this pattern, where the wildcard matches ANY atom, including hydrogen
(for instance, it matches a molecule with a single carbon atom).
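
The effect of explicit hydrogens on the wildcard can be checked with a short sketch like the following, assuming `Chem` is `rdkit.Chem` and taking methane as the single-carbon test case:

```python
from rdkit import Chem

pattern = Chem.MolFromSmarts('*C')
methane = Chem.MolFromSmiles('C')   # hydrogens implicit
methane_h = Chem.AddHs(methane)     # hydrogens as real atoms in the graph

# With implicit hydrogens the carbon has no neighbour atoms at all,
# so '*' has nothing to match.
match_implicit = methane.HasSubstructMatch(pattern)

# With explicit hydrogens '*' can land on any of the four H atoms.
match_explicit = methane_h.HasSubstructMatch(pattern)

print(match_implicit, match_explicit)
```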

Basically, I am trying to get the SMARTS *[CH3] starting from the
corresponding SMILES *C. Is there a way?

I've already tried replacing the * with [!H] (NOT hydrogen), with no luck.
Thanks to anyone :)
Thomas
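
One possible way to express the intended pattern is sketched below. It assumes the SMARTS is written by hand rather than derived automatically from the SMILES: `#1` is the atomic-number primitive for hydrogen, so `[!#1]` reads "any atom that is not hydrogen", and `[CH3]` reads "aliphatic carbon with exactly three hydrogens". The helper name `terminal_carbons` and the test molecules are hypothetical, chosen only for illustration.

```python
from rdkit import Chem

# Hand-written query: a non-hydrogen atom bonded to a CH3 carbon.
terminal_c = Chem.MolFromSmarts('[!#1][CH3]')

def terminal_carbons(smiles, add_hs=False):
    """Return indices of carbons matched by the [CH3] part of the query."""
    mol = Chem.MolFromSmiles(smiles)
    if add_hs:
        mol = Chem.AddHs(mol)  # hydrogens become explicit atoms
    matches = mol.GetSubstructMatches(terminal_c, uniquify=False)
    # Query atom 1 is the [CH3] carbon.
    return sorted({m[1] for m in matches})

print(terminal_carbons('CCC'))               # propane, implicit H
print(terminal_carbons('CCC', add_hs=True))  # propane, explicit H
print(terminal_carbons('C', add_hs=True))    # methane, explicit H
```

In this sketch the result should be the same with or without explicit hydrogens, because the hydrogen count in `[CH3]` covers both implicit and explicit hydrogens and `[!#1]` never lands on an H atom. Note that `[CH3]` only covers the case written as C* in SMILES; terminal carbons with fewer hydrogens (for example a =CH2 end) would need their own pattern.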
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss