I thought that the wildcard * would match any atom except hydrogen, but
that is only true as long as hydrogens are not explicit in the molecule.
I have some patterns written as SMILES with wildcards and implicit
hydrogens. For example, C* means "terminal carbons" only
(here * stands for any atom except hydrogen).
I want to transform this SMILES into SMARTS. If I just write:
from rdkit import Chem as rdkit  # assumption: "rdkit" is an alias for rdkit.Chem
smarts = rdkit.MolFromSmarts('*C')
the SMARTS I get matches any C with AT LEAST one bond to a non-hydrogen
atom (not EXACTLY one).
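
To make the "at least one" behaviour concrete, here is a minimal sketch on propane with implicit hydrogens. The variable names are only illustrative; the calls are the standard rdkit.Chem substructure API.

```python
from rdkit import Chem

# Query built from SMARTS: any atom bonded to an aliphatic carbon.
query = Chem.MolFromSmarts('*C')

# Propane with implicit hydrogens; atoms 0, 1, 2 run along the chain.
propane = Chem.MolFromSmiles('CCC')

# uniquify=False keeps every mapping, so each carbon that can play the
# role of the "C" in the pattern (query atom index 1) is reported.
matches = propane.GetSubstructMatches(query, uniquify=False)
carbons_matched = sorted({match[1] for match in matches})
print(carbons_matched)
```

The middle carbon (index 1) is reported together with the two terminal ones, which is the behaviour I would like to avoid.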
If I add explicit hydrogens to the SMARTS (and to the molecules to be
tested):
smartsH = rdkit.AddHs(smarts)
rdkit.MolToSmiles(smartsH)
'*C([H])([H])[H]'
I get this pattern, where the wildcard matches ANY atom, including hydrogen
(so it even matches a molecule with a single carbon atom, such as methane).
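
The wildcard behaviour with explicit hydrogens can be reproduced with a small sketch on methane. For comparison it also uses [!#1], the SMARTS primitive for "atomic number is not 1". That primitive is not something I had tried above, so treat it as an assumption about what might help rather than a confirmed fix.

```python
from rdkit import Chem

# Methane with its four hydrogens turned into explicit graph atoms.
methane_h = Chem.AddHs(Chem.MolFromSmiles('C'))

# '*' accepts any atom present in the graph, explicit hydrogens included.
wildcard_query = Chem.MolFromSmarts('*C')
# '[!#1]' accepts any atom whose atomic number is not 1.
heavy_query = Chem.MolFromSmarts('[!#1]C')

wildcard_hits = methane_h.GetSubstructMatches(wildcard_query)
heavy_hits = methane_h.GetSubstructMatches(heavy_query)
print(len(wildcard_hits), len(heavy_hits))
```

With explicit hydrogens the plain wildcard pairs each H with the carbon, while the [!#1] version finds nothing in methane.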
Basically, I am trying to get the SMARTS *[CH3] starting from the
corresponding SMILES *C. Is there a way?
I've already tried replacing the * with [!H] (NOT hydrogen), with no luck.
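
One possible route, sketched here under assumptions rather than as a confirmed answer, is to parse the pattern as SMILES and let Chem.AdjustQueryProperties add degree constraints to the non-wildcard atoms. The helper names smiles_pattern_to_query and matched_carbons are hypothetical. The sketch also assumes the molecules being searched keep their hydrogens implicit, because the degree query counts explicit neighbours only.

```python
from rdkit import Chem


def smiles_pattern_to_query(pattern_smiles):
    """Hypothetical helper: turn a SMILES pattern containing '*' into a
    query whose non-wildcard atoms must have exactly the drawn number of
    explicit neighbours."""
    pattern = Chem.MolFromSmiles(pattern_smiles)
    params = Chem.AdjustQueryParameters()
    params.makeDummiesQueries = True   # '*' becomes an "any atom" query
    params.adjustDegree = True         # add a degree query to atoms...
    params.adjustDegreeFlags = Chem.ADJUST_IGNOREDUMMIES  # ...except '*'
    return Chem.AdjustQueryProperties(pattern, params)


query = smiles_pattern_to_query('*C')


def matched_carbons(smiles):
    """Indices of atoms that can play the role of the 'C' in '*C'."""
    mol = Chem.MolFromSmiles(smiles)   # assumption: hydrogens stay implicit
    hits = mol.GetSubstructMatches(query, uniquify=False)
    return sorted({hit[1] for hit in hits})  # query atom 1 is the carbon


print(matched_carbons('CCC'))     # propane
print(matched_carbons('CC(C)C'))  # isobutane
```

Note that this constrains the number of neighbours, not the hydrogen count, so it is not literally *[CH3]; for neutral carbons drawn without explicit hydrogens the two should select the same atoms.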
Thanks to anyone :)
Thomas
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss