Hi Thomas,

* in SMARTS just means "any atom". [!H], for historical reasons, means "an atom without a single hydrogen" (i.e. it matches CH2 and CH3, but not CH). You want [!#1], that is, "not hydrogen" (atomic number 1; #0 would be a dummy atom).
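
To make the difference concrete, here is a minimal sketch, assuming a standard RDKit install. The helper name count_matches and the test molecules are made up for illustration and are not from the original question. The pattern [!#1][CH3] is one possible way to write "a methyl carbon with a non-hydrogen neighbor"; it is not necessarily the only or best translation of the *C SMILES.

```python
from rdkit import Chem


def count_matches(smiles, smarts, explicit_hs=False):
    """Count unique substructure matches of a SMARTS pattern in a molecule.

    Hypothetical helper for this example only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if explicit_hs:
        # Turn implicit hydrogens into real atoms in the molecular graph.
        mol = Chem.AddHs(mol)
    patt = Chem.MolFromSmarts(smarts)
    return len(mol.GetSubstructMatches(patt))


# Methane with implicit Hs: there is no second atom for * to match.
print(count_matches("C", "*C"))

# Methane with explicit Hs: * happily matches each hydrogen atom.
print(count_matches("C", "*C", explicit_hs=True))

# [!#1] excludes atoms with atomic number 1, so the explicit Hs no longer match.
print(count_matches("C", "[!#1]C", explicit_hs=True))

# Propane: the two methyl carbons, each attached to a non-hydrogen atom.
print(count_matches("CCC", "[!#1][CH3]"))
```

Writing the hydrogen count into the query ([CH3]) avoids having to add explicit hydrogens to the pattern at all.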
-greg

On Mon, Jan 30, 2023 at 5:40 PM Thomas <odioidenti...@gmail.com> wrote:

> I thought that the wildcard * would match any atom except hydrogen, but
> that's true unless hydrogens are explicit in the molecule....
>
> I have some patterns in the form of SMILES with wildcards and implicit
> hydrogens. For example C* means "terminal carbons" only.
> (" * " stands for any atom except hydrogen)
>
> I want to transform this SMILES in SMARTS, if I just write:
>
> smarts = rdkit.MolFromSmarts('*C')
>
> the smarts I get matches any C with AT LEAST one non-hydrogen bond (not
> EXACTLY one).
>
> If I add explicit hydrogens to the smarts (and to the molecules to be
> tested)
>
> smartsH = rdkit.AddHs(smarts)
> rdkit.MolToSmiles(smartsH)
> '*C([H])([H])[H]'
>
> I get this pattern where the wildcard matches ANY atom including hydrogen
> (it matches with the single carbon atom).
>
> Basically I am trying to get the SMARTS *C[H3] starting from the
> respective SMILES *C. Is there a way?
>
> I've already tried to replace the * with a [!H] (NOT hydrogen) with no
> luck.
> Thanks to anyone :)
> Thomas
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>