Dear Andy,

Thank you for a quick and thorough email. I find it very instructional,
although I need to read it a couple times more to digest it.

Cheers,
Chenyang

On Wed, Nov 8, 2017 at 2:27 PM, Andrew Dalke <da...@dalkescientific.com>
wrote:

> On Nov 8, 2017, at 21:00, Chenyang Shi <cs3...@columbia.edu> wrote:
> > =C= : [CH0;A;X2;!R](=[$(*)])=[$(*)]
>
> The recursive SMARTS notation, which is the term inside of the [$(...)],
> finds a match for the entire pattern and returns the first atom in that
> pattern.
>
> > For example, if I search "C=C=O" using "[CH0;A;X2;!R](=[$(*)])=[$(*)]",
> > >>> from rdkit import Chem
> > >>> m = Chem.MolFromSmiles('C=C=O')
> > >>> m.GetSubstructMatches(Chem.MolFromSmarts("[CH0;A;X2;!R](=
> [$(*)])=[$(*)]"))
> > ((1, 0, 2),)
> >
> > it prints out atomic positions 1, 0, 2--three positions. But I would
> expect only one position for the Carbon in the middle.
>
> The $(*) finds the pattern, which is a "*" and in this case the terminal
> carbons, and returns it. The substructure search returns 3 positions
> because the first is [CH0;A;X2;!R], the second is the first atom of "*",
> and the third is the first atom of the other "*".
>
> If you only want the first atom the entire pattern, then put the entire
> pattern in a recursive SMARTS, as in:
>
>   [$([CH0;A;X2;!R](=*)=*)]
>
> >>> pat = Chem.MolFromSmarts("[$([CH0;A;X2;!R](=*)=*)]")
> >>> mol = Chem.MolFromSmiles('C=C=O')
> >>> mol.GetSubstructMatches(pat)
> ((1,),)
>
> > Similarly, if I search "C#C" using "[CH1;A;X2;!R]#[$(*)]",
> > >>> from rdkit import Chem
> > >>> m = Chem.MolFromSmiles('C#C')
> > >>> m.GetSubstructMatches(Chem.MolFromSmarts("[CH1;A;X2;!R]#[$(*)]"))
> > ((0, 1),)
> > I would expect two separate positions such as (0,), (1,), indicating
> there are two carbon triple bonds (with an hydrogen).
>
> Since you are only looking for a single atom, try putting the entire
> pattern in a recursive SMARTS, as in
>
>   [$([CH1;A;X2;!R]#*)]
>
> >>> mol = Chem.MolFromSmiles("C#C")
> >>> pat = Chem.MolFromSmarts("[$([CH1;A;X2;!R]#*)]")
> >>> mol.GetSubstructMatches(pat)
> ((0,), (1,))
>
>
> > Then if  if I search "CC#CC" using " [CH0;A;X2;!R]#[$(*)]",
>
> I believe you want "[$([CH0;A;X2;!R]#*)]"
>
> Thank you for your clear description of what you expected.
>
> Cheers,
>
>                                 Andrew
>                                 da...@dalkescientific.com
>
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to