Thanks Greg,

That clears up how to fix it. I'm still a bit confused, though: I expected 
that adding explicit hydrogens to my SMILES would give the same result, but a 
simple test case didn't work:

>>> p = Chem.MolFromSmarts('[#7]-[#1]')
>>> m = Chem.MolFromSmiles('c1cc[nH]c1')
>>> m.HasSubstructMatch(p)
False
>>> for atom in m.GetAtoms(): print(atom.GetNumExplicitHs())
...
0
0
0
1
0

Is the distinction here between a hydrogen that is only recorded as a count on 
its parent atom and a hydrogen that is an actual atom in the molecular graph?

Best,
Nick

Nicholas C. Firth | PhD Student | Cancer Therapeutics


On 1 May 2014, at 14:44, Greg Landrum <greg.land...@gmail.com> wrote:

Hi Nick,

On Thu, May 1, 2014 at 2:07 PM, Nicholas Firth <nicholas.fi...@icr.ac.uk> 
wrote:

I have (yet another) question about the handling of SMARTS. I have been using 
a set of SMARTS (http://www.macinchem.org/reviews/pains/painsFilter.php) to 
perform PAINS filtering, but I've just discovered some strange behaviour: I 
would expect a match in the example below.

>>> p = Chem.MolFromSmarts('[#6]-[#6](=[#16])-[#1]')
>>> m = Chem.MolFromSmiles('CC=S')
>>> m.HasSubstructMatch(p)
False

That SMARTS, as written and as parsed, looks for an H atom that is explicitly 
present in the molecule graph. Unless you're planning on running AddHs() on 
everything, it won't work.
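
A minimal sketch of the AddHs() route mentioned above, using the two molecules 
from this thread. The variable names are illustrative only. It assumes an 
RDKit installation and shows that the unmerged [#1] query only matches once 
the hydrogens are real atoms in the graph:

```python
from rdkit import Chem

# Query with an explicit hydrogen atom, exactly as in the thread.
pattern = Chem.MolFromSmarts('[#6]-[#6](=[#16])-[#1]')
mol = Chem.MolFromSmiles('CC=S')

# Hydrogens from SMILES are stored as counts on the heavy atoms,
# so there is no H atom in the graph for [#1] to match.
matches_implicit = mol.HasSubstructMatch(pattern)

# AddHs() turns those counts into real atoms in the graph.
mol_h = Chem.AddHs(mol)
matches_explicit = mol_h.HasSubstructMatch(pattern)

# Same effect for the pyrrole case: [nH] sets an H count on the
# nitrogen but does not add an atom to the graph.
nh_pattern = Chem.MolFromSmarts('[#7]-[#1]')
pyrrole = Chem.MolFromSmiles('c1cc[nH]c1')
pyrrole_h = Chem.AddHs(pyrrole)

print(matches_implicit, matches_explicit)            # False True
print(pyrrole.GetNumAtoms(), pyrrole_h.GetNumAtoms())  # 5 10
print(pyrrole.HasSubstructMatch(nh_pattern),
      pyrrole_h.HasSubstructMatch(nh_pattern))       # False True
```

Running AddHs() on every molecule makes the graphs larger, which is why the 
query-side fix below is usually more convenient for filtering.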

You can get the behavior you expect using the mergeHs argument to 
MolFromSmarts():

In [3]: p = Chem.MolFromSmarts('[#6]-[#6](=[#16])-[#1]',mergeHs=True)
In [4]: m = Chem.MolFromSmiles('CC=S')
In [5]: m.HasSubstructMatch(p)
Out[5]: True

This removes the [#1] from the graph and adds a [!H0] query to the attached 
carbon:

In [6]: Chem.MolToSmarts(p)
Out[6]: '[#6]-[#6&!H0]=[#16]'
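
The same merging can be applied to the pyrrole query from the top of this 
message. This is a sketch only, assuming an RDKit installation; the variable 
names are illustrative, and Chem.MergeQueryHs() is used here as the 
after-the-fact counterpart of the mergeHs argument:

```python
from rdkit import Chem

pyrrole = Chem.MolFromSmiles('c1cc[nH]c1')
pyridine = Chem.MolFromSmiles('c1ccncc1')

# Unmerged query: needs a real H atom in the molecule graph.
raw_query = Chem.MolFromSmarts('[#7]-[#1]')

# Merged at parse time: the [#1] atom becomes an H-count
# condition on the nitrogen.
merged_query = Chem.MolFromSmarts('[#7]-[#1]', mergeHs=True)

# Merged after parsing, for queries that already exist as Mol objects.
merged_later = Chem.MergeQueryHs(raw_query)

print(raw_query.GetNumAtoms(), merged_query.GetNumAtoms())  # 2 1
print(pyrrole.HasSubstructMatch(raw_query))      # False
print(pyrrole.HasSubstructMatch(merged_query))   # True
print(pyrrole.HasSubstructMatch(merged_later))   # True
print(pyridine.HasSubstructMatch(merged_query))  # False, N has no H
```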

This can be fixed using the alternative form of the SMARTS:

>>> p2 = Chem.MolFromSmarts('[#6]-[#6H](=[#16])')
>>> m.HasSubstructMatch(p2)
True

This is roughly equivalent to what happens above. The difference is that it's 
querying for a C that has exactly one H attached, whereas [!H0] accepts one or 
more. In this case that's fine.
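
To make the "exactly one H" versus "at least one H" distinction concrete, 
here is a small sketch, assuming an RDKit installation. The single-atom 
queries and the ethane test molecule are illustrative additions, not part of 
the PAINS set:

```python
from rdkit import Chem

# The two full queries from this thread.
exactly_one = Chem.MolFromSmarts('[#6]-[#6H](=[#16])')
at_least_one = Chem.MolFromSmarts('[#6]-[#6&!H0]=[#16]')

thioaldehyde = Chem.MolFromSmiles('CC=S')

# The thiocarbonyl carbon here can only carry one H, so both agree.
print(thioaldehyde.HasSubstructMatch(exactly_one))   # True
print(thioaldehyde.HasSubstructMatch(at_least_one))  # True

# Single-atom versions show where the two forms diverge.
carbon_h1 = Chem.MolFromSmarts('[#6H]')       # exactly one H
carbon_has_h = Chem.MolFromSmarts('[#6&!H0]')  # one or more H

ethane = Chem.MolFromSmiles('CC')  # both carbons carry three H
print(ethane.HasSubstructMatch(carbon_h1))     # False
print(ethane.HasSubstructMatch(carbon_has_h))  # True
```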

-greg



_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
