Thanks Greg,
That clears up how to fix it. I'm still a bit confused, though: I thought I
should be able to add explicit hydrogens to my SMILES to get the same result,
but I tried a simple case and it didn't work:
>>> p = Chem.MolFromSmarts('[#7]-[#1]')
>>> m = Chem.MolFromSmiles('c1cc[nH]c1')
>>> m.HasSubstructMatch(p)
False
>>> for atom in m.GetAtoms(): print(atom.GetNumExplicitHs())
...
0
0
0
1
0
Is this a distinction between a hydrogen that is named in the SMILES (and
stored as a count on its parent atom) and one that actually occurs as an atom
in the molecular graph?
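A minimal sketch of that distinction, for illustration only. It assumes RDKit
is importable as rdkit and that MolFromSmarts() accepts the mergeHs argument
described in the quoted reply below; the variable names are made up here. The
[nH] hydrogen is stored as a count on the nitrogen, so the parsed molecule has
no H atom for [#1] to match until AddHs() puts one into the graph:

```python
from rdkit import Chem

p = Chem.MolFromSmarts('[#7]-[#1]')
m = Chem.MolFromSmiles('c1cc[nH]c1')

# The graph holds only the five ring atoms; the H on N is a count, not a node.
atomic_nums = [atom.GetAtomicNum() for atom in m.GetAtoms()]
print(atomic_nums)
print(m.HasSubstructMatch(p))

# AddHs() turns the H counts into real atoms, so [#1] has something to match.
mh = Chem.AddHs(m)
print(mh.GetNumAtoms())
print(mh.HasSubstructMatch(p))

# Alternatively, leave the molecule alone and fold the [#1] into an
# H-count query on the nitrogen when parsing the SMARTS.
p_merged = Chem.MolFromSmarts('[#7]-[#1]', mergeHs=True)
print(m.HasSubstructMatch(p_merged))
```

Either route should give a match; which one is preferable probably depends on
whether the rest of the workflow expects hydrogens in the graph.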
Best,
Nick
Nicholas C. Firth | PhD Student | Cancer Therapeutics
On 1 May 2014, at 14:44, Greg Landrum <greg.land...@gmail.com> wrote:
Hi Nick,
On Thu, May 1, 2014 at 2:07 PM, Nicholas Firth <nicholas.fi...@icr.ac.uk> wrote:
I have (yet another) question about the handling of SMARTS. I have a set of
SMARTS (http://www.macinchem.org/reviews/pains/painsFilter.php) which I have
been using to perform PAINS filtering, but I've just discovered some strange
behaviour: I would expect a match in the example below.
>>> p = Chem.MolFromSmarts('[#6]-[#6](=[#16])-[#1]')
>>> m = Chem.MolFromSmiles('CC=S')
>>> m.HasSubstructMatch(p)
False
That SMARTS, as formulated and as it is currently parsed, is looking for an H
atom that is explicitly present in the molecule graph. Unless you're planning
on running AddHs() on everything, it won't work.
You can get the behavior you expect using the mergeHs argument to
MolFromSmarts():
In [3]: p = Chem.MolFromSmarts('[#6]-[#6](=[#16])-[#1]',mergeHs=True)
In [4]: m = Chem.MolFromSmiles('CC=S')
In [5]: m.HasSubstructMatch(p)
Out[5]: True
This removes the [#1] from the graph and adds a [!H0] query to the attached
carbon:
In [6]: Chem.MolToSmarts(p)
Out[6]: '[#6]-[#6&!H0]=[#16]'
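For a whole list of filter SMARTS, the same idea could be wrapped up roughly
as follows. This is only a sketch: compile_patterns() and matching_smarts()
are hypothetical helper names, not RDKit functions, and the two-entry filter
list stands in for a real PAINS set. The only RDKit calls used are
MolFromSmarts() with mergeHs=True, MolFromSmiles() and HasSubstructMatch().

```python
from rdkit import Chem


def compile_patterns(smarts_list):
    """Parse each SMARTS with mergeHs=True; fail loudly on bad input."""
    patterns = []
    for smarts in smarts_list:
        patt = Chem.MolFromSmarts(smarts, mergeHs=True)
        if patt is None:  # MolFromSmarts returns None when parsing fails
            raise ValueError("could not parse SMARTS: %r" % smarts)
        patterns.append(patt)
    return patterns


def matching_smarts(mol, smarts_list):
    """Return the SMARTS strings whose merged-H pattern matches mol."""
    patterns = compile_patterns(smarts_list)
    return [s for s, p in zip(smarts_list, patterns)
            if mol.HasSubstructMatch(p)]


# Stand-in filter list (assumption): the two patterns from this thread.
filters = ['[#6]-[#6](=[#16])-[#1]', '[#7]-[#1]']

print(matching_smarts(Chem.MolFromSmiles('CC=S'), filters))
print(matching_smarts(Chem.MolFromSmiles('c1cc[nH]c1'), filters))
```

Parsing once and reusing the pattern objects would be cheaper for a large
library; the version above re-parses on every call to keep the sketch short.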
This can be fixed using an alternative form of the SMARTS:
>>> p2 = Chem.MolFromSmarts('[#6]-[#6H](=[#16])')
>>> m.HasSubstructMatch(p2)
True
This is roughly equivalent to what happens above. The difference is that it's
querying for a C that has a single H attached. In this case that's fine.
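To make that difference concrete, here is a small sketch comparing an
"exactly one H" query with an "at least one H" query. The shortened patterns
and the thioformaldehyde test molecule are chosen here for illustration and
are not from the PAINS set; in SMARTS a bare H inside brackets after another
primitive means a total H count of exactly one, while !H0 means any non-zero
count.

```python
from rdkit import Chem

exactly_one_h = Chem.MolFromSmarts('[#6H]=[#16]')       # C with exactly 1 H
at_least_one_h = Chem.MolFromSmarts('[#6;!H0]=[#16]')   # C with 1 or more H

thioacetaldehyde = Chem.MolFromSmiles('CC=S')   # the C=S carbon has one H
thioformaldehyde = Chem.MolFromSmiles('C=S')    # the C=S carbon has two H

for name, mol in [('CC=S', thioacetaldehyde), ('C=S', thioformaldehyde)]:
    print(name,
          mol.HasSubstructMatch(exactly_one_h),
          mol.HasSubstructMatch(at_least_one_h))
```

Both queries should agree on CC=S; they are expected to differ on C=S, where
only the !H0 form matches.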
-greg
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss