On 15/11/2010 14:48, Fredrik Wallner wrote:
> Hi!
>
> Thank you for the answers Chris! I can confirm that the workaround solved my 
> problem.
> I still have to say that I find it unintuitive that a SMARTS with an explicit 
> H wouldn't match a compound with an implicit H. Even though it is implicit, 
> it is there in the actual compound.

I agree that explicit H in SMARTS should match implicit H in a 
molecule. Looking into the code, I see that the SMARTS parser will 
automatically make the hydrogens explicit on a copy of the molecule if 
it finds '[H]' in the SMARTS. But extending this to '#1' caused 
complications - the SMARTS parser is extensively used internally - so 
I have added an automatic call to AddHydrogens() in the -s option code. 
This modifies the molecule, but that would not normally be a problem.

Your original query now works (in the development code).

Chris
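
The distinction discussed in this thread (a hydrogen count on a heavy atom
versus a hydrogen atom that is actually present in the molecular graph) can
be illustrated with a small toy model. This is only a sketch and not Open
Babel code: the Atom class and the functions total_h, has_h_atom_neighbor and
add_hydrogens are hypothetical names invented for the illustration. It shows
why a count test like [#6;H] can succeed while a separate query atom like
-[#1] finds nothing to map onto until the hydrogens are made explicit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Atom:
    """Toy atom (hypothetical, not an Open Babel type)."""
    atomic_num: int
    implicit_h: int = 0
    neighbors: List["Atom"] = field(default_factory=list)


def total_h(atom: Atom) -> int:
    """Hydrogen count as a count test sees it: implicit hydrogens plus
    explicit hydrogen neighbours."""
    explicit = sum(1 for n in atom.neighbors if n.atomic_num == 1)
    return atom.implicit_h + explicit


def has_h_atom_neighbor(atom: Atom) -> bool:
    """What a separate query atom such as -[#1] needs: a real hydrogen
    node in the graph."""
    return any(n.atomic_num == 1 for n in atom.neighbors)


def add_hydrogens(atom: Atom) -> None:
    """Toy stand-in for adding hydrogens: turn each implicit hydrogen
    into an explicit neighbour atom."""
    for _ in range(atom.implicit_h):
        atom.neighbors.append(Atom(atomic_num=1, neighbors=[atom]))
    atom.implicit_h = 0


# A carbon with one implicit hydrogen, like the exocyclic =CH- carbon
# in the test molecule.
carbon = Atom(atomic_num=6, implicit_h=1)
print(total_h(carbon) == 1, has_h_atom_neighbor(carbon))  # True False

add_hydrogens(carbon)
print(total_h(carbon) == 1, has_h_atom_neighbor(carbon))  # True True
```

In this model the hydrogen count is the same before and after, which matches
the intuition that the hydrogen "is there in the actual compound"; only the
graph representation changes.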

> On 15 Nov 2010, at 15:29, Chris Morley wrote:
>
>> On 15/11/2010 12:17, Fredrik Wallner wrote:
>>> Hi all!
>>>
>>> I'm having some problems with SMARTS matching...
>>>
>>> # This is the query I would like to have worked, without modifications
>>> on either molecule or SMARTS
>>> $ echo "OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]-[#1]'
>>> 0 molecules converted
>>>
>>> # Here the explicit hydrogen is removed from the query
>>> $ echo "OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]'
>>> OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O
>>> 1 molecule converted
>>>
>>> # Here the hydrogen is required, but implicit ones are allowed
>>> $ echo "OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R;H]'
>>> OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O
>>> 1 molecule converted
>>>
>>> # Here the molecule to be tested has had an added hydrogen
>>> $ echo "OCCOc1ccccc1[CH]=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]-[#1]'
>>> OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O
>>> 1 molecule converted
>>
>> Be careful: according to a recent post by Andrew Dalke, this is not
>> explicit hydrogen, just explicitly stated implicit hydrogen. (At least
>> one OB developer did not realize the difference, and at least one does
>> not think it should matter.)
>>>
>>> # I also tried with the -h switch to add hydrogens to the molecule
>>> with no luck (does it add the hydrogens before or after the SMARTS
>>> matching?)
>>> $ echo "OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan -h
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]-[#1]'
>>> 0 molecules converted
>>>
>>> I find it kind of unintuitive that queries no. 1 and 3 give different
>>> results and I would have thought that all should match.
>>>
>>> Have I missed something or does anyone among you have an idea how to
>>> solve my problem without rewriting every SMARTS...
>>
>> The proper solution *should* be to add hydrogen with a -h option if
>> the SMARTS query has explicit hydrogen.
>>
>> The order in which the options are applied (which needs recording in
>> the documentation) is:
>>
>> ops in alphabetical order. These are the plugin options.
>> -h -b -r etc.              Most hardwired options.
>> --filter
>> -s -v                      The original versions of the SMARTS filter.
>>
>> So in versions earlier than 2.3.0 -h would add hydrogens before the
>> SMARTS test in -s. But in 2.3.0 there is an enhanced -s option that is
>> an op and which overrides the old version. Because it is an op it is
>> applied *before* the hydrogens are added by -h.
>>
>> On a Windows install, giving plugin_ops.obf a different extension
>> removes the new -s and allows query 5 to succeed, as expected.
>>
>> A workaround is to add the hydrogens first in a format which takes
>> explicit hydrogen:
>>
>> obabel -:OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O -osdf -h | obabel -isdf
>> -ocan
>> -s"[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]-[#1]"
>>
>> Sorry that the new -s option was not completely thought through. It
>> would give extra flexibility if the order of the options on the
>> command line was significant, but that would require significant code
>> changes.
>>
>> Chris
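
The option ordering described in the quoted message can be sketched as a toy
model. This is not Open Babel code, only an illustration of the stated rule:
plugin ops first, then most hardwired options, then --filter, then the
original -s and -v. The function application_order and its s_is_plugin_op
flag are hypothetical names; treating every unrecognised option as hardwired,
and sorting alphabetically within each stage, are simplifying assumptions.

```python
from typing import Iterable, List

# Stage numbers follow the order given in the quoted message.
PLUGIN_OP, HARDWIRED, FILTER, LEGACY_SMARTS = range(4)


def application_order(options: Iterable[str],
                      s_is_plugin_op: bool = True) -> List[str]:
    """Return the options in the order this toy model applies them.

    s_is_plugin_op=True models 2.3.0, where the enhanced -s is a plugin
    op; False models earlier versions, where -s is the original filter.
    """
    def stage(opt: str) -> int:
        if opt == "--filter":
            return FILTER
        if opt == "-s" and s_is_plugin_op:
            return PLUGIN_OP
        if opt in ("-s", "-v"):
            return LEGACY_SMARTS
        # Assumption: anything else is treated as a hardwired option.
        return HARDWIRED

    # Sort by stage first, then alphabetically within a stage.
    return sorted(options, key=lambda opt: (stage(opt), opt))


# 2.3.0 behaviour: the SMARTS test runs before hydrogens are added.
print(application_order(["-h", "-s"]))

# Pre-2.3.0 behaviour: hydrogens are added before the SMARTS test.
print(application_order(["-h", "-s"], s_is_plugin_op=False))
```

Under these assumptions the model reproduces the behaviour reported in the
thread: with the enhanced -s acting as an op, -h comes too late to help a
query that needs explicit hydrogen atoms.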
