On 15/11/2010 14:48, Fredrik Wallner wrote:
> Hi!
>
> Thank you for the answers Chris! I can confirm that the workaround solved my
> problem.
> I still have to say that I find it unintuitive that a SMARTS with an explicit
> H wouldn't match a compound with an implicit H. Even though it is implicit,
> it is there in the actual compound.

I agree that explicit H in SMARTS should match implicit H in a molecule.

Looking into the code, I see that the SMARTS parser will automatically
make the hydrogens explicit on a copy of the molecule if it finds '[H]'
in the SMARTS. But extending this to '#1' caused complications - the
SMARTS parser is extensively used internally - so I have added an
automatic call to AddHydrogens() in the -s option code. This modifies
the molecule, but that would not normally be a problem.

Your original query now works (in the development code).

Chris

> On 15 Nov 2010, at 15:29, Chris Morley wrote:
>
>> On 15/11/2010 12:17, Fredrik Wallner wrote:
>>> Hi all!
>>>
>>> I'm having some problems with SMARTS matching...
>>>
>>> # This is the query I would like to have worked, without modifications
>>> on either molecule or SMARTS
>>> $ echo "OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]-[#1]'
>>> 0 molecules converted
>>>
>>> # Here the explicit hydrogen is removed from the query
>>> $ echo "OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]'
>>> OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O
>>> 1 molecule converted
>>>
>>> # Here the hydrogen is required, but implicit are allowed
>>> $ echo "OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R;H]'
>>> OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O
>>> 1 molecule converted
>>>
>>> # Here the molecule to be tested has had an added hydrogen
>>> $ echo "OCCOc1ccccc1[CH]=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]-[#1]'
>>> OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O
>>> 1 molecule converted
>>
>> Be careful, according to a recent post by Andrew Dalke, this is not
>> explicit hydrogen, just explicitly stated implicit hydrogen.
>> (At least one OB developer did not realize the difference, and at
>> least one does not think it should matter.)
>>>
>>> # I also tried with the -h switch to add hydrogens to the molecule
>>> with no luck (does it add the hydrogens before or after the SMARTS
>>> matching?)
>>> $ echo "OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O" | obabel -ismi -ocan -h
>>> -s'[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]-[#1]'
>>> 0 molecules converted
>>>
>>> I find it kind of unintuitive that queries no. 1 and 3 give different
>>> results and I would have thought that all should match.
>>>
>>> Have I missed something, or does anyone among you have an idea how to
>>> solve my problem without rewriting every SMARTS...
>>
>> The proper solution *should* be to add hydrogen with a -h option if
>> the SMARTS query has explicit hydrogen.
>>
>> The order the options are applied is (which needs recording in the
>> documentation):
>>
>>   ops in alphabetical order   These are the plugin options.
>>   -h -b -r etc.               Most hardwired options.
>>   --filter
>>   -s -v                       The original versions of the SMARTS filter.
>>
>> So in versions earlier than 2.3.0 -h would add hydrogens before the
>> SMARTS test in -s. But in 2.3.0 there is an enhanced -s option that is
>> an op and which overrides the old version. Because it is an op it is
>> applied *before* the hydrogens are added by -h.
>>
>> On a Windows install giving plugin_ops.obf a different extension
>> removes the new -s and allows query 5 to succeed, as expected.
>>
>> A workaround is to add the hydrogens first in a format which takes
>> explicit hydrogen:
>>
>> obabel -:OCCOc1ccccc1C=C1C(=O)NC(=S)NC1=O -osdf -h | obabel -isdf
>> -ocan
>> -s"[#6]-1(-[#6](~[!#6&!#1]~[#6]-[!#6&!#1]-[#6]-1=[!#6&!#1])~[!#6&!#1])=[#6;!R]-[#1]"
>>
>> Sorry that the new -s option was not completely thought through.
>> It would give extra flexibility if the order of the options on the
>> command line was significant, but that would require significant code
>> changes.
>>
>> Chris

_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
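
The option ordering described in the thread can be sketched as a small model. This is only an illustration of the stated order (plugin ops alphabetically, then hardwired options, then --filter, then the original -s and -v); it is not Open Babel code, and the names application_order, HARDWIRED and OLD_SMARTS_FILTERS are hypothetical. Sorting by name inside the later stages is an assumption made purely to keep the output deterministic, since the thread only states alphabetical order for ops.

```python
# Illustrative model only: none of these names are Open Babel APIs.
HARDWIRED = {"-h", "-b", "-r"}        # "most hardwired options" in the thread
OLD_SMARTS_FILTERS = {"-s", "-v"}     # the original SMARTS filter options


def application_order(options, plugin_ops):
    """Return options in the order the thread says they are applied.

    plugin_ops is the set of options implemented as ops (plugins);
    an op overrides a hardwired or original option of the same name.
    """
    def stage(opt):
        if opt in plugin_ops:
            return 0                  # ops are applied first
        if opt in HARDWIRED:
            return 1
        if opt == "--filter":
            return 2
        if opt in OLD_SMARTS_FILTERS:
            return 3                  # original -s / -v are applied last
        raise ValueError(f"unknown option: {opt}")

    # Sort by stage, then by name (alphabetical order within a stage).
    return sorted(options, key=lambda opt: (stage(opt), opt))


# Before 2.3.0: -s is the original filter, so -h adds hydrogens first.
print(application_order(["-s", "-h"], plugin_ops=set()))

# In 2.3.0: the enhanced -s is an op, so it runs before -h.
print(application_order(["-s", "-h"], plugin_ops={"-s"}))
```

In this model the second call places -s ahead of -h, which matches the explanation of why adding -h did not help the fifth query in 2.3.0.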