Re: [OpenBabel-Devel] Convenience functions

Stefano Forli Thu, 02 Mar 2017 15:19:22 -0800

Noel,

I'm not sure how to describe all the cases with a switch statement.
The original idea of using SMARTS came from your suggestion:


> The thing is, if I was matching a functional group, I would just use a
> SMARTS pattern. This is a general solution for 99% of cases, rather
> than having a function for each one. We have a set of SMARTS patterns
> for functional groups in the .txt file for FP3. The added benefit of a
> SMARTS pattern is that you have the match of each atom to each virtual
> atom of the pattern. For a function of OBAtom, it's just boolean true
> or false.

and you convinced me that's an excellent idea.

Hydrogen bond types can be: 0 (none), 1 (acceptor), 2 (donor), acceptor/donor 
(3), and 
there are several conditions that need to be checked.
For that, I'm creating SMARTS patterns which would reproduce the behavior of 
the current 
IsHBondAcceptor() which are stored in text datafile. As an example, this is a 
subset of 
the conditions in which oxygen is not considered an hbonding oxygen:

  # nitro-oxygen
  [#8]~N~[#8]           0 0

  # aromatic-bound oxygen
  [a]-[#8]-[a]                1 0

  # ester sp3 oxygen
  [*]-[#8]-[#6]=[#8]    1 0

  # sulfone
  [#6][#16;X4](=[#8])(=[#8])[#6]        2 0

  # hydroxyl
  [#8]-[#1]     0 3

The line format is (pattern, patternIdx, hbType). Basically, the typer checks 
the 
condition OBAtom.GetIdx() == pattern[patternIdx], then return type "hbType".
Between N, O, S and F, there are about 20 or so patterns to be tested, so I 
though doing 
the perception once and storing it would have been a good idea. On the other 
hand, I don't 
think performance is an issue for this kind of functions.

I have no idea how to implement this with a switch statement without having to 
do the 
bond/atom walking manually, as I did in the functions I contributed already.

Hope this helps,

S




On 03/02/2017 01:28 PM, Noel O'Boyle wrote:
> The presence of a class or not is an "implementation detail", as they
> say. That is, the user doesn't need to know anything about that.
> However, as I am currently replacing the use of SMARTS patterns for
> atom typers with switch statements, I'd recommend you to avoid using
> SMARTS for this in the first place and thus avoid any need for a class
> or a file to read things from. Maybe you should describe the exact
> problem you are solving with some examples and I can explain better.
>
> - Noel
>
> On 2 March 2017 at 19:51, Stefano Forli <[email protected]> wrote:
>> Noel,
>> thanks for the clarification, the only reason why I was looking at the lazy
>> mechanism was because of previous code.
>>
>> I'm OK with the simple function, although I think there's still a need for a
>> dedicated class behind which gets called to parse the different SMARTS
>> patterns from a data file and match them with the requested atom.
>>
>> S
>>
>>
>>
>>
>> On 03/02/2017 02:22 AM, Noel O'Boyle wrote:
>>>
>>> Hi Stefano,
>>>
>>> Sounds good. But the guidelines are not unfortunately the existing
>>> guide. I'm currently in the process of rewriting/removing as much of
>>> the croft as possible and the Lazy Evaluation mechanism is in my
>>> sights. It's a legacy from the original codebase. It'll be difficult
>>> to change this now, but at least we can avoid adding anything. I won't
>>> go too much into why this is, but suffice to say that OB developers
>>> spend some time working around the lazy evaluation or if they don't
>>> they triggeri it multiple times unneccessarily.
>>>
>>> In short, see if you can write a function that just takes an atom (or
>>> pair of atoms, or whatever) and returns an answer., e.g.
>>> OBAtomGetHBondType(OBAtom*). The simpler solution really is the best
>>> one here.
>>>
>>> - Noel
>>>
>>> On 1 March 2017 at 23:17, Stefano Forli <[email protected]> wrote:
>>>>
>>>> Noel,
>>>> quite the contrary, I'm far from being pissed at you, by all means.
>>>> I like your suggestion, but I don't know if I can do it right away, there
>>>> are still a few things about the facade programming paradigm that escape
>>>> my
>>>> hobbist programming training.
>>>>
>>>> Following up on the discussion about the hydrogen bond, I had a quick
>>>> chat
>>>> with one of my students which is starting to write code based on
>>>> OpenBabel.
>>>> We took a shot at designing an OBHBondTyper class which should behave
>>>> similarly to OBAromaticTyper, and my idea was to store the information in
>>>> a
>>>> vector.
>>>>
>>>> If I'm not mistaken, though, the aromatic typer works in a lazy way that
>>>> looks similar to what you're describing for the OBResidueFacade, storing
>>>> the
>>>> information in a flag instead of a vector, is that correct? Or is the
>>>> flag
>>>> is just in vector? I tried looking for the definition of HasFlag(), but I
>>>> couldn't find it.
>>>>
>>>> Either way, I was thinking to start by writing this HB class (which I
>>>> probably understand better), try implementing the ob-standard lazy
>>>> evaluation mechanism, and integrate it the [Begin|End]Modify() process.
>>>>
>>>> We can do a git pull and then fix and adapt it according to the feedback
>>>> other developers suggestions.
>>>>
>>>> This would be a great chance for us to understand how to contribute code
>>>> that would integrate better and match the guidelines you guys follow.
>>>>
>>>> Thanks (for the patience),
>>>>
>>>> S
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 02/26/2017 02:05 AM, Noel O'Boyle wrote:
>>>>>
>>>>>
>>>>> Thanks Stefano for not getting (too!) pissed off with me. :-) We still
>>>>> don't have the clear API guidelines you asked for last time, but I
>>>>> think that these discussions are clarifying things for me at least,
>>>>> and we could probably start writing something up.
>>>>>
>>>>> I was thinking that your idea is similar to the rationale behind
>>>>> OBStereoFacade. Well, simply put, we could have an OBResidueFacade
>>>>> class which you initialise with a molecule and behind the scenes it
>>>>> then stores the Atom->PDBAtomId correspondance in a map or vector by
>>>>> iterating over the residues. Then you would have the same method you
>>>>> described, but it wouldn't do any iteration, but just look up the Id
>>>>> in the std::map or vector. So, it's a convenience class, separate from
>>>>> the core API, and it's efficient. If no-one disagrees and this makes
>>>>> sense to you, do you want to have a go at writing it?
>>>>>
>>>>> - Noel
>>>>>
>>>>> On 25 February 2017 at 23:03, Geoffrey Hutchison
>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>
>>>>>>> About the PDB atom name, unfortunately I don't fully understand the
>>>>>>> performance issue implied in my suggestion, but from an interface
>>>>>>> point of
>>>>>>> view, it seems more intuitive to access an atom property from OBAtom
>>>>>>> instead
>>>>>>> of going back to the OBResidue (and pass the OBAtom).
>>>>>>
>>>>>>
>>>>>>
>>>>>> The OBAtom should already have a pointer to the OBResidue. If it can be
>>>>>> done with generic data in the OBAtom, that's fine, but I don't think I
>>>>>> want
>>>>>> to add data to each OBAtom. If I translate a nanotube or nanoparticle
>>>>>> (or
>>>>>> any other non-biomolecule) I'd have to store in memory a bunch of
>>>>>> residue
>>>>>> pointers that would never get used.
>>>>>>
>>>>>>> In this case, the best approach I can think of is to write a
>>>>>>> OBHBondTyper class to perceive the hbond character (similarly to what
>>>>>>> OBAromaticTyper does), then each atom should have a simple IsHBond()
>>>>>>> method
>>>>>>> that would return 0, 1 (donor), 2 (acceptor), 3(donor/acceptor),
>>>>>>> 4(...?).
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yes, this is the way to go to add "convenience functions." Noel's
>>>>>> suggestion is to keep the core API restrained, but that doesn't mean
>>>>>> there
>>>>>> can't be convenience classes or static methods. I think HBondTyper
>>>>>> would be
>>>>>> a good example, since there can (and should) be multiple models of
>>>>>> "what is
>>>>>> a hydrogen bond."
>>>>>>
>>>>>>> At the same time, I support Maciek's idea: useful functions that can
>>>>>>> streamline the use of scripting languages such as Python should be
>>>>>>> grouped
>>>>>>> and conserved.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I'm not sure why these can't be put into wrappers like Pybel. As is,
>>>>>> there is considerable "helper code" in Pybel, etc.
>>>>>>
>>>>>> -Geoff
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>  Stefano Forli, PhD
>>>>
>>>>  Assistant Professor of ISCB
>>>>  Molecular Graphics Laboratory
>>>>
>>>>  Dept. of Integrative Structural
>>>>  and Computational Biology, MB-112A
>>>>  The Scripps Research Institute
>>>>  10550  North Torrey Pines Road
>>>>  La Jolla,  CA 92037-1000,  USA.
>>>>
>>>>     tel: +1 (858)784-2055
>>>>     fax: +1 (858)784-2860
>>>>     email: [email protected]
>>>>     http://www.scripps.edu/~forli/
>>
>>
>> --
>>
>>  Stefano Forli, PhD
>>
>>  Assistant Professor of ISCB
>>  Molecular Graphics Laboratory
>>
>>  Dept. of Integrative Structural
>>  and Computational Biology, MB-112A
>>  The Scripps Research Institute
>>  10550  North Torrey Pines Road
>>  La Jolla,  CA 92037-1000,  USA.
>>
>>     tel: +1 (858)784-2055
>>     fax: +1 (858)784-2860
>>     email: [email protected]
>>     http://www.scripps.edu/~forli/

-- 

  Stefano Forli, PhD

  Assistant Professor of ISCB
  Molecular Graphics Laboratory

  Dept. of Integrative Structural
  and Computational Biology, MB-112A
  The Scripps Research Institute
  10550  North Torrey Pines Road
  La Jolla,  CA 92037-1000,  USA.

     tel: +1 (858)784-2055
     fax: +1 (858)784-2860
     email: [email protected]
     http://www.scripps.edu/~forli/

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
OpenBabel-Devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openbabel-devel

Re: [OpenBabel-Devel] Convenience functions

Reply via email to