Re: [OpenBabel-Devel] Convenience functions

2017-03-06 Thread Stefano Forli
Hi Noel,
I think I understand your idea, and it's totally compatible with the current
design we're following. Indeed, the API is going to be for the atom.

In fact, the actual syntax of the datafile I sent before has an extra column
(which I omitted) that specifies the element of the pattern, which is used as
a key for the hash storing the patterns.
The advice on using the first atom is clever, although I think the element
still needs to be specified (unless you think there's an unambiguous way to
get that from the pattern).

I agree that it's a good idea to move away from text files, although there's a
certain degree of convenience in storing and managing large data sets (e.g.,
force field, atom type parameters) outside the actual source code.

One possible compromise may be to keep the text files as part of the source
'package' and generate .h files on the fly at compilation time. I've seen this
pattern in several programs, and I think it may be convenient.
On the other hand, I don't know how much work it is to write and maintain such
on-the-fly scripts/programs, so feel free to ignore the suggestion if that's
the case.
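
[Editor's sketch] To give a rough idea of the effort involved, here is a minimal build-time generator, assuming the three-column line format (pattern, patternIdx, hbType) quoted further down in this thread. The names GenerateHeader, HBPattern and kHBPatterns are hypothetical and not part of Open Babel; this is an illustration, not a proposal for the actual build system.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical build-time helper: converts pattern lines of the form
//   <SMARTS> <patternIdx> <hbType>
// into the text of a C++ header defining a constant array.
// Blank lines, comment lines starting with '#', and malformed lines are skipped.
// Assumes the SMARTS contains no '"' or '\' characters; a real generator
// would have to escape them before writing a string literal.
std::string GenerateHeader(std::istream& in) {
  std::ostringstream out;
  out << "// Generated file, do not edit.\n"
      << "struct HBPattern { const char* smarts; int idx; int type; };\n"
      << "static const HBPattern kHBPatterns[] = {\n";
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string smarts;
    int idx = 0, type = 0;
    if (!(fields >> smarts) || smarts[0] == '#') continue;  // blank or comment
    if (!(fields >> idx >> type)) continue;                 // malformed line
    out << "  {\"" << smarts << "\", " << idx << ", " << type << "},\n";
  }
  out << "};\n";
  return out.str();
}
```

The build would run this once over the text file and compile the result, so the data stays editable as text while the binary carries no runtime file dependency.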

I hope to come back with a working prototype soon.
Thanks for the guidance,

S


On 03/06/2017 07:08 AM, Noel O'Boyle wrote:
> The previous discussion was about having such a function in the public API.
> But anyway, let's say you do use smarts, you can still combine it with a
> switch statement on element and maybe on something else so that you don't
> end up testing the oxygen patterns against a nitrogen (this will already
> make it three times faster or so). Regarding your specific examples I think
> you just need to rewrite so that the first atom is the query atom. Also
> avoid using an explicit hydrogen - this won't work like you expect.
>
> I'm hoping to do away with dependencies on text files (as this complicates
> having an all in one binary) at some point so putting your patterns in a
> const char array would be welcome. Doing the perception once and then
> storing it may indeed be a good idea but the user can do this without us
> getting involved. Maintaining perceived data is a source of errors and
> inefficiencies in the current codebase.
>
> Noel
>
> On Thursday, 2 March 2017, Stefano Forli wrote:
>
> Noel,
>
> I'm not sure how to describe all the cases with a switch statement.
> The original idea of using SMARTS came from your suggestion:
>
> The thing is, if I was matching a functional group, I would just use a
> SMARTS pattern. This is a general solution for 99% of cases, rather
> than having a function for each one. We have a set of SMARTS patterns
> for functional groups in the .txt file for FP3. The added benefit of a
> SMARTS pattern is that you have the match of each atom to each virtual
> atom of the pattern. For a function of OBAtom, it's just boolean true
> or false.
>
>
> and you convinced me that's an excellent idea.
>
> Hydrogen bond types can be: 0 (none), 1 (acceptor), 2 (donor),
> 3 (acceptor/donor), and there are several conditions that need to be
> checked.
> For that, I'm creating SMARTS patterns which would reproduce the behavior
> of the current IsHBondAcceptor() which are stored in a text datafile. As an
> example, this is a subset of the conditions in which oxygen is not
> considered an hbonding oxygen:
>
>  # nitro-oxygen
>  [#8]~N~[#8]   0 0
>
>  # aromatic-bound oxygen
>  [a]-[#8]-[a] 1 0
>
>  # ester sp3 oxygen
>  [*]-[#8]-[#6]=[#8] 1 0
>
>  # sulfone
>  [#6][#16;X4](=[#8])(=[#8])[#6] 2 0
>
>  # hydroxyl
>  [#8]-[#1]  0 3
>
> The line format is (pattern, patternIdx, hbType). Basically, the typer
> checks the condition OBAtom.GetIdx() == pattern[patternIdx], then returns
> type "hbType".
> Between N, O, S and F, there are about 20 or so patterns to be tested, so
> I thought doing the perception once and storing it would have been a good
> idea. On the other hand, I don't think performance is an issue for this
> kind of function.
>
> I have no idea how to implement this with a switch statement without
> having to do the bond/atom walking manually, as I did in the functions I
> contributed already.
>
> Hope this helps,
>
> S
>
>
>
>
> On 03/02/2017 01:28 PM, Noel O'Boyle wrote:
>
> The presence of a class or not is an "implementation detail", as they
> say. That is, the user doesn't need to know anything about that.
> However, as I am currently replacing the use of SMARTS patterns for
> atom typers with switch statements, I'd recommend you to avoid using
> SMARTS for this in the first place and thus avoid any need for a class
> or a file to read things from. Maybe you should describe the exact

Re: [OpenBabel-Devel] Convenience functions

2017-03-06 Thread Noel O'Boyle
The previous discussion was about having such a function in the public API.
But anyway, let's say you do use smarts, you can still combine it with a
switch statement on element and maybe on something else so that you don't
end up testing the oxygen patterns against a nitrogen (this will already
make it three times faster or so). Regarding your specific examples I think
you just need to rewrite so that the first atom is the query atom. Also
avoid using an explicit hydrogen - this won't work like you expect.

I'm hoping to do away with dependencies on text files (as this complicates
having an all in one binary) at some point so putting your patterns in a
const char array would be welcome. Doing the perception once and then
storing it may indeed be a good idea but the user can do this without us
getting involved. Maintaining perceived data is a source of errors and
inefficiencies in the current codebase.

Noel

On Thursday, 2 March 2017, Stefano Forli  wrote:

> Noel,
>
> I'm not sure how to describe all the cases with a switch statement.
> The original idea of using SMARTS came from your suggestion:
>
>> The thing is, if I was matching a functional group, I would just use a
>> SMARTS pattern. This is a general solution for 99% of cases, rather
>> than having a function for each one. We have a set of SMARTS patterns
>> for functional groups in the .txt file for FP3. The added benefit of a
>> SMARTS pattern is that you have the match of each atom to each virtual
>> atom of the pattern. For a function of OBAtom, it's just boolean true
>> or false.
>>
>
> and you convinced me that's an excellent idea.
>
> Hydrogen bond types can be: 0 (none), 1 (acceptor), 2 (donor),
> 3 (acceptor/donor), and there are several conditions that need to be
> checked.
> For that, I'm creating SMARTS patterns which would reproduce the behavior
> of the current IsHBondAcceptor() which are stored in a text datafile. As an
> example, this is a subset of the conditions in which oxygen is not
> considered an hbonding oxygen:
>
>  # nitro-oxygen
>  [#8]~N~[#8]   0 0
>
>  # aromatic-bound oxygen
>  [a]-[#8]-[a] 1 0
>
>  # ester sp3 oxygen
>  [*]-[#8]-[#6]=[#8] 1 0
>
>  # sulfone
>  [#6][#16;X4](=[#8])(=[#8])[#6] 2 0
>
>  # hydroxyl
>  [#8]-[#1]  0 3
>
> The line format is (pattern, patternIdx, hbType). Basically, the typer
> checks the condition OBAtom.GetIdx() == pattern[patternIdx], then returns
> type "hbType".
> Between N, O, S and F, there are about 20 or so patterns to be tested, so
> I thought doing the perception once and storing it would have been a good
> idea. On the other hand, I don't think performance is an issue for this
> kind of function.
>
> I have no idea how to implement this with a switch statement without
> having to do the bond/atom walking manually, as I did in the functions I
> contributed already.
>
> Hope this helps,
>
> S
>
>
>
>
> On 03/02/2017 01:28 PM, Noel O'Boyle wrote:
>
>> The presence of a class or not is an "implementation detail", as they
>> say. That is, the user doesn't need to know anything about that.
>> However, as I am currently replacing the use of SMARTS patterns for
>> atom typers with switch statements, I'd recommend you to avoid using
>> SMARTS for this in the first place and thus avoid any need for a class
>> or a file to read things from. Maybe you should describe the exact
>> problem you are solving with some examples and I can explain better.
>>
>> - Noel
>>
>> On 2 March 2017 at 19:51, Stefano Forli  wrote:
>>
>>> Noel,
>>> thanks for the clarification, the only reason why I was looking at the
>>> lazy mechanism was because of previous code.
>>>
>>> I'm OK with the simple function, although I think there's still a need
>>> for a dedicated class behind which gets called to parse the different
>>> SMARTS patterns from a data file and match them with the requested atom.
>>>
>>> S
>>>
>>>
>>>
>>>
>>> On 03/02/2017 02:22 AM, Noel O'Boyle wrote:
>>>

>>>> Hi Stefano,
>>>>
>>>> Sounds good. But the guidelines are not unfortunately the existing
>>>> guide. I'm currently in the process of rewriting/removing as much of
>>>> the cruft as possible and the Lazy Evaluation mechanism is in my
>>>> sights. It's a legacy from the original codebase. It'll be difficult
>>>> to change this now, but at least we can avoid adding anything. I won't
>>>> go too much into why this is, but suffice to say that OB developers
>>>> spend some time working around the lazy evaluation or if they don't
>>>> they trigger it multiple times unnecessarily.
>>>>
>>>> In short, see if you can write a function that just takes an atom (or
>>>> pair of atoms, or whatever) and returns an answer, e.g.
>>>> OBAtomGetHBondType(OBAtom*). The simpler solution really is the best
>>>> one here.
>>>>
>>>> - Noel
>>>>
>>>> On 1 March 2017 at 23:17, Stefano Forli wrote:
>>>>>
>>>>> Noel,
>>>>> quite the contrary, I'm far from being

Re: [OpenBabel-Devel] Convenience functions

2017-03-02 Thread Stefano Forli
Noel,

I'm not sure how to describe all the cases with a switch statement.
The original idea of using SMARTS came from your suggestion:

> The thing is, if I was matching a functional group, I would just use a
> SMARTS pattern. This is a general solution for 99% of cases, rather
> than having a function for each one. We have a set of SMARTS patterns
> for functional groups in the .txt file for FP3. The added benefit of a
> SMARTS pattern is that you have the match of each atom to each virtual
> atom of the pattern. For a function of OBAtom, it's just boolean true
> or false.

and you convinced me that's an excellent idea.

Hydrogen bond types can be: 0 (none), 1 (acceptor), 2 (donor), 3
(acceptor/donor), and there are several conditions that need to be checked.
For that, I'm creating SMARTS patterns which would reproduce the behavior of
the current IsHBondAcceptor() which are stored in a text datafile. As an
example, this is a subset of the conditions in which oxygen is not considered
an hbonding oxygen:

  # nitro-oxygen
  [#8]~N~[#8]   0 0

  # aromatic-bound oxygen
  [a]-[#8]-[a] 1 0

  # ester sp3 oxygen
  [*]-[#8]-[#6]=[#8] 1 0

  # sulfone
  [#6][#16;X4](=[#8])(=[#8])[#6] 2 0

  # hydroxyl
  [#8]-[#1] 0 3

The line format is (pattern, patternIdx, hbType). Basically, the typer checks
the condition OBAtom.GetIdx() == pattern[patternIdx], then returns type
"hbType".
Between N, O, S and F, there are about 20 or so patterns to be tested, so I
thought doing the perception once and storing it would have been a good idea.
On the other hand, I don't think performance is an issue for this kind of
function.
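
[Editor's sketch] The check described above, "is the requested atom sitting at position patternIdx in some match of the pattern", can be written without reference to any particular SMARTS engine. In this sketch the matches are assumed to arrive as one vector of molecule atom indices per match, in pattern order; the names HBType, HBRule and ApplyRule are hypothetical. The numeric values follow the message (note that 1 | 2 == 3, so they also combine as bit flags).

```cpp
#include <cstddef>
#include <vector>

// Hydrogen-bond types as listed in the message.
enum HBType { HB_NONE = 0, HB_ACCEPTOR = 1, HB_DONOR = 2, HB_BOTH = 3 };

// The two numeric columns of a datafile line.
struct HBRule {
  std::size_t patternIdx;  // position of the atom of interest in the pattern
  HBType type;             // type to assign when that position is matched
};

// 'matches' holds one vector of molecule atom indices per match, ordered
// like the atoms in the pattern. Returns true and sets 'result' if the
// atom with index 'atomIdx' occupies position rule.patternIdx in any match.
bool ApplyRule(const HBRule& rule,
               const std::vector<std::vector<int> >& matches,
               int atomIdx, HBType& result) {
  for (const std::vector<int>& match : matches) {
    if (rule.patternIdx < match.size() && match[rule.patternIdx] == atomIdx) {
      result = rule.type;
      return true;
    }
  }
  return false;
}
```

For the hydroxyl line above (patternIdx 0, hbType 3), a match list {{4, 5}, {7, 8}} would type atoms 4 and 7 as acceptor/donor and leave atoms 5 and 8 untouched.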

I have no idea how to implement this with a switch statement without having to
do the bond/atom walking manually, as I did in the functions I contributed
already.

Hope this helps,

S




On 03/02/2017 01:28 PM, Noel O'Boyle wrote:
> The presence of a class or not is an "implementation detail", as they
> say. That is, the user doesn't need to know anything about that.
> However, as I am currently replacing the use of SMARTS patterns for
> atom typers with switch statements, I'd recommend you to avoid using
> SMARTS for this in the first place and thus avoid any need for a class
> or a file to read things from. Maybe you should describe the exact
> problem you are solving with some examples and I can explain better.
>
> - Noel
>
> On 2 March 2017 at 19:51, Stefano Forli  wrote:
>> Noel,
>> thanks for the clarification, the only reason why I was looking at the lazy
>> mechanism was because of previous code.
>>
>> I'm OK with the simple function, although I think there's still a need for a
>> dedicated class behind which gets called to parse the different SMARTS
>> patterns from a data file and match them with the requested atom.
>>
>> S
>>
>>
>>
>>
>> On 03/02/2017 02:22 AM, Noel O'Boyle wrote:
>>>
>>> Hi Stefano,
>>>
>>> Sounds good. But the guidelines are not unfortunately the existing
>>> guide. I'm currently in the process of rewriting/removing as much of
>>> the cruft as possible and the Lazy Evaluation mechanism is in my
>>> sights. It's a legacy from the original codebase. It'll be difficult
>>> to change this now, but at least we can avoid adding anything. I won't
>>> go too much into why this is, but suffice to say that OB developers
>>> spend some time working around the lazy evaluation or if they don't
>>> they trigger it multiple times unnecessarily.
>>>
>>> In short, see if you can write a function that just takes an atom (or
>>> pair of atoms, or whatever) and returns an answer, e.g.
>>> OBAtomGetHBondType(OBAtom*). The simpler solution really is the best
>>> one here.
>>>
>>> - Noel
>>>
>>> On 1 March 2017 at 23:17, Stefano Forli  wrote:

>>>> Noel,
>>>> quite the contrary, I'm far from being pissed at you, by all means.
>>>> I like your suggestion, but I don't know if I can do it right away, there
>>>> are still a few things about the facade programming paradigm that escape
>>>> my hobbyist programming training.
>>>>
>>>> Following up on the discussion about the hydrogen bond, I had a quick
>>>> chat with one of my students which is starting to write code based on
>>>> OpenBabel.
>>>> We took a shot at designing an OBHBondTyper class which should behave
>>>> similarly to OBAromaticTyper, and my idea was to store the information in
>>>> a vector.
>>>>
>>>> If I'm not mistaken, though, the aromatic typer works in a lazy way that
>>>> looks similar to what you're describing for the OBResidueFacade, storing
>>>> the information in a flag instead of a vector, is that correct? Or is the
>>>> flag is just in vector? I tried looking for the definition of HasFlag(),
>>>> but I couldn't find it.
>>>>
>>>> Either way, I was thinking to start by writing this HB class (which I
>>>> probably understand better), try implementing the ob-standard lazy
>>>> evaluation mechanism, and integrate it

Re: [OpenBabel-Devel] Convenience functions

2017-03-02 Thread Noel O'Boyle
The presence of a class or not is an "implementation detail", as they
say. That is, the user doesn't need to know anything about that.
However, as I am currently replacing the use of SMARTS patterns for
atom typers with switch statements, I'd recommend you to avoid using
SMARTS for this in the first place and thus avoid any need for a class
or a file to read things from. Maybe you should describe the exact
problem you are solving with some examples and I can explain better.

- Noel

On 2 March 2017 at 19:51, Stefano Forli  wrote:
> Noel,
> thanks for the clarification, the only reason why I was looking at the lazy
> mechanism was because of previous code.
>
> I'm OK with the simple function, although I think there's still a need for a
> dedicated class behind which gets called to parse the different SMARTS
> patterns from a data file and match them with the requested atom.
>
> S
>
>
>
>
> On 03/02/2017 02:22 AM, Noel O'Boyle wrote:
>>
>> Hi Stefano,
>>
>> Sounds good. But the guidelines are not unfortunately the existing
>> guide. I'm currently in the process of rewriting/removing as much of
>> the cruft as possible and the Lazy Evaluation mechanism is in my
>> sights. It's a legacy from the original codebase. It'll be difficult
>> to change this now, but at least we can avoid adding anything. I won't
>> go too much into why this is, but suffice to say that OB developers
>> spend some time working around the lazy evaluation or if they don't
>> they trigger it multiple times unnecessarily.
>>
>> In short, see if you can write a function that just takes an atom (or
>> pair of atoms, or whatever) and returns an answer, e.g.
>> OBAtomGetHBondType(OBAtom*). The simpler solution really is the best
>> one here.
>>
>> - Noel
>>
>> On 1 March 2017 at 23:17, Stefano Forli  wrote:
>>>
>>> Noel,
>>> quite the contrary, I'm far from being pissed at you, by all means.
>>> I like your suggestion, but I don't know if I can do it right away, there
>>> are still a few things about the facade programming paradigm that escape
>>> my hobbyist programming training.
>>>
>>> Following up on the discussion about the hydrogen bond, I had a quick
>>> chat with one of my students which is starting to write code based on
>>> OpenBabel.
>>> We took a shot at designing an OBHBondTyper class which should behave
>>> similarly to OBAromaticTyper, and my idea was to store the information in
>>> a vector.
>>>
>>> If I'm not mistaken, though, the aromatic typer works in a lazy way that
>>> looks similar to what you're describing for the OBResidueFacade, storing
>>> the information in a flag instead of a vector, is that correct? Or is the
>>> flag is just in vector? I tried looking for the definition of HasFlag(),
>>> but I couldn't find it.
>>>
>>> Either way, I was thinking to start by writing this HB class (which I
>>> probably understand better), try implementing the ob-standard lazy
>>> evaluation mechanism, and integrate it the [Begin|End]Modify() process.
>>>
>>> We can do a git pull and then fix and adapt it according to the feedback
>>> other developers suggestions.
>>>
>>> This would be a great chance for us to understand how to contribute code
>>> that would integrate better and match the guidelines you guys follow.
>>>
>>> Thanks (for the patience),
>>>
>>> S
>>>
>>>
>>>
>>>
>>>
>>> On 02/26/2017 02:05 AM, Noel O'Boyle wrote:


>>>> Thanks Stefano for not getting (too!) pissed off with me. :-) We still
>>>> don't have the clear API guidelines you asked for last time, but I
>>>> think that these discussions are clarifying things for me at least,
>>>> and we could probably start writing something up.
>>>>
>>>> I was thinking that your idea is similar to the rationale behind
>>>> OBStereoFacade. Well, simply put, we could have an OBResidueFacade
>>>> class which you initialise with a molecule and behind the scenes it
>>>> then stores the Atom->PDBAtomId correspondence in a map or vector by
>>>> iterating over the residues. Then you would have the same method you
>>>> described, but it wouldn't do any iteration, but just look up the Id
>>>> in the std::map or vector. So, it's a convenience class, separate from
>>>> the core API, and it's efficient. If no-one disagrees and this makes
>>>> sense to you, do you want to have a go at writing it?
>>>>
>>>> - Noel
>>>>
>>>> On 25 February 2017 at 23:03, Geoffrey Hutchison wrote:
>>>>>>
>>>>>>
>>>>>> About the PDB atom name, unfortunately I don't fully understand the
>>>>>> performance issue implied in my suggestion, but from an interface
>>>>>> point of view, it seems more intuitive to access an atom property from
>>>>>> OBAtom instead of going back to the OBResidue (and pass the OBAtom).
>>>>>
>>>>>
>>>>>
>>>>> The OBAtom should already have a pointer to the OBResidue. If it can be
>>>>> done with generic data in the OBAtom, that's fine, but I don't think I
>>>>> want to

Re: [OpenBabel-Devel] Convenience functions

2017-03-02 Thread Stefano Forli
Noel,
thanks for the clarification, the only reason why I was looking at the lazy
mechanism was because of previous code.

I'm OK with the simple function, although I think there's still a need for a
dedicated class behind which gets called to parse the different SMARTS
patterns from a data file and match them with the requested atom.

S



On 03/02/2017 02:22 AM, Noel O'Boyle wrote:
> Hi Stefano,
>
> Sounds good. But the guidelines are not unfortunately the existing
> guide. I'm currently in the process of rewriting/removing as much of
> the cruft as possible and the Lazy Evaluation mechanism is in my
> sights. It's a legacy from the original codebase. It'll be difficult
> to change this now, but at least we can avoid adding anything. I won't
> go too much into why this is, but suffice to say that OB developers
> spend some time working around the lazy evaluation or if they don't
> they trigger it multiple times unnecessarily.
>
> In short, see if you can write a function that just takes an atom (or
> pair of atoms, or whatever) and returns an answer, e.g.
> OBAtomGetHBondType(OBAtom*). The simpler solution really is the best
> one here.
>
> - Noel
>
> On 1 March 2017 at 23:17, Stefano Forli  wrote:
>> Noel,
>> quite the contrary, I'm far from being pissed at you, by all means.
>> I like your suggestion, but I don't know if I can do it right away, there
>> are still a few things about the facade programming paradigm that escape my
>> hobbyist programming training.
>>
>> Following up on the discussion about the hydrogen bond, I had a quick chat
>> with one of my students which is starting to write code based on OpenBabel.
>> We took a shot at designing an OBHBondTyper class which should behave
>> similarly to OBAromaticTyper, and my idea was to store the information in a
>> vector.
>>
>> If I'm not mistaken, though, the aromatic typer works in a lazy way that
>> looks similar to what you're describing for the OBResidueFacade, storing the
>> information in a flag instead of a vector, is that correct? Or is the flag
>> is just in vector? I tried looking for the definition of HasFlag(), but I
>> couldn't find it.
>>
>> Either way, I was thinking to start by writing this HB class (which I
>> probably understand better), try implementing the ob-standard lazy
>> evaluation mechanism, and integrate it the [Begin|End]Modify() process.
>>
>> We can do a git pull and then fix and adapt it according to the feedback
>> other developers suggestions.
>>
>> This would be a great chance for us to understand how to contribute code
>> that would integrate better and match the guidelines you guys follow.
>>
>> Thanks (for the patience),
>>
>> S
>>
>>
>>
>>
>>
>> On 02/26/2017 02:05 AM, Noel O'Boyle wrote:
>>>
>>> Thanks Stefano for not getting (too!) pissed off with me. :-) We still
>>> don't have the clear API guidelines you asked for last time, but I
>>> think that these discussions are clarifying things for me at least,
>>> and we could probably start writing something up.
>>>
>>> I was thinking that your idea is similar to the rationale behind
>>> OBStereoFacade. Well, simply put, we could have an OBResidueFacade
>>> class which you initialise with a molecule and behind the scenes it
>>> then stores the Atom->PDBAtomId correspondence in a map or vector by
>>> iterating over the residues. Then you would have the same method you
>>> described, but it wouldn't do any iteration, but just look up the Id
>>> in the std::map or vector. So, it's a convenience class, separate from
>>> the core API, and it's efficient. If no-one disagrees and this makes
>>> sense to you, do you want to have a go at writing it?
>>>
>>> - Noel
>>>
>>> On 25 February 2017 at 23:03, Geoffrey Hutchison
>>>  wrote:
>>>>>
>>>>> About the PDB atom name, unfortunately I don't fully understand the
>>>>> performance issue implied in my suggestion, but from an interface
>>>>> point of view, it seems more intuitive to access an atom property from
>>>>> OBAtom instead of going back to the OBResidue (and pass the OBAtom).
>>>>
>>>>
>>>> The OBAtom should already have a pointer to the OBResidue. If it can be
>>>> done with generic data in the OBAtom, that's fine, but I don't think I
>>>> want to add data to each OBAtom. If I translate a nanotube or
>>>> nanoparticle (or any other non-biomolecule) I'd have to store in memory a
>>>> bunch of residue pointers that would never get used.
>>>>
>>>>> In this case, the best approach I can think of is to write a
>>>>> OBHBondTyper class to perceive the hbond character (similarly to what
>>>>> OBAromaticTyper does), then each atom should have a simple IsHBond()
>>>>> method that would return 0, 1 (donor), 2 (acceptor), 3(donor/acceptor),
>>>>> 4(...?).
>>>>
>>>>
>>>> Yes, this is the way to go to add "convenience functions." Noel's
>>>> suggestion is to keep the core API restrained, but that doesn't mean
>>>> there can't be convenience classes or static

Re: [OpenBabel-Devel] Convenience functions

2017-03-02 Thread Noel O'Boyle
Typo: "But the guidelines are not unfortunately the existing code. "

On 2 March 2017 at 10:22, Noel O'Boyle  wrote:
> Hi Stefano,
>
> Sounds good. But the guidelines are not unfortunately the existing
> guide. I'm currently in the process of rewriting/removing as much of
> the cruft as possible and the Lazy Evaluation mechanism is in my
> sights. It's a legacy from the original codebase. It'll be difficult
> to change this now, but at least we can avoid adding anything. I won't
> go too much into why this is, but suffice to say that OB developers
> spend some time working around the lazy evaluation or if they don't
> they trigger it multiple times unnecessarily.
>
> In short, see if you can write a function that just takes an atom (or
> pair of atoms, or whatever) and returns an answer, e.g.
> OBAtomGetHBondType(OBAtom*). The simpler solution really is the best
> one here.
>
> - Noel
>
> On 1 March 2017 at 23:17, Stefano Forli  wrote:
>> Noel,
>> quite the contrary, I'm far from being pissed at you, by all means.
>> I like your suggestion, but I don't know if I can do it right away, there
>> are still a few things about the facade programming paradigm that escape my
>> hobbyist programming training.
>>
>> Following up on the discussion about the hydrogen bond, I had a quick chat
>> with one of my students which is starting to write code based on OpenBabel.
>> We took a shot at designing an OBHBondTyper class which should behave
>> similarly to OBAromaticTyper, and my idea was to store the information in a
>> vector.
>>
>> If I'm not mistaken, though, the aromatic typer works in a lazy way that
>> looks similar to what you're describing for the OBResidueFacade, storing the
>> information in a flag instead of a vector, is that correct? Or is the flag
>> is just in vector? I tried looking for the definition of HasFlag(), but I
>> couldn't find it.
>>
>> Either way, I was thinking to start by writing this HB class (which I
>> probably understand better), try implementing the ob-standard lazy
>> evaluation mechanism, and integrate it the [Begin|End]Modify() process.
>>
>> We can do a git pull and then fix and adapt it according to the feedback
>> other developers suggestions.
>>
>> This would be a great chance for us to understand how to contribute code
>> that would integrate better and match the guidelines you guys follow.
>>
>> Thanks (for the patience),
>>
>> S
>>
>>
>>
>>
>>
>> On 02/26/2017 02:05 AM, Noel O'Boyle wrote:
>>>
>>> Thanks Stefano for not getting (too!) pissed off with me. :-) We still
>>> don't have the clear API guidelines you asked for last time, but I
>>> think that these discussions are clarifying things for me at least,
>>> and we could probably start writing something up.
>>>
>>> I was thinking that your idea is similar to the rationale behind
>>> OBStereoFacade. Well, simply put, we could have an OBResidueFacade
>>> class which you initialise with a molecule and behind the scenes it
>>> then stores the Atom->PDBAtomId correspondence in a map or vector by
>>> iterating over the residues. Then you would have the same method you
>>> described, but it wouldn't do any iteration, but just look up the Id
>>> in the std::map or vector. So, it's a convenience class, separate from
>>> the core API, and it's efficient. If no-one disagrees and this makes
>>> sense to you, do you want to have a go at writing it?
>>>
>>> - Noel
>>>
>>> On 25 February 2017 at 23:03, Geoffrey Hutchison
>>>  wrote:
>>>>>
>>>>> About the PDB atom name, unfortunately I don't fully understand the
>>>>> performance issue implied in my suggestion, but from an interface
>>>>> point of view, it seems more intuitive to access an atom property from
>>>>> OBAtom instead of going back to the OBResidue (and pass the OBAtom).
>>>>
>>>>
>>>> The OBAtom should already have a pointer to the OBResidue. If it can be
>>>> done with generic data in the OBAtom, that's fine, but I don't think I
>>>> want to add data to each OBAtom. If I translate a nanotube or
>>>> nanoparticle (or any other non-biomolecule) I'd have to store in memory a
>>>> bunch of residue pointers that would never get used.
>>>>
>>>>> In this case, the best approach I can think of is to write a
>>>>> OBHBondTyper class to perceive the hbond character (similarly to what
>>>>> OBAromaticTyper does), then each atom should have a simple IsHBond()
>>>>> method that would return 0, 1 (donor), 2 (acceptor), 3(donor/acceptor),
>>>>> 4(...?).
>>>>
>>>>
>>>> Yes, this is the way to go to add "convenience functions." Noel's
>>>> suggestion is to keep the core API restrained, but that doesn't mean
>>>> there can't be convenience classes or static methods. I think HBondTyper
>>>> would be a good example, since there can (and should) be multiple models
>>>> of "what is a hydrogen bond."
>>>>
>>>>> At the same time, I support Maciek's idea: useful functions that can
>>>>> streamline the use of scripting

Re: [OpenBabel-Devel] Convenience functions

2017-03-02 Thread Noel O'Boyle
Hi Stefano,

Sounds good. But the guidelines are not unfortunately the existing
guide. I'm currently in the process of rewriting/removing as much of
the cruft as possible and the Lazy Evaluation mechanism is in my
sights. It's a legacy from the original codebase. It'll be difficult
to change this now, but at least we can avoid adding anything. I won't
go too much into why this is, but suffice to say that OB developers
spend some time working around the lazy evaluation or if they don't
they trigger it multiple times unnecessarily.

In short, see if you can write a function that just takes an atom (or
pair of atoms, or whatever) and returns an answer, e.g.
OBAtomGetHBondType(OBAtom*). The simpler solution really is the best
one here.
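
[Editor's sketch] The shape being asked for here is a plain function from atom to answer, with nothing cached on the atom or molecule. The Atom struct below is a stand-in with assumed fields, not OBAtom, and the rule inside GetHBondType is a deliberately crude placeholder rather than a chemical model; only the stateless shape is the point.

```cpp
// Stand-in for an atom; these fields are assumptions, not OBAtom members.
struct Atom {
  unsigned atomicNum;
  unsigned attachedHydrogens;
};

// Stateless: the answer is recomputed from the atom on every call, so there
// is no stored flag that can go stale when the molecule is modified and no
// perception step that might be triggered more than once.
// Return values follow the thread: 0 none, 1 acceptor, 3 acceptor/donor.
int GetHBondType(const Atom& atom) {
  switch (atom.atomicNum) {
    case 7:  // nitrogen
    case 8:  // oxygen
      return atom.attachedHydrogens > 0 ? 3 : 1;  // placeholder rule only
    default:
      return 0;
  }
}
```

A caller who wants the values for a whole molecule can loop once and keep the results in their own vector, which leaves any caching outside the library.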

- Noel

On 1 March 2017 at 23:17, Stefano Forli  wrote:
> Noel,
> quite the contrary, I'm far from being pissed at you, by all means.
> I like your suggestion, but I don't know if I can do it right away, there
> are still a few things about the facade programming paradigm that escape my
> hobbyist programming training.
>
> Following up on the discussion about the hydrogen bond, I had a quick chat
> with one of my students which is starting to write code based on OpenBabel.
> We took a shot at designing an OBHBondTyper class which should behave
> similarly to OBAromaticTyper, and my idea was to store the information in a
> vector.
>
> If I'm not mistaken, though, the aromatic typer works in a lazy way that
> looks similar to what you're describing for the OBResidueFacade, storing the
> information in a flag instead of a vector, is that correct? Or is the flag
> is just in vector? I tried looking for the definition of HasFlag(), but I
> couldn't find it.
>
> Either way, I was thinking to start by writing this HB class (which I
> probably understand better), try implementing the ob-standard lazy
> evaluation mechanism, and integrate it the [Begin|End]Modify() process.
>
> We can do a git pull and then fix and adapt it according to the feedback
> other developers suggestions.
>
> This would be a great chance for us to understand how to contribute code
> that would integrate better and match the guidelines you guys follow.
>
> Thanks (for the patience),
>
> S
>
>
>
>
>
> On 02/26/2017 02:05 AM, Noel O'Boyle wrote:
>>
>> Thanks Stefano for not getting (too!) pissed off with me. :-) We still
>> don't have the clear API guidelines you asked for last time, but I
>> think that these discussions are clarifying things for me at least,
>> and we could probably start writing something up.
>>
>> I was thinking that your idea is similar to the rationale behind
>> OBStereoFacade. Well, simply put, we could have an OBResidueFacade
>> class which you initialise with a molecule and behind the scenes it
>> then stores the Atom->PDBAtomId correspondence in a map or vector by
>> iterating over the residues. Then you would have the same method you
>> described, but it wouldn't do any iteration, but just look up the Id
>> in the std::map or vector. So, it's a convenience class, separate from
>> the core API, and it's efficient. If no-one disagrees and this makes
>> sense to you, do you want to have a go at writing it?
>>
>> - Noel
>>
>> On 25 February 2017 at 23:03, Geoffrey Hutchison
>>  wrote:

>>>> About the PDB atom name, unfortunately I don't fully understand the
>>>> performance issue implied in my suggestion, but from an interface point of
>>>> view, it seems more intuitive to access an atom property from OBAtom
>>>> instead of going back to the OBResidue (and pass the OBAtom).
>>>
>>>
>>> The OBAtom should already have a pointer to the OBResidue. If it can be
>>> done with generic data in the OBAtom, that's fine, but I don't think I want
>>> to add data to each OBAtom. If I translate a nanotube or nanoparticle (or
>>> any other non-biomolecule) I'd have to store in memory a bunch of residue
>>> pointers that would never get used.
>>>
>>>> In this case, the best approach I can think of is to write an
>>>> OBHBondTyper class to perceive the hbond character (similarly to what
>>>> OBAromaticTyper does), then each atom should have a simple IsHBond() method
>>>> that would return 0, 1 (donor), 2 (acceptor), 3 (donor/acceptor), 4 (...?).
>>>
>>>
>>> Yes, this is the way to go to add "convenience functions." Noel's
>>> suggestion is to keep the core API restrained, but that doesn't mean there
>>> can't be convenience classes or static methods. I think HBondTyper would be
>>> a good example, since there can (and should) be multiple models of "what is
>>> a hydrogen bond."
>>>
>>>> At the same time, I support Maciek's idea: useful functions that can
>>>> streamline the use of scripting languages such as Python should be grouped
>>>> and conserved.
>>>
>>>
>>> I'm not sure why these can't be put into wrappers like Pybel. As is,
>>> there is considerable "helper code" in Pybel, etc.
>>>
>>> -Geoff
>
>
> --
>
>  Stefano Forli, PhD
>
>  Assistant 

Re: [OpenBabel-Devel] Convenience functions

2017-03-01 Thread Stefano Forli
Noel,
quite the contrary, I'm far from being pissed at you, by all means.
I like your suggestion, but I don't know if I can do it right away; there are
still a few things about the facade programming paradigm that escape my
hobbyist programming training.

Following up on the discussion about the hydrogen bond, I had a quick chat with
one of my students who is starting to write code based on OpenBabel.
We took a shot at designing an OBHBondTyper class which should behave similarly
to OBAromaticTyper, and my idea was to store the information in a vector.

If I'm not mistaken, though, the aromatic typer works in a lazy way that looks
similar to what you're describing for the OBResidueFacade, storing the
information in a flag instead of a vector. Is that correct? Or is the flag
just stored in a vector? I tried looking for the definition of HasFlag(), but
I couldn't find it.

Either way, I was thinking to start by writing this HB class (which I probably
understand better), try implementing the ob-standard lazy evaluation mechanism,
and integrate it into the [Begin|End]Modify() process.
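
As a rough illustration of the lazy-evaluation idea mentioned above, here is a minimal sketch. It does not describe Open Babel's internals: ToyMol, SetAtomicNum(), IsAcceptor() and PerceptionRuns() are hypothetical names, and the acceptor rule is a placeholder. The point is only the pattern: a flag records whether perception is up to date, a query runs the typer on demand, and any modification clears the flag.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy molecule, NOT the real OBMol: only what the lazy pattern needs.
class ToyMol {
public:
    explicit ToyMol(std::vector<int> atomicNums)
        : atomicNums_(std::move(atomicNums)) {}

    // Any edit invalidates the cached perception
    // (the role a modify/end-modify step could play).
    void SetAtomicNum(std::size_t i, int z) {
        atomicNums_.at(i) = z;
        hbondPerceived_ = false;
    }

    // Query: the typer runs only if the flag says the cache is stale.
    bool IsAcceptor(std::size_t i) {
        if (!hbondPerceived_) PerceiveHBond();
        return acceptor_.at(i);
    }

    // How many times perception actually ran (for demonstration only).
    int PerceptionRuns() const { return runs_; }

private:
    void PerceiveHBond() {
        acceptor_.assign(atomicNums_.size(), false);
        for (std::size_t i = 0; i < atomicNums_.size(); ++i) {
            // Placeholder rule: treat N and O as acceptors.
            acceptor_[i] = (atomicNums_[i] == 7 || atomicNums_[i] == 8);
        }
        hbondPerceived_ = true;
        ++runs_;
    }

    std::vector<int> atomicNums_;
    std::vector<bool> acceptor_;   // per-atom result, filled lazily
    bool hbondPerceived_ = false;  // the "flag"
    int runs_ = 0;
};
```

In this sketch the per-atom results live in a vector while a single flag says whether that vector can be trusted, so the two storage ideas are not mutually exclusive.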

We can do a git pull and then fix and adapt it according to other developers'
feedback and suggestions.

This would be a great chance for us to understand how to contribute code that
would integrate better and match the guidelines you guys follow.

Thanks (for the patience),

S




On 02/26/2017 02:05 AM, Noel O'Boyle wrote:
> Thanks Stefano for not getting (too!) pissed off with me. :-) We still
> don't have the clear API guidelines you asked for last time, but I
> think that these discussions are clarifying things for me at least,
> and we could probably start writing something up.
>
> I was thinking that your idea is similar to the rationale behind
> OBStereoFacade. Well, simply put, we could have an OBResidueFacade
> class which you initialise with a molecule and behind the scenes it
> then stores the Atom->PDBAtomId correspondence in a map or vector by
> iterating over the residues. Then you would have the same method you
> described, but it wouldn't do any iteration, but just look up the Id
> in the std::map or vector. So, it's a convenience class, separate from
> the core API, and it's efficient. If no-one disagrees and this makes
> sense to you, do you want to have a go at writing it?
>
> - Noel
>
> On 25 February 2017 at 23:03, Geoffrey Hutchison
>  wrote:
>>> About the PDB atom name, unfortunately I don't fully understand the 
>>> performance issue implied in my suggestion, but from an interface point of 
>>> view, it seems more intuitive to access an atom property from OBAtom 
>>> instead of going back to the OBResidue (and pass the OBAtom).
>>
>> The OBAtom should already have a pointer to the OBResidue. If it can be done 
>> with generic data in the OBAtom, that's fine, but I don't think I want to 
>> add data to each OBAtom. If I translate a nanotube or nanoparticle (or any 
>> other non-biomolecule) I'd have to store in memory a bunch of residue 
>> pointers that would never get used.
>>
>>> In this case, the best approach I can think of is to write an OBHBondTyper 
>>> class to perceive the hbond character (similarly to what OBAromaticTyper 
>>> does), then each atom should have a simple IsHBond() method that would 
>>> return 0, 1 (donor), 2 (acceptor), 3(donor/acceptor), 4(...?).
>>
>> Yes, this is the way to go to add "convenience functions." Noel's suggestion 
>> is to keep the core API restrained, but that doesn't mean there can't be 
>> convenience classes or static methods. I think HBondTyper would be a good 
>> example, since there can (and should) be multiple models of "what is a 
>> hydrogen bond."
>>
>>> At the same time, I support Maciek's idea: useful functions that can 
>>> streamline the use of scripting languages such as Python should be grouped 
>>> and conserved.
>>
>> I'm not sure why these can't be put into wrappers like Pybel. As is, there 
>> is considerable "helper code" in Pybel, etc.
>>
>> -Geoff

-- 

  Stefano Forli, PhD

  Assistant Professor of ISCB
  Molecular Graphics Laboratory

  Dept. of Integrative Structural
  and Computational Biology, MB-112A
  The Scripps Research Institute
  10550  North Torrey Pines Road
  La Jolla,  CA 92037-1000,  USA.

 tel: +1 (858)784-2055
 fax: +1 (858)784-2860
 email: fo...@scripps.edu
 http://www.scripps.edu/~forli/

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel


Re: [OpenBabel-Devel] Convenience functions

2017-02-25 Thread Geoffrey Hutchison
> About the PDB atom name, unfortunately I don't fully understand the 
> performance issue implied in my suggestion, but from an interface point of 
> view, it seems more intuitive to access an atom property from OBAtom instead 
> of going back to the OBResidue (and pass the OBAtom).

The OBAtom should already have a pointer to the OBResidue. If it can be done 
with generic data in the OBAtom, that's fine, but I don't think I want to add 
data to each OBAtom. If I translate a nanotube or nanoparticle (or any other 
non-biomolecule) I'd have to store in memory a bunch of residue pointers that 
would never get used.

> In this case, the best approach I can think of is to write an OBHBondTyper 
> class to perceive the hbond character (similarly to what OBAromaticTyper 
> does), then each atom should have a simple IsHBond() method that would return 
> 0, 1 (donor), 2 (acceptor), 3(donor/acceptor), 4(...?).

Yes, this is the way to go to add "convenience functions." Noel's suggestion is 
to keep the core API restrained, but that doesn't mean there can't be 
convenience classes or static methods. I think HBondTyper would be a good 
example, since there can (and should) be multiple models of "what is a hydrogen 
bond."
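
One way to picture "multiple models" behind a single convenience class is a small interface with interchangeable implementations. The sketch below is only an assumption about how that could look: ToyAtom, HBondModel, StrictModel, PermissiveModel and CountAcceptors are hypothetical names, not Open Babel API, and the chemistry rules are toy rules chosen to make the two models differ.

```cpp
#include <cassert>
#include <vector>

// Minimal atom description used only for this sketch.
struct ToyAtom {
    int atomicNum;
    int attachedHydrogens;
};

// Hypothetical interface: each model of "what is a hydrogen bond"
// is one subclass.
class HBondModel {
public:
    virtual ~HBondModel() = default;
    virtual bool IsDonor(const ToyAtom& a) const = 0;
    virtual bool IsAcceptor(const ToyAtom& a) const = 0;
};

// Toy model 1: only N and O take part.
class StrictModel : public HBondModel {
public:
    bool IsDonor(const ToyAtom& a) const override {
        return IsAcceptor(a) && a.attachedHydrogens > 0;
    }
    bool IsAcceptor(const ToyAtom& a) const override {
        return a.atomicNum == 7 || a.atomicNum == 8;
    }
};

// Toy model 2: same donors, but fluorine also counts as an acceptor.
class PermissiveModel : public StrictModel {
public:
    bool IsAcceptor(const ToyAtom& a) const override {
        return a.atomicNum == 9 || StrictModel::IsAcceptor(a);
    }
};

// Client code depends only on the interface, so models can be swapped.
int CountAcceptors(const HBondModel& model, const std::vector<ToyAtom>& atoms) {
    int n = 0;
    for (const ToyAtom& a : atoms) {
        if (model.IsAcceptor(a)) ++n;
    }
    return n;
}
```

Because the typing logic sits outside the atom class, adding a third model would not touch the core API at all.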

> At the same time, I support Maciek's idea: useful functions that can 
> streamline the use of scripting languages such as Python should be grouped 
> and conserved.

I'm not sure why these can't be put into wrappers like Pybel. As is, there is 
considerable "helper code" in Pybel, etc.

-Geoff


Re: [OpenBabel-Devel] Convenience functions

2017-02-25 Thread Stefano Forli
Hi Noel,

I'm in, too.
Spotting these convoluted approaches may be a great opportunity to identify 
places where the API should be streamlined.
About the PDB atom name, unfortunately I don't fully understand the performance 
issue implied in my suggestion, but from an interface point of view, it seems 
more intuitive to access an atom property from OBAtom instead of going back to 
the OBResidue (and pass the OBAtom).
That would address the issue of the performance as well, right?

To use another example where my convoluted approach raised the issue [a good
thing, I guess, one way or another], functions like OBAtom::IsHBAcceptor() and
OBAtom::IsHBDonor() are big and convoluted convenience functions, because to do
the proper classification, there are several cases that need to be tested.
In this case, the best approach I can think of is to write an OBHBondTyper class
to perceive the hbond character (similarly to what OBAromaticTyper does), then
each atom should have a simple IsHBond() method that would return 0, 1 (donor),
2 (acceptor), 3 (donor/acceptor), 4 (...?).
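
The 0/1/2/3 return values proposed above happen to work as bit flags, since 3 is 1 | 2. A minimal sketch of that encoding follows; the enum, the helper names and the toy classifier are all hypothetical, and the classification rule is deliberately oversimplified (the real cases are exactly what makes the existing functions big).

```cpp
#include <cassert>

// Hypothetical encoding mirroring the proposed 0/1/2/3 values.
enum HBondRole : unsigned {
    HB_NONE     = 0,
    HB_DONOR    = 1,
    HB_ACCEPTOR = 2,
    HB_BOTH     = HB_DONOR | HB_ACCEPTOR  // == 3
};

// Callers test individual bits instead of comparing against every value.
inline bool IsDonor(unsigned role)    { return (role & HB_DONOR) != 0; }
inline bool IsAcceptor(unsigned role) { return (role & HB_ACCEPTOR) != 0; }

// Toy classifier, used only to exercise the encoding: N and O accept,
// and they also donate if they carry at least one hydrogen.
inline unsigned ClassifyToy(int atomicNum, int attachedHydrogens) {
    unsigned role = HB_NONE;
    if (atomicNum == 7 || atomicNum == 8) {
        role |= HB_ACCEPTOR;
        if (attachedHydrogens > 0) role |= HB_DONOR;
    }
    return role;
}
```

With this layout a further category (the open "4" above) would simply take the next free bit.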

At the same time, I support Maciek's idea: useful functions that can streamline 
the use of scripting languages such as Python should be grouped and conserved.

S


--

 Stefano Forli, PhD

 Assistant Professor of ISCB
 Molecular Graphics Laboratory

 Dept. of Integrative Structural
 and Computational Biology, MB-112A
 The Scripps Research Institute
 10550  North Torrey Pines Road
 La Jolla,  CA 92037-1000,  USA.

tel: +1 (858)784-2055
fax: +1 (858)784-2860
email: fo...@scripps.edu
http://www.scripps.edu/~forli/


From: Maciek Wójcikowski [mac...@wojcikowski.pl]
Sent: Saturday, February 25, 2017 3:21 AM
To: Noel O'Boyle
Cc: Openbabel-DEV
Subject: Re: [OpenBabel-Devel] Convenience functions

Hi Noel,

I agree with you and must say that in 99.99% of cases it's the right way to go. 
Although I still believe the convenience functions, especially the ones that 
loop over stuff, are extremely useful when called from Python - they are just 
so much faster. I'd be happy with some external place where we could keep them.

+1 from me.


Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl

2017-02-25 10:33 GMT+01:00 Noel O'Boyle <baoille...@gmail.com>:
Hi there,

As some of you know, I would like to remove all convenience functions
from classes in Open Babel. I would like to explain why.

It's hard to exactly define a convenience function, but it's an
addition to the API that is implemented entirely using existing API
calls and that makes it easier to do certain things. There may be a
place for these in the documentation (e.g. usage examples as OEChem
does) or in a convenience.cpp, but I argue that they should never be
part of any Open Babel classes, and in particular, should never be
used internally by the API.

One reason is that it clogs up the core API, which affects both
compilation speed and reading the API looking for functions. It's also
API duplication - there should be one way to do something. Also, there
are an infinite number of convenience functions and once you let one in
(e.g. OBAtom::IsCarbon()), you must then let in 100+ other ones.

A more subtle reason is that they obscure the correct usage of the
underlying core API, as users cannot know they are convenience
functions. Sounds a bit vague? Here are two examples.

Let's take the duplicated std::string versus const char* functions I
mentioned in an earlier email. The std::string version just calls the
const char* function. Some of these convenience functions do an
unnecessary string copy. On top of this, a user of the toolkit, having
a char* to hand but needing to choose between the methods, may
construct a std::string (another copy) to call the std::string method
because Stack Overflow tells us to always use C++ STL objects. So,
adding the convenience functions had the unintended consequence of two
string copies.

Another example is the convenience function that Stefano has proposed
that loops over the Residues to return the PDBAtomId for an OBAtom. I
don't disagree this is useful, but it's still a convenience function.
Now imagine a user who writes a loop over all the atoms in a molecule
calling this function. They end up using an N squared function, which
is going to be a fairly significant slowdown for PDB files. But there
was no way for them to know that this was not the correct method to
call.

The current API has many such convenience functions. Let's make a
bonfire (e.g. called convenience.cpp) and burn them all.

- Noel


[OpenBabel-Devel] Convenience functions

2017-02-25 Thread Noel O'Boyle
Hi there,

As some of you know, I would like to remove all convenience functions
from classes in Open Babel. I would like to explain why.

It's hard to exactly define a convenience function, but it's an
addition to the API that is implemented entirely using existing API
calls and that makes it easier to do certain things. There may be a
place for these in the documentation (e.g. usage examples as OEChem
does) or in a convenience.cpp, but I argue that they should never be
part of any Open Babel classes, and in particular, should never be
used internally by the API.

One reason is that it clogs up the core API, which affects both
compilation speed and reading the API looking for functions. It's also
API duplication - there should be one way to do something. Also, there
are an infinite number of convenience functions and once you let one in
(e.g. OBAtom::IsCarbon()), you must then let in 100+ other ones.

A more subtle reason is that they obscure the correct usage of the
underlying core API, as users cannot know they are convenience
functions. Sounds a bit vague? Here are two examples.

Let's take the duplicated std::string versus const char* functions I
mentioned in an earlier email. The std::string version just calls the
const char* function. Some of these convenience functions do an
unnecessary string copy. On top of this, a user of the toolkit, having
a char* to hand but needing to choose between the methods, may
construct a std::string (another copy) to call the std::string method
because Stack Overflow tells us to always use C++ STL objects. So,
adding the convenience functions had the unintended consequence of two
string copies.
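
The double copy described above can be reproduced in a small sketch. Everything here is hypothetical: TrackedString is a stand-in for std::string that counts how often the character data is copied, and TitleLength and UserCode are invented names, not Open Babel functions. The convenience overload is assumed to take its argument by value, which is one way such an unnecessary copy can arise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for std::string that counts copies of the character data.
struct TrackedString {
    static int copies;
    std::string data;
    TrackedString(const char* s) : data(s) { ++copies; }               // copies the buffer
    TrackedString(const TrackedString& o) : data(o.data) { ++copies; } // copies it again
    const char* c_str() const { return data.c_str(); }
};
int TrackedString::copies = 0;

// Hypothetical core function: reads the caller's buffer directly.
std::size_t TitleLength(const char* title) {
    return std::strlen(title);
}

// Hypothetical convenience overload: takes the string by value,
// then forwards to the const char* version.
std::size_t TitleLength(TrackedString title) {
    return TitleLength(title.c_str());
}

// What a user holding a char* might write when told to prefer strings.
std::size_t UserCode(const char* raw) {
    assert(raw != nullptr);
    TrackedString wrapped(raw);   // copy 1: made by the user
    return TitleLength(wrapped);  // copy 2: made by the by-value parameter
}
```

Calling TitleLength(raw) directly in UserCode would give the same answer with no copies at all, but nothing in the signatures tells the user that.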

Another example is the convenience function that Stefano has proposed
that loops over the Residues to return the PDBAtomId for an OBAtom. I
don't disagree this is useful, but it's still a convenience function.
Now imagine a user who writes a loop over all the atoms in a molecule
calling this function. They end up using an N squared function, which
is going to be a fairly significant slowdown for PDB files. But there
was no way for them to know that this was not the correct method to
call.
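
To put numbers on the N squared concern, here is a minimal sketch using simplified stand-ins rather than the real Open Babel classes. Residue, Molecule, SlowAtomName, ResidueFacade, MakeChain and StepsForAllAtoms are all hypothetical names. The slow lookup scans the residues on every call; the facade-style class does one pass up front and then answers from a hash map.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-ins, NOT the real Open Babel classes.
struct Residue {
    std::vector<int> atomIdx;           // indices of the atoms in this residue
    std::vector<std::string> atomName;  // PDB atom names, parallel to atomIdx
};
struct Molecule {
    std::vector<Residue> residues;
};

// Convenience-style lookup: scans the residues on every call.
// `steps` accumulates how many entries were inspected.
std::string SlowAtomName(const Molecule& mol, int idx, std::size_t* steps) {
    for (const Residue& r : mol.residues) {
        for (std::size_t i = 0; i < r.atomIdx.size(); ++i) {
            ++*steps;
            if (r.atomIdx[i] == idx) return r.atomName[i];
        }
    }
    return std::string();
}

// Facade-style lookup: one pass in the constructor, then a map lookup.
class ResidueFacade {
public:
    explicit ResidueFacade(const Molecule& mol) {
        for (const Residue& r : mol.residues)
            for (std::size_t i = 0; i < r.atomIdx.size(); ++i)
                names_[r.atomIdx[i]] = r.atomName[i];
    }
    std::string AtomName(int idx) const {
        auto it = names_.find(idx);
        return it == names_.end() ? std::string() : it->second;
    }
private:
    std::unordered_map<int, std::string> names_;
};

// Builds a toy molecule with n single-atom residues named A0, A1, ...
Molecule MakeChain(int n) {
    Molecule mol;
    for (int i = 0; i < n; ++i) {
        Residue r;
        r.atomIdx.push_back(i);
        r.atomName.push_back("A" + std::to_string(i));
        mol.residues.push_back(r);
    }
    return mol;
}

// Names every atom with the slow lookup; returns the total scan steps.
std::size_t StepsForAllAtoms(const Molecule& mol, int numAtoms) {
    std::size_t steps = 0;
    for (int idx = 0; idx < numAtoms; ++idx) SlowAtomName(mol, idx, &steps);
    return steps;
}
```

For a 100-atom chain the slow loop inspects 1 + 2 + ... + 100 = 5050 entries, while the facade inspects each entry once when it is built.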

The current API has many such convenience functions. Let's make a
bonfire (e.g. called convenience.cpp) and burn them all.

- Noel
