Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-14 Thread Alexandre Bergel via Pharo-users
--- Begin Message ---
Thanks for your detailed answers

Cheers,
Alexandre

> On Jul 12, 2019, at 8:49 AM, Richard O'Keefe  wrote:
> 
> Search trees (which come in many varieties) are good for collections in which
> adding, deleting, and lookup are interleaved, where there is an order on the
> elements (if used as a set) or the keys (if used as a dictionary).  So are
> hash tables.  But (balanced) search trees have guaranteed O(lg n) worst case
> time, which hash tables do not.  And search trees let you enumerate their
> contents in ascending or descending order, at any time.  And search trees can
> support queries like
>  - tell me all the elements {<, <=, >, >=} x
>  - tell me all the elements between L and U
>  - tell me the next element (before,after) x
> 
> My Smalltalk library includes Sorted{Set,Bag,Dictionary} implemented as splay 
> trees.
> Some time I must try an alternative.
> in time O(lg n + number of answers).
> 
> On Thu, 11 Jul 2019 at 04:53, Alexandre Bergel via Pharo-users 
> mailto:pharo-users@lists.pharo.org>> wrote:
> Hello Smiljana,
> 
> Thanks for having written down this document. I am not expert in algorithm, 
> so I would consider myself a simple user. I have developed complex software 
> for some times and I have never seen the need of having a binary search tree. 
> I guess this is probably partly because of my lack of expertise in binary 
> search tree and partly because experts in binary search trees assume that 
> people know what it is about and in what it is useful.
> 
> My question is, when should a programmer ever need to use binary search tree? 
> Can you add some examples on what these trees are good for, and how an 
> average programmer should look into it. I think this will be a valuable and 
> easy way to expand your blog.
> 
> Cheers,
> Alexandre
> 
>> On Jul 7, 2019, at 5:12 PM, Smiljana Knezev > > wrote:
>> 
>> Dear all,
>> 
>> I've written about implementing binary search trees: 
>> https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html
>>  
>> 
>> 
>> Feedback and opinions is always welcome :)
>> 
>> Best regards,
>> Smiljana Knezev
> 

--- End Message ---


Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-12 Thread Richard O'Keefe
Search trees (which come in many varieties) are good for collections in
which
adding, deleting, and lookup are interleaved, where there is an order on the
elements (if used as a set) or the keys (if used as a dictionary).  So are
hash tables.  But (balanced) search trees have guaranteed O(lg n) worst case
time, which hash tables do not.  And search trees let you enumerate their
contents in ascending or descending order, at any time.  And search trees
can
support queries like
 - tell me all the elements {<, <=, >, >=} x
 - tell me all the elements between L and U
 - tell me the next element (before,after) x

My Smalltalk library includes Sorted{Set,Bag,Dictionary} implemented as
splay trees.
Some time I must try an alternative.
in time O(lg n + number of answers).

On Thu, 11 Jul 2019 at 04:53, Alexandre Bergel via Pharo-users <
pharo-users@lists.pharo.org> wrote:

> Hello Smiljana,
>
> Thanks for having written down this document. I am not expert in
> algorithm, so I would consider myself a simple user. I have developed
> complex software for some times and I have never seen the need of having a
> binary search tree. I guess this is probably partly because of my lack of
> expertise in binary search tree and partly because experts in binary search
> trees assume that people know what it is about and in what it is useful.
>
> My question is, when should a programmer ever need to use binary search
> tree? Can you add some examples on what these trees are good for, and how
> an average programmer should look into it. I think this will be a valuable
> and easy way to expand your blog.
>
> Cheers,
> Alexandre
>
> On Jul 7, 2019, at 5:12 PM, Smiljana Knezev 
> wrote:
>
> Dear all,
>
> I've written about implementing binary search trees:
> https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html
>
> Feedback and opinions is always welcome :)
>
> Best regards,
> Smiljana Knezev
>
>
>


Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-10 Thread Ben Coman
On Thu, 11 Jul 2019 at 00:53, Alexandre Bergel via Pharo-users
 wrote:
>
> Hello Smiljana,
>
> Thanks for having written down this document. I am not expert in algorithm, 
> so I would consider myself a simple user. I have developed complex software 
> for some times and I have never seen the need of having a binary search tree. 
> I guess this is probably partly because of my lack of expertise in binary 
> search tree and partly because experts in binary search trees assume that 
> people know what it is about and in what it is useful.
>
> My question is, when should a programmer ever need to use binary search tree? 
> Can you add some examples on what these trees are good for, and how an 
> average programmer should look into it. I think this will be a valuable and 
> easy way to expand your blog.

My "computer science" is a bit rusty but Binary Search Trees and their
generalisation to B-Trees is Databases 101.
Almost guaranteed you are "using" them with any database you use,
you're just not directly exposed to them.

If you skim "Redesigning String Data Structures to Exploit Cache"
(https://people.eng.unimelb.edu.au/jzobel/fulltext/acmjea10.pdf)
looking at the performance graphs, for "unordered searching" the best
bet for many applications is probably our hash-based Dictionary if
that covers all your needs.
BSTs are more memory efficient than Dictionarys, but less cache-friendly.

A few applications where BSTs can be useful:
* doing range searches efficiently [1]
* traversing elements in order, which is more difficult with hash tables [1] [2]
* good for "order" statistics [2] like:
- inexact search - finding closest lower and greater elements.
   - determining maximum & minimum elements
   - determining successor & predecessor elements
   - range queries - easy to do with BSTs and not a natural operation
with Hash Tables.
* Self-Balancing BSTs guarantee operations work in O(Logn) time.
Hashing has Θ(1) is average time but some particular operations are
costly, especially when table resizing happen [2]
   but also they are slower so if the data is sufficiently random
simple BSTs may do.

[1] 
https://stackoverflow.com/questions/371136/binary-trees-vs-linked-lists-vs-hash-tables
[2] https://www.geeksforgeeks.org/advantages-of-bst-over-hash-table/

This side-related article...
   "Number crunching: Why you should never, ever, EVER use linked-list
in your code again"
   
(https://kjellkod.wordpress.com/2012/02/25/why-you-should-never-ever-ever-use-linked-list-in-your-code-again)
indicates that for many In-Memory operations you are better using
sorted-Arrays (e.g. SequenceableCollection>>#findBinary:)
than cache-invalidating pointer-base structures like Lists & Trees -
thus the strong association of BSTs & BTrees with slower access
disk-based databases.
Also notice that additional BST features closely match common database
operations.

cheers -ben



Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-10 Thread Richard Sargent
On Wed, Jul 10, 2019 at 9:53 AM Alexandre Bergel via Pharo-users <
pharo-users@lists.pharo.org> wrote:

> Hello Smiljana,
>
> Thanks for having written down this document. I am not expert in
> algorithm, so I would consider myself a simple user. I have developed
> complex software for some times and I have never seen the need of having a
> binary search tree. I guess this is probably partly because of my lack of
> expertise in binary search tree and partly because experts in binary search
> trees assume that people know what it is about and in what it is useful.
>

Hi Alexandre,

I can tell you that GemStone/S makes extensive use of b-trees and variants
in its internal data structures. Two examples are the object table itself
and the index data structures. I agree that the average programmer may
never have a need for them, but I don't like to close doors on (other)
developers.

Smiljana,

Others have suggested things like the Null Object pattern for the left and
right links, but an alternative or supplement to that would be a space
optimizing leaf node which doesn't have instance variables for left and
right sub-trees. But, don't go down the optimizing paths until you have
good solutions for everything else. i.e. make right, then make it fast.


I noted your code looks "C-like", which I suspect arises from being
inspired by implementation in other languages.

For example, I think only an empty tree should answer 0 for depth. A tree
with one node should answer 1 while a tree with 2 or 3 nodes should answer
2, etc. If you drew a tree out on paper, how many levels would you see?
That is the depth, in my opinion.

Another example arises from one of the most powerful idioms in Smalltalk.
Call it the #at:ifAbsent: pattern. (See also #detect:ifNone:) Someone else
mentioned the passing around of nil and suggested you not do that. The
#at:ifAbsent: pattern is the key to avoiding that: tell the receiver what
*you* want to *do* if the thing can't be found by passing in your own block.


Additionally, you will want your nodes to support various payloads. Right
now the payload is the object held in the key. That works for identity
structures, but it won't work for more advanced needs, such as one would
find in a database index structure. Eventually, you will want the ability
to have nodes that support additional attribute(s). I've found P. J
Plauger's "0, 1, infinity" rule to be a superb design guideline. In short,
once you go beyond having one of something (e.g. the key itself), you need
to design things so that you can have an arbitrarily large number. The step
from 1 to >1 is huge and messy if retrofitted, but easy when planned in.

Eventually, you will need queries against the tree, allowing range
selection, identity versus equality, etc. The tree should provide a basic
infrastructure for searching, but a separate modelling of queries will be
necessary. That may be out of scope for GSOC, of course.


Good work and good luck with the rest of the project!



> My question is, when should a programmer ever need to use binary search
> tree? Can you add some examples on what these trees are good for, and how
> an average programmer should look into it. I think this will be a valuable
> and easy way to expand your blog.
>
> Cheers,
> Alexandre
>
> On Jul 7, 2019, at 5:12 PM, Smiljana Knezev 
> wrote:
>
> Dear all,
>
> I've written about implementing binary search trees:
> https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html
>
> Feedback and opinions is always welcome :)
>
> Best regards,
> Smiljana Knezev
>
>
>


Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-10 Thread Sven Van Caekenberghe
Binary search is extremely useful to quickly find something in a sorted 
collection, like finding if a string or number is included in a large 
collection.

BTW, we already have an implementation, see 
SequenceableCollection>>#findBinary: and friends. 

> On 10 Jul 2019, at 18:53, Alexandre Bergel via Pharo-users 
>  wrote:
> 
> 
> From: Alexandre Bergel 
> Subject: Re: [Pharo-users] [GSoC blog post] Binary Search Trees
> Date: 10 July 2019 at 18:53:01 GMT+2
> To: Any question about pharo is welcome 
> 
> 
> Hello Smiljana,
> 
> Thanks for having written down this document. I am not expert in algorithm, 
> so I would consider myself a simple user. I have developed complex software 
> for some times and I have never seen the need of having a binary search tree. 
> I guess this is probably partly because of my lack of expertise in binary 
> search tree and partly because experts in binary search trees assume that 
> people know what it is about and in what it is useful.
> 
> My question is, when should a programmer ever need to use binary search tree? 
> Can you add some examples on what these trees are good for, and how an 
> average programmer should look into it. I think this will be a valuable and 
> easy way to expand your blog.
> 
> Cheers,
> Alexandre
> 
>> On Jul 7, 2019, at 5:12 PM, Smiljana Knezev  wrote:
>> 
>> Dear all,
>> 
>> I've written about implementing binary search trees: 
>> https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html
>> 
>> Feedback and opinions is always welcome :)
>> 
>> Best regards,
>> Smiljana Knezev
> 
> 
> 




Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-10 Thread Alexandre Bergel via Pharo-users
--- Begin Message ---
Hello Smiljana,

Thanks for having written down this document. I am not expert in algorithm, so 
I would consider myself a simple user. I have developed complex software for 
some times and I have never seen the need of having a binary search tree. I 
guess this is probably partly because of my lack of expertise in binary search 
tree and partly because experts in binary search trees assume that people know 
what it is about and in what it is useful.

My question is, when should a programmer ever need to use binary search tree? 
Can you add some examples on what these trees are good for, and how an average 
programmer should look into it. I think this will be a valuable and easy way to 
expand your blog.

Cheers,
Alexandre

> On Jul 7, 2019, at 5:12 PM, Smiljana Knezev  wrote:
> 
> Dear all,
> 
> I've written about implementing binary search trees: 
> https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html
>  
> 
> 
> Feedback and opinions is always welcome :)
> 
> Best regards,
> Smiljana Knezev

--- End Message ---


Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-10 Thread Smiljana Knezev
Thank you all very much. This is quite useful for me and I can learn a lot.

best regards,
Smiljana Knezev

сре, 10. јул 2019. у 12:29 Ben Coman  је написао/ла:

> I agree #bfs: is not explicit, but  #breadthFirstSearch:  is a well
> know term, why not just that? Or even #breathFirstSearchFor:
>
> In comparison  #findBreadthFirst:  and  #findDeepFirst:  feel a bit
> awkward to me.
> (I think its the repetition rhythm of "f" words)
>
> cheers -ben
>
> On Mon, 8 Jul 2019 at 12:55, ducasse  wrote:
> >
> > bfs: aValue
> >
> > =>
> >
> > findBreadthFirst: aValue
> >
> >
> > dfs: info node: aNode
> >
> > =>
> >
> > findDeepFirst: aValue startingFrom: aNode
>
>


Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-10 Thread Ben Coman
I agree #bfs: is not explicit, but  #breadthFirstSearch:  is a well
know term, why not just that? Or even #breathFirstSearchFor:

In comparison  #findBreadthFirst:  and  #findDeepFirst:  feel a bit
awkward to me.
(I think its the repetition rhythm of "f" words)

cheers -ben

On Mon, 8 Jul 2019 at 12:55, ducasse  wrote:
>
> bfs: aValue
>
> =>
>
> findBreadthFirst: aValue
>
>
> dfs: info node: aNode
>
> =>
>
> findDeepFirst: aValue startingFrom: aNode



Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-10 Thread webwarrior
Smiljana Knezev wrote
> Dear all,
> 
> I've written about implementing binary search trees:
> https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html
> 
> Feedback and opinions is always welcome :)
> 
> Best regards,
> Smiljana Knezev

Your implementation is unnecesarily complicated. 

Remember, tree is a recursive data structure. You may have have reasons to
split tree and node into different classes, but at least delegate operations
to node.

For example, #depth and #size can be defined as following methods of Node
class:

depth
self isLeaf ifTrue: [ ^ 0 ].
^ leftChild depth max: rightChild depth

size
self isLeaf ifTrue: [ ^ 1 ].
^ leftChild size + rightChlid size

Most other operations can also be defined in similar fashion.




--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html



Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-07 Thread ducasse
bfs: aValue

=>

findBreadthFirst: aValue


dfs: info node: aNode

=> 

findDeepFirst: aValue startingFrom: aNode 

Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-07 Thread ducasse


> On 8 Jul 2019, at 05:57, ducasse  wrote:
> 
> More feedback :)
> 
> Do not return nil 
> Check the last lecture of the mooc
> 
> Did you read Pharo with Style?
> 
> I do not know if LinkedList is the one of the system used by Process but in 
> that case use the alternate implementation. 


Sorry I’ still jet lagged and sleepy
But

- bfs: is a bad name. 
- Also why not having a block to specify what we are looking for?



> 
> 
> bfs: info
>   "Breadth fisrt search for an information. If there is no key containing 
> given infomration returns nil. If the info is found returns a node containing 
> it"
>   | queue |
>   self isEmpty 
>   ifTrue: [ ^nil ].
>   queue := LinkedList new.
>   queue addLast: self root.
>   
>   [ queue isNotEmpty ] whileTrue: [ 
>   | node |
>   node := queue removeFirst.
>   node key == info  
>   ifTrue: [ ^node ].
>   node leftChild isNotNil 
>   ifTrue: [ queue addLast: node leftChild ].
>   node rightChild isNotNil
>   ifTrue: [ queue addLast: node rightChild ].
>].
>   ^nil.
> 
> => 
> 
> bfs: info
>   "Breadth fisrt search for an information. If there is no key containing 
> given infomration returns nil. If the info is found returns a node containing 
> it"
>   | queue |
>   self isEmpty ifTrue: [ ^ self ].
>   queue := LinkedList new.
>   queue addLast: self root.
>   
>   [ queue isNotEmpty ] whileTrue: [ 
>   | node |
>   node := queue removeFirst.
>   node key == info  
>   ifTrue: [ ^node ].
>   node leftChild isNotNil 
>   ifTrue: [ queue addLast: node leftChild ].
>   node rightChild isNotNil
>   ifTrue: [ queue addLast: node rightChild ].
>]



Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-07 Thread ducasse
More feedback :)

Do not return nil 
Check the last lecture of the mooc

Did you read Pharo with Style?

I do not know if LinkedList is the one of the system used by Process but in 
that case use the alternate implementation. 


bfs: info
"Breadth fisrt search for an information. If there is no key containing 
given infomration returns nil. If the info is found returns a node containing 
it"
| queue |
self isEmpty 
ifTrue: [ ^nil ].
queue := LinkedList new.
queue addLast: self root.

[ queue isNotEmpty ] whileTrue: [ 
| node |
node := queue removeFirst.
node key == info  
ifTrue: [ ^node ].
node leftChild isNotNil 
ifTrue: [ queue addLast: node leftChild ].
node rightChild isNotNil
ifTrue: [ queue addLast: node rightChild ].
 ].
^nil.

=> 

bfs: info
"Breadth fisrt search for an information. If there is no key containing 
given infomration returns nil. If the info is found returns a node containing 
it"
| queue |
self isEmpty ifTrue: [ ^ self ].
queue := LinkedList new.
queue addLast: self root.

[ queue isNotEmpty ] whileTrue: [ 
| node |
node := queue removeFirst.
node key == info  
ifTrue: [ ^node ].
node leftChild isNotNil 
ifTrue: [ queue addLast: node leftChild ].
node rightChild isNotNil
ifTrue: [ queue addLast: node rightChild ].
 ]


Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-07 Thread ducasse


> On 8 Jul 2019, at 05:51, ducasse  wrote:
> 
> 
> 
>> On 7 Jul 2019, at 23:12, Smiljana Knezev > > wrote:
>> 
>> Dear all,
>> 
>> I've written about implementing binary search trees: 
>> https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html
>>  
>> 
>> 
>> Feedback and opinions is always welcome :)
> 
> pay attention to the formatting of your code. 
> It makes is difficult to read (I do not know if this is wordpress effect or 
> not). 
> 
> Once you have your tree implementation and tests I would really like to see 
> if you can get another implementation and measure
>  the impact of introducing a NullNode to remove all the isNil checks. 


Also no need of extra parentheses between binary and keywords
depth:aNode
"Returns depth of a tree starting from the given node"
| leftDepth rightDepth |
leftDepth := -1.
aNode leftChild isNotNil 
ifTrue: [ leftDepth := self depth: aNode leftChild ].
rightDepth := -1.
aNode rightChild isNotNil 
ifTrue: [ rightDepth := self depth: aNode rightChild ].

( leftDepth > rightDepth )
ifTrue: [ ^ (1 + leftDepth) ]
ifFalse: [^ (1 + rightDepth ) ].

=>


 leftDepth > rightDepth 
ifTrue: [ ^ 1 + leftDepth ]
ifFalse: [^ 1 + rightDepth ].

And even better move out of block returns
when possible

^  leftDepth > rightDepth 
ifTrue: [ 1 + leftDepth ]
ifFalse: [ 1 + rightDepth ].


With a nullNode the code should look like

depth:aNode
"Returns depth of a tree starting from the given node”

| leftDepth rightDepth |
leftDepth := self depth: aNode leftChild.
rightDepth := self depth: aNode rightChild.
^ leftDepth > rightDepth
ifTrue: [ 1 + leftDepth ]
ifFalse: [1 + rightDepth ]


> 
> 
>> 
>> Best regards,
>> Smiljana Knezev
> 



Re: [Pharo-users] [GSoC blog post] Binary Search Trees

2019-07-07 Thread ducasse


> On 7 Jul 2019, at 23:12, Smiljana Knezev  wrote:
> 
> Dear all,
> 
> I've written about implementing binary search trees: 
> https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html
>  
> 
> 
> Feedback and opinions is always welcome :)

pay attention to the formatting of your code. 
It makes is difficult to read (I do not know if this is wordpress effect or 
not). 

Once you have your tree implementation and tests I would really like to see if 
you can get another implementation and measure
 the impact of introducing a NullNode to remove all the isNil checks. 


> 
> Best regards,
> Smiljana Knezev