Re: [Pharo-project] About linkedlist

2012-02-06 Thread Frank Shearar
On 5 February 2012 10:30, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 Thanks for the explanation Frank
 btw, did you add it to the class comment? Because it would be gorgeous.
 I learned something today so I'm happy.
 Now what is the typical use case for such a persistent structure?

Indeed I made sure to write a decent class comment :)

The use case is twofold:
* any time you want to stop caring about whether you need to copy a
collection before passing it around (avoiding the 'foo copy addAll: bar;
yourself' idiom - sketched below), and
* any time you want to be able to undo changes to some structure.
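
(A minimal sketch of the first point, using the PersistentCollection
protocol shown further down this thread; #share: is a made-up consumer,
purely for illustration:)

"Ephemeral collection: copy defensively before handing it out."
someObject share: (foo copy addAll: bar; yourself).

"Persistent collection: every 'modification' answers a new version,
 so the original can be passed around freely."
p := PersistentCollection initially: #(1 2 3).
someObject share: p.
p1 := p at: 1 put: 99.  "p still answers #(1 2 3)"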

In particular, I wanted to add an 'or' unifier to my unification
library. (Unification is like bidirectional pattern matching: we build
up an equivalence relation over the nodes in two structures. After
that, if any partitions in the relation contain variables, we can
construct a mapping from variables to concrete values, giving a most
general unifier.) This would let me ask 'can X or Y unify against this
thing?':

(Node left: (Leaf value: #x asVariable)) or: (Node right: (Leaf value:
#x asVariable)) =? someRandomTree

will
* only do something useful if someRandomTree has exactly one child Leaf, and
* tell us what the value of that Leaf is, by giving us a mapping from
the variable called #x to that value.

What I quickly found was that the unification algorithm - which used a
normal, non-persistent (ephemeral) union-find - kept its state after
attempting to unify the first disjunction clause. That meant that
unifying the second clause would sometimes fail. What I actually wanted
was to be able to say:

* given some partial term relation U,
* try to unify the first clause to get a new term relation U1,
* but if unification fails, start back from U and try to calculate a new
term relation U2 on the second clause's nodes.

Using a _persistent_ union-find thus lets me roll back (undo) the
term relation construction and revert to an older version.
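
A sketch of that pattern (the selector and exception names here are made
up for illustration; they are not the actual Nutcracker API):

"Try the first clause against the persistent relation u; on failure,
 retry the second clause against the very same u, which the failed
 attempt cannot have mutated."
u1 := [self unify: firstClause with: term in: u]
        on: UnificationFailure
        do: [:e | self unify: secondClause with: term in: u].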

frank

 Stef

 On Feb 4, 2012, at 3:33 PM, Frank Shearar wrote:

 On 3 February 2012 20:40, Stéphane Ducasse stephane.duca...@inria.fr wrote:


 I wasn't sure what double-ended versus single-ended meant, and found some
 answer at the obvious place [1]
 which mentions Ada.Containers.Vectors is a dynamic array implementation
 which seems consistent with C++ [2].
 While looking around I happened to bump into [3] which was too much for me
 but may be of interest to anyone playing with data structres.  Skimming 
 this
 shows it discussing efficiency of data structures in functional languages
 together with lazy variants of imperative language data structures.  Would
 such apply to Smalltalk or is the object-oriented style of Smalltalk
 considered imperative rather than functional?

 The standard tool set - OrderedCollection, Array, etc., are not
 functional at all - (myArray at: 1 put: 2) mutates myArray, - so I'd
 put them firmly on the imperative side of the fence.

 There's nothing stopping one from writing functional data structures -
 Levente Uzonyi's written some persistent data structures, as have I.
 (Persistent means you get new versions of a data structure; if you
 hang onto those old versions you can roll back your changes.)

 Can you explain a bit more Persistent?

 Sure! A persistent data structure is an _apparently_ immutable data
 structure: editing the structure returns you a _new_ structure. If you
 can modify only the latest version of the structure, it's
 _partially_persistent_; if you can edit any version it is
 _fully_persistent_. Of course edit here means get a new collection
 representing the edit

 For example, if you load the PersistentUnionFind package (no external
 dependencies) and print out this:

 | p p1 p2 p3 |
 p := PersistentCollection initially: #(1 2 3).
 p1 := p at: 1 put: 4.
 p2 := p1 at: 2 put: 5.
 p3 := p at: 3 put: 0.
 {p. p1. p2. p3.} collect: #asArray

 you get, with inline comments added:

   #(p: #(1 2 3) p1: #(4 2 3) p2: #(4 5 3) p3: #(1 2 0))

 So you can see the different versions of the original array: p1 is p
 plus one replacement; p2 is p plus two replacements (or equivalently
 p1 plus one replacement).

 You still get efficient access to the latest version, and the old
 versions describe themselves as a stack of undo operations - I'm like
 the latest version, but with these changes.

 So if you just print out p1, you'll see:

 p1 printString. 'Ref(Diff(1, 4, Ref(Diff(3, 3, Ref(Arr(#(1 2 0)))))))'

 The series of Diffs tells you how to get from the latest version - p3 -
 to the version of p1: take p3 ( #(1 2 0) ), at index 3 put a 3, and
 at index 1 put a 4.

 A Ref is a delegate: it allows your reference to something to stay the
 same even though the thing referenced changes.

 Internally, as you access the latest version of the Collection the
 structure rewrites itself: if it's pointing to a Diff it reroots
 itself. (This does have the issue that if you hang onto two versions
 of the collection and access each version alternately, the
 PersistentCollection will waste a lot of time constantly rerooting
 itself.)

Re: [Pharo-project] About linkedlist

2012-02-06 Thread Nick Ager
Hi Stef,

Thanks for the explanation Frank
 btw, did you add it to the class comment? Because it would be gorgeous.
 I learned something today so I'm happy.
 Now what is the typical use case for such a persistent structure?

 Stef


Immutable data structures are used extensively in functional languages, in
which data structures are *im*mutable by default. One benefit of the
approach is that when multi-threading you don't need to synchronise changes
to such structures or to objects containing them. For example, all of
Clojure's data structures are immutable by default [1]. There's been a lot
of interesting work on providing performance guarantees so that creating a
modified copy doesn't cost time linear in the size of the structure [2],
while still having sensible access and iteration performance.

[1] http://clojure.org/data_structures#Data%20Structures-Collections
[2]
http://clojure.org/functional_programming#Functional%20Programming--Immutable%20Data%20Structures
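
To illustrate the sharing idea in Smalltalk terms (PersistentLink is a
hypothetical class, not something in the image):

tail := PersistentLink value: 2 next: (PersistentLink value: 3 next: nil).
a := PersistentLink value: 1 next: tail.
b := PersistentLink value: 0 next: tail.
"a and b are both three-element lists built in constant time; they share
 tail instead of copying it, which is where the performance guarantees
 come from."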

Apologies if I'm stating the obvious

Nick


Re: [Pharo-project] About linkedlist

2012-02-05 Thread Stéphane Ducasse
Thanks for the explanation Frank
btw, did you add it to the class comment? Because it would be gorgeous.
I learned something today so I'm happy.
Now what is the typical use case for such a persistent structure?

Stef

On Feb 4, 2012, at 3:33 PM, Frank Shearar wrote:

 On 3 February 2012 20:40, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 
 
 I wasn't sure what double-ended versus single-ended meant, and found some
 answer at the obvious place [1]
 which mentions Ada.Containers.Vectors is a dynamic array implementation
 which seems consistent with C++ [2].
 While looking around I happened to bump into [3] which was too much for me
 but may be of interest to anyone playing with data structres.  Skimming 
 this
 shows it discussing efficiency of data structures in functional languages
 together with lazy variants of imperative language data structures.  Would
 such apply to Smalltalk or is the object-oriented style of Smalltalk
 considered imperative rather than functional?
 
 The standard tool set - OrderedCollection, Array, etc., are not
 functional at all - (myArray at: 1 put: 2) mutates myArray, - so I'd
 put them firmly on the imperative side of the fence.
 
 There's nothing stopping one from writing functional data structures -
 Levente Uzonyi's written some persistent data structures, as have I.
 (Persistent means you get new versions of a data structure; if you
 hang onto those old versions you can roll back your changes.)
 
 Can you explain a bit more Persistent?
 
 Sure! A persistent data structure is an _apparently_ immutable data
 structure: editing the structure returns you a _new_ structure. If you
 can modify only the latest version of the structure, it's
 _partially_persistent_; if you can edit any version it is
 _fully_persistent_. Of course edit here means get a new collection
 representing the edit
 
 For example, if you load the PersistentUnionFind package (no external
 dependencies) and print out this:
 
 | p p1 p2 p3 |
 p := PersistentCollection initially: #(1 2 3).
 p1 := p at: 1 put: 4.
 p2 := p1 at: 2 put: 5.
 p3 := p at: 3 put: 0.
 {p. p1. p2. p3.} collect: #asArray
 
 you get, with inline comments added:
 
   #(p: #(1 2 3) p1: #(4 2 3) p2: #(4 5 3) p3: #(1 2 0))
 
 So you can see the different versions of the original array: p1 is p
 plus one replacement; p2 is p plus two replacements (or equivalently
 p1 plus one replacement).
 
 You still get efficient access to the latest version, and the old
 versions describe themselves as a stack of undo operations - I'm like
 the latest version, but with these changes.
 
 So if you just print out p1, you'll see:
 
 p1 printString. 'Ref(Diff(1, 4, Ref(Diff(3, 3, Ref(Arr(#(1 2 0)))))))'

 The series of Diffs tells you how to get from the latest version - p3 -
 to the version of p1: take p3 ( #(1 2 0) ), at index 3 put a 3, and
 at index 1 put a 4.
 
 A Ref is a delegate: it allows your reference to something to stay the
 same even though the thing referenced changes.
 
 Internally, as you access the latest version of the Collection the
 structure rewrites itself: if it's pointing to a Diff it reroots
 itself. (This does have the issue that if you hang onto two versions
 of the collection and access each version alternately, the
 PersistentCollection will waste a lot of time constantly rerooting
 itself.)
 
 You can see then that keeping a reference to an old version allows you
 to undo arbitrary changes to the collection.
 
 PersistentUnionFind _should_ load cleanly into Pharo: if it doesn't,
 let me know and I'll fix it so it does.
 
 frank
 
 Tx
 
 
 http://www.squeaksource.com/Nutcracker/ is my own play area for these
 sorts of structures - PersistentUnionFind looks like a functional data
 structure while internally massively using side effects.
 http://www.lshift.net/blog/2011/12/31/translating-a-persistent-union-find-from-ml-to-smalltalk
 has references to the original papers I used for my translation.
 
 Okasaki's book is really good reading, if a bit advanced. It also has
 a bunch of stuff beyond just showing functional data structures. (Note
 that the red-black tree implementation is _incomplete_ - he leaves
 deleting nodes as an exercise for the reader. (See
 http://matt.might.net/articles/red-black-delete/))
 
 frank
 
 [1] http://en.wikipedia.org/wiki/Double-ended_queue
 [2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29
 [3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
 
 
 
 
 




Re: [Pharo-project] About linkedlist

2012-02-04 Thread Henrik Johansen

On Feb 3, 2012, at 10:54 PM, Igor Stasenko wrote:

 On 3 February 2012 21:42, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 lukas may be you should send that to the vm mailing-list.
 If you need we can ask igor :) (even if he is busy right now).
 
 implementing it won't be a big deal.
 
 I'm just trying to understand what destructive vs
 non-destructive behavior means in
 replaceFrom:to:with:startingAt:
 
receiver replaceFrom: 3 to: 5 with: receiver startingAt: 2.
Destructive:
slots 3-5 all contain the original value from slot 2.
Non-destructive:
slots 3-5 contain the original values from slots 2-4.
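
To make that concrete, a worked example (the 'destructive' result assumes
the forward, smearing copy described above):

receiver := #(1 2 3 4 5) copy.
receiver replaceFrom: 3 to: 5 with: receiver startingAt: 2.
"destructive (overlap smears):   #(1 2 2 2 2)"
"non-destructive (memmove-like): #(1 2 2 3 4)"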

 is it right in my understanding that you want to always meet a
 following criteria:
 
 receiver replaceFrom: a to: b with: replacement startingAt: c.
 
 a to: b do: [:i |
   self assert: (receiver at: i) == (replacement at: c + i - a) ]
 
 ? (irrespective, of course, of whether receiver == replacement or not)
Yes.
Basically the same thing you ensure in C by using memmove instead of memcpy.

Cheers.
Henry







Re: [Pharo-project] About linkedlist

2012-02-04 Thread Frank Shearar
On 3 February 2012 20:40, Stéphane Ducasse stephane.duca...@inria.fr wrote:


 I wasn't sure what double-ended versus single-ended meant, and found some
 answer at the obvious place [1]
 which mentions Ada.Containers.Vectors is a dynamic array implementation
 which seems consistent with C++ [2].
 While looking around I happened to bump into [3] which was too much for me
 but may be of interest to anyone playing with data structres.  Skimming this
 shows it discussing efficiency of data structures in functional languages
 together with lazy variants of imperative language data structures.  Would
 such apply to Smalltalk or is the object-oriented style of Smalltalk
 considered imperative rather than functional?

 The standard tool set - OrderedCollection, Array, etc., are not
 functional at all - (myArray at: 1 put: 2) mutates myArray, - so I'd
 put them firmly on the imperative side of the fence.

 There's nothing stopping one from writing functional data structures -
 Levente Uzonyi's written some persistent data structures, as have I.
 (Persistent means you get new versions of a data structure; if you
 hang onto those old versions you can roll back your changes.)

 Can you explain a bit more Persistent?

Sure! A persistent data structure is an _apparently_ immutable data
structure: editing the structure returns you a _new_ structure. If you
can modify only the latest version of the structure, it's _partially
persistent_; if you can edit any version, it is _fully persistent_. Of
course 'edit' here means 'get a new collection representing the edit'.

For example, if you load the PersistentUnionFind package (no external
dependencies) and print out this:

| p p1 p2 p3 |
p := PersistentCollection initially: #(1 2 3).
p1 := p at: 1 put: 4.
p2 := p1 at: 2 put: 5.
p3 := p at: 3 put: 0.
{p. p1. p2. p3.} collect: #asArray

you get, with inline comments added:

   #(p: #(1 2 3) p1: #(4 2 3) p2: #(4 5 3) p3: #(1 2 0))

So you can see the different versions of the original array: p1 is p
plus one replacement; p2 is p plus two replacements (or equivalently
p1 plus one replacement).

You still get efficient access to the latest version, and the old
versions describe themselves as a stack of undo operations - 'I'm like
the latest version, but with these changes'.

So if you just print out p1, you'll see:

p1 printString. 'Ref(Diff(1, 4, Ref(Diff(3, 3, Ref(Arr(#(1 2 0)))))))'

The series of Diffs tells you how to get from the latest version - p3 -
to the version of p1: take p3 ( #(1 2 0) ), at index 3 put a 3, and
at index 1 put a 4.

A Ref is a delegate: it allows your reference to something to stay the
same even though the thing referenced changes.

Internally, as you access the latest version of the Collection the
structure rewrites itself: if it's pointing to a Diff it reroots
itself. (This does have the issue that if you hang onto two versions
of the collection and access each version alternately, the
PersistentCollection will waste a lot of time constantly rerooting
itself.)
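
For the curious, here is roughly what rerooting amounts to (Diff and Arr
match the printString above, but every selector below is invented for the
sketch - it is not the actual PersistentUnionFind code):

reroot
    "Make me hold the flat array, and turn my former parent into a Diff
     that points back at me, so the old version stays reachable."
    | diff parent array oldValue |
    diff := self contents.
    diff isDiff ifFalse: [^ self].
    parent := diff target.
    parent reroot.
    array := parent contents array.
    oldValue := array at: diff index.
    array at: diff index put: diff value.
    parent contents: (Diff index: diff index value: oldValue target: self).
    self contents: (Arr on: array)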

You can see then that keeping a reference to an old version allows you
to undo arbitrary changes to the collection.

PersistentUnionFind _should_ load cleanly into Pharo: if it doesn't,
let me know and I'll fix it so it does.

frank

 Tx


 http://www.squeaksource.com/Nutcracker/ is my own play area for these
 sorts of structures - PersistentUnionFind looks like a functional data
 structure while internally massively using side effects.
 http://www.lshift.net/blog/2011/12/31/translating-a-persistent-union-find-from-ml-to-smalltalk
 has references to the original papers I used for my translation.

 Okasaki's book is really good reading, if a bit advanced. It also has
 a bunch of stuff beyond just showing functional data structures. (Note
 that the red-black tree implementation is _incomplete_ - he leaves
 deleting nodes as an exercise for the reader. (See
 http://matt.might.net/articles/red-black-delete/))

 frank

 [1] http://en.wikipedia.org/wiki/Double-ended_queue
 [2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29
 [3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf







Re: [Pharo-project] About linkedlist

2012-02-03 Thread Lukas Renggli
See below for an alternative collection implementation:

   http://jenkins.lukas-renggli.ch/job/Container/

It also contains a nice doubly-linked list, as well as many other
standard containers that are missing from standard Smalltalk.

Lukas

On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 Hi

 I was thinking that linkedList is really not to be used by something else 
 than the scheduler (remember the
 problem we got with change == into something else).

 So I was wondering if it would not be simpler/safer to design a 
 generalPurpose fast/robust LinkedList too?

 Stef



-- 
Lukas Renggli
www.lukas-renggli.ch



Re: [Pharo-project] About linkedlist

2012-02-03 Thread Sven Van Caekenberghe

On 03 Feb 2012, at 09:26, Lukas Renggli wrote:

 See below for an alternative collection implementation:
 
   http://jenkins.lukas-renggli.ch/job/Container/
 
 Also contains a nice double-linked-list, as well as many other
 standard containers that are missing in standard Smalltalk.

Lukas,

Very interesting !
I already saw that you were working on this ;-)

Do you have some kind of high-level description of Container explaining the
rationale, ideas and approach?
It is one thing to browse the code, but the why/how is important as a guide.

It seems that although you introduce iterators, they are less annoying than
expected, which is great.
Also, a certain level of backwards compatibility seems to be present, looking
at message names.

Sven


Re: [Pharo-project] About linkedlist

2012-02-03 Thread Stéphane Ducasse
Lukas, this is cool. Do you have a list of the classes and their comments?

Stef

On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote:

 See below for an alternative collection implementation:
 
   http://jenkins.lukas-renggli.ch/job/Container/
 
 Also contains a nice double-linked-list, as well as many other
 standard containers that are missing in standard Smalltalk.
 
 Lukas
 
 On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 Hi
 
 I was thinking that linkedList is really not to be used by something else 
 than the scheduler (remember the
 problem we got with change == into something else).
 
 So I was wondering if it would not be simpler/safer to design a 
 generalPurpose fast/robust LinkedList too?
 
 Stef
 
 
 
 -- 
 Lukas Renggli
 www.lukas-renggli.ch
 




Re: [Pharo-project] About linkedlist

2012-02-03 Thread Stéphane Ducasse
What is the diff between ArrayList and VectorList?
I have plenty of other questions. :)

Stef


On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote:

 See below for an alternative collection implementation:
 
   http://jenkins.lukas-renggli.ch/job/Container/
 
 Also contains a nice double-linked-list, as well as many other
 standard containers that are missing in standard Smalltalk.
 
 Lukas
 
 On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 Hi
 
 I was thinking that linkedList is really not to be used by something else 
 than the scheduler (remember the
 problem we got with change == into something else).
 
 So I was wondering if it would not be simpler/safer to design a 
 generalPurpose fast/robust LinkedList too?
 
 Stef
 
 
 
 -- 
 Lukas Renggli
 www.lukas-renggli.ch
 




Re: [Pharo-project] About linkedlist

2012-02-03 Thread Lukas Renggli
This is all still very much in an experimental phase.

 what is the diff between arrayList and vectorList ?

ArrayList is double-ended (very much like OrderedCollection);
VectorList is single-ended. I was just experimenting to see if it
makes a difference.
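
Double-ended here is meant in the deque sense: cheap insertion and removal
at both ends, much as OrderedCollection already offers:

list := OrderedCollection new.
list addLast: 2; addFirst: 1; addLast: 3.  "grows cheaply at both ends"
list removeFirst.                          "and shrinks at both ends"

A single-ended store only grows and shrinks cheaply at one end.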

 Do you have some kind of high level description of Container explaining the 
 rationale, ideas and approach.

The core idea is to split iteration from containment; conflating the two
was, I think, a horrible design mistake in the Smalltalk-80 collections.

 Do you have some kind of high level description of Container explaining the 
 rationale, ideas and approach.
 It is one thing to browse the code, but the why/how is important as a guide.

Another idea is to split the backing store from the container and
allow the composition of collection properties:

 - uniqueness: yes, no
 - order: natural, custom, indexed
 - access: indexed, keyed
 - structure: hashed, treed, linked
 - weakness: yes, no (not even touched yet)
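
For comparison, in Smalltalk-80 each combination of these properties is a
separate class, and missing combinations have to be hand-rolled:

Set new.                "unique: yes, order: none,    structure: hashed"
OrderedCollection new.  "unique: no,  order: indexed, structure: array"
SortedCollection new.   "unique: no,  order: custom,  structure: array"
Dictionary new.         "access: keyed,               structure: hashed"
"e.g. an insertion-ordered Set has no ready-made Smalltalk-80 class;
 under the composition idea it is just another combination of properties."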

 It seems that although you introduce iterators, they are less annoying then 
 expected, which is great.

The goal here is to be as lazy as possible and avoid unnecessary loops
over the collections.

Also, iterators avoid the explosion of selector combinations (#do:,
#withIndexDo:, #withIndexCollect:, etc.) by reifying the iteration as objects.

 Also, a certain level of backwards compatibility seems to be present, looking 
 at message names.

I like the existing names, but keeping them is not a goal in itself. Also,
I am not sure whether methods like #add: and #at:put: should really return
the argument.

Essentially the goal is to provide an alternative to the Smalltalk-80
collections by adopting some ideas from modern collection libraries in
Java, C++, Scala, Haskell, etc.

Lukas

On 3 February 2012 10:07, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 what is the diff between arrayList and vectorList ?
 I have plenty of other questions. :)

 Stef


 On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote:

 See below for an alternative collection implementation:

   http://jenkins.lukas-renggli.ch/job/Container/

 Also contains a nice double-linked-list, as well as many other
 standard containers that are missing in standard Smalltalk.

 Lukas

 On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 Hi

 I was thinking that linkedList is really not to be used by something else 
 than the scheduler (remember the
 problem we got with change == into something else).

 So I was wondering if it would not be simpler/safer to design a 
 generalPurpose fast/robust LinkedList too?

 Stef



 --
 Lukas Renggli
 www.lukas-renggli.ch






-- 
Lukas Renggli
www.lukas-renggli.ch



Re: [Pharo-project] About linkedlist

2012-02-03 Thread Nicolas Cellier
2012/2/3 Lukas Renggli reng...@gmail.com:
 This is all very much at an experimental phase.

 what is the diff between arrayList and vectorList ?

 ArrayList is double ended (very much like OrderedCollection),
 VectorList is single ended. I was just experimenting to see if it
 makes a difference.

 Do you have some kind of high level description of Container explaining the 
 rationale, ideas and approach.

 The core idea is to split iteration from containment; I think a
 horrible design mistake in the Smalltalk-80 collections.

 Do you have some kind of high level description of Container explaining the 
 rationale, ideas and approach.
 It is one thing to browse the code, but the why/how is important as a guide.

 Another idea is to split the backing store from the container and
 allow the composition of collection properties:

  - uniqueness: yes, no
  - order: natural, custom, indexed
  - access: indexed, keyed
  - structure: hashed, treed, linked
  - weakness: yes, no (not even touched yet)

 It seems that although you introduce iterators, they are less annoying then 
 expected, which is great.

 The goal here is to be as lazy as possible and avoid unnecessary loops
 over the collections.

 Also iterators avoid the explosion of selector combinations like (#do:
 and #withIndexDo: and #withIndexCollect:, etc) by utilizing objects.

 Also, a certain level of backwards compatibility seems to be present, 
 looking at message names.

 I like the existing names, but then this is not a goal. Also I am not
 sure if methods like #add: and #at:put: should really return the
 argument.


These two are convenient for enabling this kind of DSL instruction:

A[i] := B[j] := c

But of course, a dup bytecode would also do the trick...
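
In Smalltalk terms, the chaining works because #at:put: answers the value
rather than the receiver:

a at: i put: (b at: j put: c).
"the inner store answers c, which feeds the outer store directly;
 if #at:put: answered the receiver, you would need a temporary for c"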

Nicolas

 Essentially the goal is to provide an alternative to the Smalltalk-80
 collections by adopting some ideas from modern collection libraries in
 Java, C++, Scala, Haskell, etc.

 Lukas

 On 3 February 2012 10:07, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 what is the diff between arrayList and vectorList ?
 I have plenty of other questions. :)

 Stef


 On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote:

 See below for an alternative collection implementation:

   http://jenkins.lukas-renggli.ch/job/Container/

 Also contains a nice double-linked-list, as well as many other
 standard containers that are missing in standard Smalltalk.

 Lukas

 On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr 
 wrote:
 Hi

 I was thinking that linkedList is really not to be used by something else 
 than the scheduler (remember the
 problem we got with change == into something else).

 So I was wondering if it would not be simpler/safer to design a 
 generalPurpose fast/robust LinkedList too?

 Stef



 --
 Lukas Renggli
 www.lukas-renggli.ch






 --
 Lukas Renggli
 www.lukas-renggli.ch




Re: [Pharo-project] About linkedlist

2012-02-03 Thread Stéphane Ducasse

On Feb 3, 2012, at 12:20 PM, Nicolas Cellier wrote:

 2012/2/3 Lukas Renggli reng...@gmail.com:
 This is all very much at an experimental phase.
 
 what is the diff between arrayList and vectorList ?
 
 ArrayList is double ended (very much like OrderedCollection),
 VectorList is single ended. I was just experimenting to see if it
 makes a difference.

OK. I like to see this kind of experiment.
Was the name based on something that people use?

 Do you have some kind of high level description of Container explaining the 
 rationale, ideas and approach.
 
 The core idea is to split iteration from containment; I think a
 horrible design mistake in the Smalltalk-80 collections.

I thought about that when I read the Iterator.
I would not say that it is horrible; I think that for the default case it is
not that bad.
But when you want to navigate a large graph and control the navigation, then
you should have an iterator.

I'm curious to see what you will get. 

 
 Do you have some kind of high level description of Container explaining the 
 rationale, ideas and approach.
 It is one thing to browse the code, but the why/how is important as a guide.
 
 Another idea is to split the backing store from the container and
 allow the composition of collection properties:
 
  - uniqueness: yes, no
  - order: natural, custom, indexed
  - access: indexed, keyed
  - structure: hashed, treed, linked
  - weakness: yes, no (not even touched yet)

Yes, I'd like to see that, because I often want uniqueOrdered (it is not
clear what you do when you add the same element twice).

 
 It seems that although you introduce iterators, they are less annoying then 
 expected, which is great.
 
 The goal here is to be as lazy as possible and avoid unnecessary loops
 over the collections.
 
 Also iterators avoid the explosion of selector combinations like (#do:
 and #withIndexDo: and #withIndexCollect:, etc) by utilizing objects.
 
 Also, a certain level of backwards compatibility seems to be present, 
 looking at message names.
 
 I like the existing names, but then this is not a goal. Also I am not
 sure if methods like #add: and #at:put: should really return the
 argument.

:)
If only I knew :)

Stef

 
 
 These two are convenient for enabling this kind of DSL instructions
 
 A[i] := B[j] := c
 
 But of course, a dup bytecode would also do the trick...
 
 Nicolas
 
 Essentially the goal is to provide an alternative to the Smalltalk-80
 collections by adopting some ideas from modern collection libraries in
 Java, C++, Scala, Haskell, etc.
 
 Lukas
 
 On 3 February 2012 10:07, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 what is the diff between arrayList and vectorList ?
 I have plenty of other questions. :)
 
 Stef
 
 
 On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote:
 
 See below for an alternative collection implementation:
 
   http://jenkins.lukas-renggli.ch/job/Container/
 
 Also contains a nice double-linked-list, as well as many other
 standard containers that are missing in standard Smalltalk.
 
 Lukas
 
 On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr 
 wrote:
 Hi
 
 I was thinking that linkedList is really not to be used by something else 
 than the scheduler (remember the
 problem we got with change == into something else).
 
 So I was wondering if it would not be simpler/safer to design a 
 generalPurpose fast/robust LinkedList too?
 
 Stef
 
 
 
 --
 Lukas Renggli
 www.lukas-renggli.ch
 
 
 
 
 
 
 --
 Lukas Renggli
 www.lukas-renggli.ch
 
 




Re: [Pharo-project] About linkedlist

2012-02-03 Thread Igor Stasenko
On 3 February 2012 13:33, Stéphane Ducasse stephane.duca...@inria.fr wrote:

 On Feb 3, 2012, at 12:20 PM, Nicolas Cellier wrote:

 2012/2/3 Lukas Renggli reng...@gmail.com:
 This is all very much at an experimental phase.

 what is the diff between arrayList and vectorList ?

 ArrayList is double ended (very much like OrderedCollection),
 VectorList is single ended. I was just experimenting to see if it
 makes a difference.

 ok. I like to see such kind of experiment.
 Was the name based on something that people use?

 Do you have some kind of high level description of Container explaining 
 the rationale, ideas and approach.

 The core idea is to split iteration from containment; I think a
 horrible design mistake in the Smalltalk-80 collections.

 I thought about that when I read the Iterator.
 I would not say that this is horrible. I think that for default case this is 
 not that bad.
 Now when you want to navigate a large graph and control the navigation then 
 you should have an iterator.

 I'm curious to see what you will get.


 Do you have some kind of high level description of Container explaining 
 the rationale, ideas and approach.
 It is one thing to browse the code, but the why/how is important as a 
 guide.

 Another idea is to split the backing store from the container and
 allow the composition of collection properties:

  - uniqueness: yes, no
  - order: natural, custom, indexed
  - access: indexed, keyed
  - structure: hashed, treed, linked
  - weakness: yes, no (not even touched yet)

 Yes I as to see that because I often want uniqueOrdered (not clear what you 
 do when you add twice the same)


 It seems that although you introduce iterators, they are less annoying 
 then expected, which is great.

 The goal here is to be as lazy as possible and avoid unnecessary loops
 over the collections.

 Also iterators avoid the explosion of selector combinations like (#do:
 and #withIndexDo: and #withIndexCollect:, etc) by utilizing objects.

 Also, a certain level of backwards compatibility seems to be present, 
 looking at message names.

 I like the existing names, but then this is not a goal. Also I am not
 sure if methods like #add: and #at:put: should really return the
 argument.

 :)
 If I would know :)


it is somewhat convenient, because you can do things like:

newItem := coll at: index ifAbsentPut: [ MyItem new ].

and other such forms...

 Stef



-- 
Best regards,
Igor Stasenko.



Re: [Pharo-project] About linkedlist

2012-02-03 Thread Lukas Renggli
 what is the diff between arrayList and vectorList ?

 ArrayList is double ended (very much like OrderedCollection),
 VectorList is single ended. I was just experimenting to see if it
 makes a difference.

 ok. I like to see such kind of experiment.
 Was the name based on something that people use?

No, it is a randomly chosen and meaningless name. I guess in the end
there will only be one named ArrayList, but I am not sure which
implementation is better.

Just a note to the VM developers: the primitive
Array#replaceFrom:to:with:startingAt: is absolutely central to the
implementation of efficient collections. While it works well if you
copy from one array to another, it is broken if the source and
destination arrays are the same and the areas overlap. Would it be
possible to fix this? Even memmove(void*, void*, size_t) in C can
handle this situation correctly.
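
Until the primitive is fixed, an overlap-safe copy can be done image-side
(a sketch only - the selector name here is made up, not an existing method):

replaceNonDestructiveFrom: start to: stop with: source startingAt: repStart
    "Behave like memmove: when copying a region of myself upwards, copy
     backwards so each source slot is read before it is overwritten."
    (self == source and: [repStart < start])
        ifTrue: [stop to: start by: -1 do: [:i |
                     self at: i put: (source at: i - start + repStart)]]
        ifFalse: [self replaceFrom: start to: stop with: source startingAt: repStart]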

 Do you have some kind of high level description of Container explaining 
 the rationale, ideas and approach.

 The core idea is to split iteration from containment; I think a
 horrible design mistake in the Smalltalk-80 collections.

 I thought about that when I read the Iterator.
 I would not say that this is horrible. I think that for default case this is 
 not that bad.
 Now when you want to navigate a large graph and control the navigation then 
 you should have an iterator.

I think it is pretty horrible. I can think of at least 4 meaningful
ways to iterate over a simple array (for other collections there are
even more ways):

1. forward: #do:, #collect:, #select:, #detect:, #inject:into:, ...
2. forward with index: #keysAndValuesDo:, #collectWithIndex:,
everything else is not possible
3. backward: #reverseDo:, everything else is not possible
4. backward with index: not possible

If you split the iterator into a separate object, suddenly all of this
becomes trivial. A #collect: remains a #collect: no matter whether you are
going forward or backward, with or without an index.
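
A purely hypothetical sketch of the shape such a protocol could take (this
is not Container's actual API, just an illustration of the idea):

(aList iterator reversed withIndex)
    collect: [:each :index | index -> each]

"one iterator object carries the direction and the index bookkeeping, so
 #collect:, #select:, #detect:, ... never need per-variant selectors"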

Lukas

-- 
Lukas Renggli
www.lukas-renggli.ch



Re: [Pharo-project] About linkedlist

2012-02-03 Thread Ben Coman

Lukas Renggli wrote:

what is the diff between arrayList and vectorList ?
  

ArrayList is double ended (very much like OrderedCollection),
VectorList is single ended. I was just experimenting to see if it
makes a difference.


ok. I like to see such kind of experiment.
Was the name based on something that people use?



No, it is a randomly chosen and meaningless name. I guess in the end
there will only be one named ArrayList, but I am not sure which
implementation is better.

  
I wasn't sure what double-ended versus single-ended meant, and found
some answers at the obvious place [1], which mentions that
Ada.Containers.Vectors is a dynamic array implementation, which seems
consistent with C++ [2].

While looking around I happened to bump into [3], which was too much for
me but may be of interest to anyone playing with data structures.
Skimming it shows it discussing the efficiency of data structures in
functional languages, together with lazy variants of imperative-language
data structures. Would such techniques apply to Smalltalk, or is the
object-oriented style of Smalltalk considered imperative rather than
functional?


[1] http://en.wikipedia.org/wiki/Double-ended_queue
[2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29
[3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf



Re: [Pharo-project] About linkedlist

2012-02-03 Thread Frank Shearar
On 3 February 2012 15:44, Ben Coman b...@openinworld.com wrote:
 Lukas Renggli wrote:

 what is the diff between arrayList and vectorList ?


 ArrayList is double ended (very much like OrderedCollection),
 VectorList is single ended. I was just experimenting to see if it
 makes a difference.


 ok. I like to see such kind of experiment.
 Was the name based on something that people use?



 No, it is a randomly chosen and meaningless name. I guess in the end
 there will only be one named ArrayList, but I am not sure which
 implementation is better.



 I wasn't sure what double-ended versus single-ended meant, and found some
 answer at the obvious place [1]
 which mentions Ada.Containers.Vectors is a dynamic array implementation
 which seems consistent with C++ [2].
 While looking around I happened to bump into [3] which was too much for me
 but may be of interest to anyone playing with data structres.  Skimming this
 shows it discussing efficiency of data structures in functional languages
 together with lazy variants of imperative language data structures.  Would
 such apply to Smalltalk or is the object-oriented style of Smalltalk
 considered imperative rather than functional?

The standard tools - OrderedCollection, Array, etc. - are not
functional at all ((myArray at: 1 put: 2) mutates myArray), so I'd
put them firmly on the imperative side of the fence.

There's nothing stopping one from writing functional data structures -
Levente Uzonyi's written some persistent data structures, as have I.
(Persistent means you get new versions of a data structure; if you
hang onto those old versions you can roll back your changes.)
http://www.squeaksource.com/Nutcracker/ is my own play area for these
sorts of structures - PersistentUnionFind looks like a functional data
structure while internally making heavy use of side effects.
http://www.lshift.net/blog/2011/12/31/translating-a-persistent-union-find-from-ml-to-smalltalk
has references to the original papers I used for my translation.

Okasaki's book is really good reading, if a bit advanced. It also has
a bunch of stuff beyond just showing functional data structures. (Note
that the red-black tree implementation is _incomplete_ - he leaves
deleting nodes as an exercise for the reader. (See
http://matt.might.net/articles/red-black-delete/))

frank

 [1] http://en.wikipedia.org/wiki/Double-ended_queue
 [2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29
 [3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf




Re: [Pharo-project] About linkedlist

2012-02-03 Thread Nicolas Cellier
2012/2/3 Lukas Renggli reng...@gmail.com:
 what is the diff between arrayList and vectorList ?

 ArrayList is double ended (very much like OrderedCollection),
 VectorList is single ended. I was just experimenting to see if it
 makes a difference.

 ok. I like to see such kind of experiment.
 Was the name based on something that people use?

 No, it is a randomly chosen and meaningless name. I guess in the end
 there will only be one named ArrayList, but I am not sure which
 implementation is better.

 Just a note to the VM developers: The primitive
 Array#replaceFrom:to:with:startingAt: is absolutely central to the
 implementation of efficient collections. While it works well if you
 copy from one array to the other, it is broken if source and
 destination array are the same and the areas overlap. Would it be
 possible to fix this. Even memmove(void*, void*, size_t) in C can
 handle this situation correctly.


+1, I already asked for this one, this would be useful also for
Xtreams, and would remove the need for some preconditions
http://permalink.gmane.org/gmane.comp.lang.smalltalk.squeak.general/152057.

Nicolas



Re: [Pharo-project] About linkedlist

2012-02-03 Thread Eliot Miranda
On Fri, Feb 3, 2012 at 6:18 AM, Lukas Renggli reng...@gmail.com wrote:

  what is the diff between arrayList and vectorList ?
 
  ArrayList is double ended (very much like OrderedCollection),
  VectorList is single ended. I was just experimenting to see if it
  makes a difference.
 
  ok. I like to see such kind of experiment.
  Was the name based on something that people use?

 No, it is a randomly chosen and meaningless name. I guess in the end
 there will only be one named ArrayList, but I am not sure which
 implementation is better.

 Just a note to the VM developers: The primitive
 Array#replaceFrom:to:with:startingAt: is absolutely central to the
 implementation of efficient collections. While it works well if you
 copy from one array to the other, it is broken if source and
 destination array are the same and the areas overlap. Would it be
 possible to fix this. Even memmove(void*, void*, size_t) in C can
 handle this situation correctly.


The problem is that the primitive is defined to be destructive, and code
could be depending on that destructive update behaviour.  Would it be
feasible to use a new primitive that did the right thing in the case of
overlap, or do you definitely want to change primitive 105?


  Do you have some kind of high level description of Container
 explaining the rationale, ideas and approach.
 
  The core idea is to split iteration from containment; I think a
  horrible design mistake in the Smalltalk-80 collections.
 
  I thought about that when I read the Iterator.
  I would not say that this is horrible. I think that for default case
 this is not that bad.
  Now when you want to navigate a large graph and control the navigation
 then you should have an iterator.

 I think it is pretty horrible. I can think of at least 4 meaningful
 ways to iterate over an simple array (for other collections there are
 even more ways):

1. forward: #do:, #collect:, #select:, #detect:, #inject:into:, ...
2. forward with index: #keysAndValuesDo:, #collectWithIndex:,
 everything else is not possible
3. backward: #reverseDo:, everything else is not possible
4. backward with index: not possible

 If you split the iterator into a separate object, suddenly all of this
 becomes trivial. A #collect: remains a #collect: no matter if you are
 going forward or backward, or with or without index.

 Lukas

 --
 Lukas Renggli
 www.lukas-renggli.ch




-- 
best,
Eliot


Re: [Pharo-project] About linkedlist

2012-02-03 Thread Stéphane Ducasse
 
 
 I wasn't sure what double-ended versus single-ended meant, and found some
 answer at the obvious place [1]
 which mentions Ada.Containers.Vectors is a dynamic array implementation
 which seems consistent with C++ [2].
 While looking around I happened to bump into [3] which was too much for me
 but may be of interest to anyone playing with data structres.  Skimming this
 shows it discussing efficiency of data structures in functional languages
 together with lazy variants of imperative language data structures.  Would
 such apply to Smalltalk or is the object-oriented style of Smalltalk
 considered imperative rather than functional?
 
 The standard tool set - OrderedCollection, Array, etc., are not
 functional at all - (myArray at: 1 put: 2) mutates myArray, - so I'd
 put them firmly on the imperative side of the fence.
 
 There's nothing stopping one from writing functional data structures -
 Levente Uzonyi's written some persistent data structures, as have I.
 (Persistent means you get new versions of a data structure; if you
 hang onto those old versions you can roll back your changes.)

Can you explain a bit more Persistent? 

Tx


 http://www.squeaksource.com/Nutcracker/ is my own play area for these
 sorts of structures - PersistentUnionFind looks like a functional data
 structure while internally massively using side effects.
 http://www.lshift.net/blog/2011/12/31/translating-a-persistent-union-find-from-ml-to-smalltalk
 has references to the original papers I used for my translation.
 
 Okasaki's book is really good reading, if a bit advanced. It also has
 a bunch of stuff beyond just showing functional data structures. (Note
 that the red-black tree implementation is _incomplete_ - he leaves
 deleting nodes as an exercise for the reader. (See
 http://matt.might.net/articles/red-black-delete/))
 
 frank
 
 [1] http://en.wikipedia.org/wiki/Double-ended_queue
 [2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29
 [3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
 
 




Re: [Pharo-project] About linkedlist

2012-02-03 Thread Stéphane Ducasse
lukas may be you should send that to the vm mailing-list.
If you need we can ask igor :) (even if he is busy right now).

Stef


On Feb 3, 2012, at 5:35 PM, Nicolas Cellier wrote:

 
 Just a note to the VM developers: The primitive
 Array#replaceFrom:to:with:startingAt: is absolutely central to the
 implementation of efficient collections. While it works well if you
 copy from one array to the other, it is broken if source and
 destination array are the same and the areas overlap. Would it be
 possible to fix this. Even memmove(void*, void*, size_t) in C can
 handle this situation correctly.
 
 




Re: [Pharo-project] About linkedlist

2012-02-03 Thread Igor Stasenko
On 3 February 2012 21:42, Stéphane Ducasse stephane.duca...@inria.fr wrote:
 lukas may be you should send that to the vm mailing-list.
 If you need we can ask igor :) (even if he is busy right now).

Implementing it won't be a big deal.

I'm just trying to understand what destructive vs
non-destructive behavior means in
replaceFrom:to:with:startingAt:

Am I right in understanding that you always want to meet the
following criterion:

receiver replaceFrom: a to: b with: replacement startingAt: c.

a to: b do: [:i |
  self assert: (receiver at: i) == (replacement at: c + i - a) ]

? (irrespective, of course, of whether receiver == replacement or not)


 Stef


 On Feb 3, 2012, at 5:35 PM, Nicolas Cellier wrote:


 Just a note to the VM developers: The primitive
 Array#replaceFrom:to:with:startingAt: is absolutely central to the
 implementation of efficient collections. While it works well if you
 copy from one array to the other, it is broken if source and
 destination array are the same and the areas overlap. Would it be
 possible to fix this. Even memmove(void*, void*, size_t) in C can
 handle this situation correctly.







-- 
Best regards,
Igor Stasenko.



Re: [Pharo-project] About linkedlist

2012-02-03 Thread Igor Stasenko
Oh wait... the test should be like the following:


rCopy := replacement copy.
receiver replaceFrom: a to: b with: replacement startingAt: c.

a to: b do: [:i |
  self assert: (receiver at: i) == (rCopy at: c + i - a) ]





-- 
Best regards,
Igor Stasenko.



Re: [Pharo-project] About linkedlist

2012-02-03 Thread Igor Stasenko
Btw, I doubt that there's code relying on the destructive behavior
of that prim, because in order to use such destructive behavior for your
own benefit it requires a bit of thought effort :)

If you need such erasing (I can't even find a proper term for it)
replacement, then it is more natural for a developer to write it by hand
instead of using a primitive with elusive (and perhaps unintended)
behavior.

On the other hand, making a new prim is better, because then the image
side can do a workaround if it is not there.

And I think we don't even need to implement the prim twice: what we can
do is fix the implementation of the old one and then put it in an
additional slot of the primitiveTable.


-- 
Best regards,
Igor Stasenko.



[Pharo-project] About linkedlist

2012-02-02 Thread Stéphane Ducasse
Hi 

I was thinking that LinkedList is really not meant to be used by anything
other than the scheduler (remember the problem we got when changing ==
into something else).

So I was wondering: would it not be simpler/safer to also design a
general-purpose, fast/robust LinkedList?

Stef