Re: [Pharo-project] About linkedlist
On 5 February 2012 10:30, Stéphane Ducasse stephane.duca...@inria.fr wrote:

Thanks for the explanation frank btw (did you add it to the class comment? because it would be gorgeous). I learned something today so I'm happy. Now what is the typical use case for such a persistent structure?

Indeed I made sure to write a decent class comment :) The use case is twofold:

* any time you want to stop caring about whether you need to copy a collection or can safely pass it around (so avoiding the "foo copy addAll: bar; yourself" idiom)
* any time you want to be able to undo changes to some structure.

In particular, I wanted to add an "or" unifier to my unification library. (Unification is like bidirectional pattern matching. In particular, we build up an equivalence relation over the nodes in two structures. After that, if any partitions in the relation contain variables, we can construct a mapping from variables to concrete values, to give a most general unifier.) This would let me say "can X or Y unify against this thing?":

(Node left: (Leaf value: #x asVariable)) or: (Node right: (Leaf value: #x asVariable)) =? someRandomTree

will
* only do something useful if someRandomTree has exactly one child Leaf, and
* tell us what the value of that Leaf is, by telling us how to map from the variable called #x to that value.

What I quickly found was that the unification algorithm - which used a normal, non-persistent (ephemeral) union-find - kept its state after attempting to unify the first disjunction clause. That meant that unifying the second clause would sometimes fail. What I actually wanted was to be able to say:

* given some partial term relation U,
* try to unify the first clause to get a new term relation U1,
* but if unification fails, start back with U and try to calculate a new term relation U2 on the second clause's nodes.

Using a _persistent_ union-find thus lets me roll back or undo the term relation construction and revert to an older version.
frank
Re: [Pharo-project] About linkedlist
Hi Stef,

Thanks for the explanation frank btw (did you add it to the class comment? because it would be gorgeous). I learned something today so I'm happy. Now what is the typical use case for such persistent structure? Stef

Immutable data structures are used extensively in functional languages, in which data structures are *im*mutable by default. One benefit of the approach is that when multi-threading you don't need to synchronise changes to such structures or to objects containing them. For example, all of Clojure's data structures are immutable by default [1]. There's been a lot of interesting work on providing performance guarantees so that creating a modified copy does not cost linear time [2], while still keeping sensible access and iteration performance.

[1] http://clojure.org/data_structures#Data%20Structures-Collections
[2] http://clojure.org/functional_programming#Functional%20Programming--Immutable%20Data%20Structures

Apologies if I'm stating the obvious.

Nick
Re: [Pharo-project] About linkedlist
Thanks for the explanation frank btw (did you add it to the class comment? because it would be gorgeous). I learned something today so I'm happy. Now what is the typical use case for such persistent structure?

Stef

On Feb 4, 2012, at 3:33 PM, Frank Shearar wrote: [...]
Re: [Pharo-project] About linkedlist
On Feb 3, 2012, at 10:54 PM, Igor Stasenko wrote:

On 3 February 2012 21:42, Stéphane Ducasse stephane.duca...@inria.fr wrote: lukas may be you should send that to the vm mailing-list. If you need we can ask igor :) (even if he is busy right now).

Implementing it won't be a big deal. I'm just trying to understand what destructive vs non-destructive behavior is in replaceFrom:to:with:startingAt:

receiver replaceFrom: 3 to: 5 with: receiver startingAt: 2.

Destructive: slots 3-5 all contain the original value from slot 2.
Non-destructive: slots 3-5 contain the original values from slots 2-4.

Is it right in my understanding that you want to always meet the following criterion:

receiver replaceFrom: a to: b with: replacement startingAt: c.
a to: b do: [:i | self assert: (receiver at: i) == (replacement at: c + i - a) ]

? (irrespective, of course, of whether receiver == replacement or not)

Yes. Basically the same thing you ensure in C by using memmove instead of memcpy.

Cheers,
Henry
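The memmove-vs-memcpy distinction can be illustrated with a small Python sketch (the function names are mine, and indices are 0-based here, unlike the 1-based Smalltalk example above):

```python
def replace_destructive(seq, a, b, src, c):
    # Naive forward copy (memcpy-style): when source and destination
    # overlap in the same sequence, later reads see slots that were
    # already overwritten.
    for i in range(a, b + 1):
        seq[i] = src[c + i - a]

def replace_nondestructive(seq, a, b, src, c):
    # memmove-style: copy backwards when the destination overlaps the
    # source from the right, so the original values are read first.
    if seq is src and a > c:
        rng = range(b, a - 1, -1)
    else:
        rng = range(a, b + 1)
    for i in rng:
        seq[i] = src[c + i - a]

# The email's example, shifted to 0-based indices:
# receiver replaceFrom: 3 to: 5 with: receiver startingAt: 2
r = [1, 2, 3, 4, 5]
replace_destructive(r, 2, 4, r, 1)
assert r == [1, 2, 2, 2, 2]    # slot 2's value smeared over slots 3-5

r = [1, 2, 3, 4, 5]
replace_nondestructive(r, 2, 4, r, 1)
assert r == [1, 2, 2, 3, 4]    # originals from slots 2-4 (1-based)
```

The backward loop is exactly the trick memmove uses to satisfy Igor's criterion even when receiver == replacement.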
Re: [Pharo-project] About linkedlist
On 3 February 2012 20:40, Stéphane Ducasse stephane.duca...@inria.fr wrote:

I wasn't sure what double-ended versus single-ended meant, and found some answer at the obvious place [1] which mentions Ada.Containers.Vectors is a dynamic array implementation which seems consistent with C++ [2]. While looking around I happened to bump into [3] which was too much for me but may be of interest to anyone playing with data structures. Skimming this shows it discussing efficiency of data structures in functional languages together with lazy variants of imperative language data structures. Would such apply to Smalltalk or is the object-oriented style of Smalltalk considered imperative rather than functional?

The standard tool set - OrderedCollection, Array, etc. - are not functional at all - (myArray at: 1 put: 2) mutates myArray - so I'd put them firmly on the imperative side of the fence. There's nothing stopping one from writing functional data structures - Levente Uzonyi's written some persistent data structures, as have I. (Persistent means you get new versions of a data structure; if you hang onto those old versions you can roll back your changes.)

Can you explain a bit more Persistent?

Sure! A persistent data structure is an _apparently_ immutable data structure: editing the structure returns you a _new_ structure. If you can modify only the latest version of the structure, it's _partially persistent_; if you can edit any version, it's _fully persistent_. Of course "edit" here means "get a new collection representing the edit". For example, if you load the PersistentUnionFind package (no external dependencies) and print out this:

| p p1 p2 p3 |
p := PersistentCollection initially: #(1 2 3).
p1 := p at: 1 put: 4.
p2 := p1 at: 2 put: 5.
p3 := p at: 3 put: 0.
{p. p1. p2. p3.} collect: #asArray

you get, with inline comments added:

#(p: #(1 2 3) p1: #(4 2 3) p2: #(4 5 3) p3: #(1 2 0))

So you can see the different versions of the original array: p1 is p plus one replacement; p2 is p plus two replacements (or equivalently, p1 plus one replacement). You still get efficient access to the latest version, and the old versions describe themselves as a stack of undo operations - "I'm like the latest version, but with these changes". So if you just print out p1, you'll see:

p1 printString. 'Ref(Diff(1, 4, Ref(Diff(3, 3, Ref(Arr(#(1 2 0)))))))'

The series of Diffs tells you how to get from the latest version - p3 - to the version of p1: take p3 ( #(1 2 0) ), at index 3 put a 3, and at index 1 put a 1. A Ref is a delegate: it allows your reference to something to stay the same even though the thing referenced changes. Internally, as you access the latest version of the Collection, the structure rewrites itself: if it's pointing to a Diff, it reroots itself. (This does have the issue that if you hang onto two versions of the collection and access each version alternately, the PersistentCollection will waste a lot of time constantly rerooting itself.) You can see then that keeping a reference to an old version allows you to undo arbitrary changes to the collection. PersistentUnionFind _should_ load cleanly into Pharo: if it doesn't, let me know and I'll fix it so it does.

frank

Tx

http://www.squeaksource.com/Nutcracker/ is my own play area for these sorts of structures - PersistentUnionFind looks like a functional data structure while internally making heavy use of side effects. http://www.lshift.net/blog/2011/12/31/translating-a-persistent-union-find-from-ml-to-smalltalk has references to the original papers I used for my translation. Okasaki's book is really good reading, if a bit advanced. It also has a bunch of stuff beyond just showing functional data structures.
(Note that the red-black tree implementation is _incomplete_ - he leaves deleting nodes as an exercise for the reader. (See http://matt.might.net/articles/red-black-delete/)) frank [1] http://en.wikipedia.org/wiki/Double-ended_queue [2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29 [3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
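The Diff/Ref rerooting scheme Frank describes can be sketched in Python. This is a toy illustration under my own naming, with 0-based indices rather than the 1-based Smalltalk example; a node is a mutable two-slot list playing the role of the Ref:

```python
def _reroot(node):
    # Make `node` hold the real array: recursively reroot the newer
    # version it points at, then invert the diff edge so the newer
    # version becomes a diff against us.
    if node[0] == 'arr':
        return
    i, old, newer = node[1]
    _reroot(newer)                                   # newer holds the array now
    arr = newer[1]
    newer[0], newer[1] = 'diff', (i, arr[i], node)   # invert the edge
    arr[i] = old
    node[0], node[1] = 'arr', arr

class PersistentArray:
    """Toy Baker-style persistent array: one real list is shared by
    all versions; every non-latest version is a Diff (index, old
    value) pointing at a newer version."""

    def __init__(self, items=None, _node=None):
        # a node is ['arr', list] or ['diff', (i, old_value, node)]
        self._node = _node if _node is not None else ['arr', list(items)]

    def at_put(self, i, v):
        # return a new version; the receiver becomes a diff against it
        _reroot(self._node)
        arr = self._node[1]
        new = ['arr', arr]
        self._node[0], self._node[1] = 'diff', (i, arr[i], new)
        arr[i] = v
        return PersistentArray(_node=new)

    def as_list(self):
        _reroot(self._node)
        return list(self._node[1])


# Mirror of the email's example (0-based indices):
p = PersistentArray([1, 2, 3])
p1 = p.at_put(0, 4)
p2 = p1.at_put(1, 5)
p3 = p.at_put(2, 0)
assert [x.as_list() for x in (p, p1, p2, p3)] == \
    [[1, 2, 3], [4, 2, 3], [4, 5, 3], [1, 2, 0]]
```

Reading any version reroots the chain so that version holds the real list, which also reproduces the pathology Frank mentions: alternating reads of two versions reroot back and forth endlessly.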
Re: [Pharo-project] About linkedlist
See below for an alternative collection implementation: http://jenkins.lukas-renggli.ch/job/Container/ It also contains a nice doubly-linked list, as well as many other standard containers that are missing in standard Smalltalk.

Lukas

On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote: Hi. I was thinking that LinkedList is really not to be used by anything other than the scheduler (remember the problem we got with changing == into something else). So I was wondering if it would not be simpler/safer to design a general-purpose, fast/robust LinkedList too? Stef

-- Lukas Renggli www.lukas-renggli.ch
Re: [Pharo-project] About linkedlist
On 03 Feb 2012, at 09:26, Lukas Renggli wrote: See below for an alternative collection implementation: http://jenkins.lukas-renggli.ch/job/Container/ Also contains a nice double-linked-list, as well as many other standard containers that are missing in standard Smalltalk.

Lukas,

Very interesting! I already saw that you were working on this ;-) Do you have some kind of high-level description of Container explaining the rationale, ideas and approach? It is one thing to browse the code, but the why/how is important as a guide. It seems that although you introduce iterators, they are less annoying than expected, which is great. Also, a certain level of backwards compatibility seems to be present, looking at the message names.

Sven
Re: [Pharo-project] About linkedlist
Lukas this is cool. Do you have a list of class/comment? Stef On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote: See below for an alternative collection implementation: http://jenkins.lukas-renggli.ch/job/Container/ Also contains a nice double-linked-list, as well as many other standard containers that are missing in standard Smalltalk. Lukas On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote: Hi I was thinking that linkedList is really not to be used by something else than the scheduler (remember the problem we got with change == into something else). So I was wondering if it would not be simpler/safer to design a generalPurpose fast/robust LinkedList too? Stef -- Lukas Renggli www.lukas-renggli.ch
Re: [Pharo-project] About linkedlist
what is the diff between arrayList and vectorList ? I have plenty of other questions. :) Stef On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote: See below for an alternative collection implementation: http://jenkins.lukas-renggli.ch/job/Container/ Also contains a nice double-linked-list, as well as many other standard containers that are missing in standard Smalltalk. Lukas On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote: Hi I was thinking that linkedList is really not to be used by something else than the scheduler (remember the problem we got with change == into something else). So I was wondering if it would not be simpler/safer to design a generalPurpose fast/robust LinkedList too? Stef -- Lukas Renggli www.lukas-renggli.ch
Re: [Pharo-project] About linkedlist
This is all very much at an experimental phase.

what is the diff between arrayList and vectorList ?

ArrayList is double-ended (very much like OrderedCollection), VectorList is single-ended. I was just experimenting to see if it makes a difference.

Do you have some kind of high level description of Container explaining the rationale, ideas and approach?

The core idea is to split iteration from containment, which I think was a horrible design mistake in the Smalltalk-80 collections.

Do you have some kind of high level description of Container explaining the rationale, ideas and approach. It is one thing to browse the code, but the why/how is important as a guide.

Another idea is to split the backing store from the container and allow the composition of collection properties:

- uniqueness: yes, no
- order: natural, custom, indexed
- access: indexed, keyed
- structure: hashed, treed, linked
- weakness: yes, no (not even touched yet)

It seems that although you introduce iterators, they are less annoying than expected, which is great.

The goal here is to be as lazy as possible and avoid unnecessary loops over the collections. Also, iterators avoid the explosion of selector combinations (#do: and #withIndexDo: and #withIndexCollect:, etc.) by using objects.

Also, a certain level of backwards compatibility seems to be present, looking at message names.

I like the existing names, but backwards compatibility is not a goal. Also, I am not sure if methods like #add: and #at:put: should really return the argument. Essentially the goal is to provide an alternative to the Smalltalk-80 collections by adopting some ideas from modern collection libraries in Java, C++, Scala, Haskell, etc.

Lukas

On 3 February 2012 10:07, Stéphane Ducasse stephane.duca...@inria.fr wrote: what is the diff between arrayList and vectorList ? I have plenty of other questions.
:) Stef On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote: See below for an alternative collection implementation: http://jenkins.lukas-renggli.ch/job/Container/ Also contains a nice double-linked-list, as well as many other standard containers that are missing in standard Smalltalk. Lukas On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote: Hi I was thinking that linkedList is really not to be used by something else than the scheduler (remember the problem we got with change == into something else). So I was wondering if it would not be simpler/safer to design a generalPurpose fast/robust LinkedList too? Stef -- Lukas Renggli www.lukas-renggli.ch -- Lukas Renggli www.lukas-renggli.ch
Re: [Pharo-project] About linkedlist
2012/2/3 Lukas Renggli reng...@gmail.com: This is all very much at an experimental phase. what is the diff between arrayList and vectorList ? ArrayList is double ended (very much like OrderedCollection), VectorList is single ended. I was just experimenting to see if it makes a difference. Do you have some kind of high level description of Container explaining the rationale, ideas and approach. The core idea is to split iteration from containment; I think a horrible design mistake in the Smalltalk-80 collections. Do you have some kind of high level description of Container explaining the rationale, ideas and approach. It is one thing to browse the code, but the why/how is important as a guide. Another idea is to split the backing store from the container and allow the composition of collection properties: - uniqueness: yes, no - order: natural, custom, indexed - access: indexed, keyed - structure: hashed, treed, linked - weakness: yes, no (not even touched yet) It seems that although you introduce iterators, they are less annoying than expected, which is great. The goal here is to be as lazy as possible and avoid unnecessary loops over the collections. Also iterators avoid the explosion of selector combinations like (#do: and #withIndexDo: and #withIndexCollect:, etc) by utilizing objects. Also, a certain level of backwards compatibility seems to be present, looking at message names. I like the existing names, but then this is not a goal. Also I am not sure if methods like #add: and #at:put: should really return the argument.

These two are convenient for enabling this kind of DSL instruction: A[i] := B[j] := c. But of course, a dup bytecode would also do the trick...

Nicolas

Essentially the goal is to provide an alternative to the Smalltalk-80 collections by adopting some ideas from modern collection libraries in Java, C++, Scala, Haskell, etc.
Lukas On 3 February 2012 10:07, Stéphane Ducasse stephane.duca...@inria.fr wrote: what is the diff between arrayList and vectorList ? I have plenty of other questions. :) Stef On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote: See below for an alternative collection implementation: http://jenkins.lukas-renggli.ch/job/Container/ Also contains a nice double-linked-list, as well as many other standard containers that are missing in standard Smalltalk. Lukas On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote: Hi I was thinking that linkedList is really not to be used by something else than the scheduler (remember the problem we got with change == into something else). So I was wondering if it would not be simpler/safer to design a generalPurpose fast/robust LinkedList too? Stef -- Lukas Renggli www.lukas-renggli.ch -- Lukas Renggli www.lukas-renggli.ch
Re: [Pharo-project] About linkedlist
On Feb 3, 2012, at 12:20 PM, Nicolas Cellier wrote: 2012/2/3 Lukas Renggli reng...@gmail.com: This is all very much at an experimental phase.

what is the diff between arrayList and vectorList ? ArrayList is double ended (very much like OrderedCollection), VectorList is single ended. I was just experimenting to see if it makes a difference.

OK, I like to see this kind of experiment. Was the name based on something that people use?

Do you have some kind of high level description of Container explaining the rationale, ideas and approach. The core idea is to split iteration from containment; I think a horrible design mistake in the Smalltalk-80 collections.

I thought about that when I read the Iterator. I would not say that this is horrible. I think that for the default case this is not that bad. Now when you want to navigate a large graph and control the navigation, then you should have an iterator. I'm curious to see what you will get.

Do you have some kind of high level description of Container explaining the rationale, ideas and approach. It is one thing to browse the code, but the why/how is important as a guide. Another idea is to split the backing store from the container and allow the composition of collection properties: - uniqueness: yes, no - order: natural, custom, indexed - access: indexed, keyed - structure: hashed, treed, linked - weakness: yes, no (not even touched yet)

Yes, I want to see that, because I often want uniqueOrdered (it's not clear what you do when you add the same element twice).

It seems that although you introduce iterators, they are less annoying than expected, which is great. The goal here is to be as lazy as possible and avoid unnecessary loops over the collections. Also iterators avoid the explosion of selector combinations like (#do: and #withIndexDo: and #withIndexCollect:, etc) by utilizing objects. Also, a certain level of backwards compatibility seems to be present, looking at message names. I like the existing names, but then this is not a goal.
Also I am not sure if methods like #add: and #at:put: should really return the argument. :) If I would know :) Stef These two are convenient for enabling this kind of DSL instructions A[i] := B[j] := c But of course, a dup bytecode would also do the trick... Nicolas Essentially the goal is to provide an alternative to the Smalltalk-80 collections by adopting some ideas from modern collection libraries in Java, C++, Scala, Haskell, etc. Lukas On 3 February 2012 10:07, Stéphane Ducasse stephane.duca...@inria.fr wrote: what is the diff between arrayList and vectorList ? I have plenty of other questions. :) Stef On Feb 3, 2012, at 9:26 AM, Lukas Renggli wrote: See below for an alternative collection implementation: http://jenkins.lukas-renggli.ch/job/Container/ Also contains a nice double-linked-list, as well as many other standard containers that are missing in standard Smalltalk. Lukas On 3 February 2012 07:52, Stéphane Ducasse stephane.duca...@inria.fr wrote: Hi I was thinking that linkedList is really not to be used by something else than the scheduler (remember the problem we got with change == into something else). So I was wondering if it would not be simpler/safer to design a generalPurpose fast/robust LinkedList too? Stef -- Lukas Renggli www.lukas-renggli.ch -- Lukas Renggli www.lukas-renggli.ch
Re: [Pharo-project] About linkedlist
On 3 February 2012 13:33, Stéphane Ducasse stephane.duca...@inria.fr wrote: On Feb 3, 2012, at 12:20 PM, Nicolas Cellier wrote: 2012/2/3 Lukas Renggli reng...@gmail.com: This is all very much at an experimental phase. what is the diff between arrayList and vectorList ? ArrayList is double ended (very much like OrderedCollection), VectorList is single ended. I was just experimenting to see if it makes a difference. ok. I like to see such kind of experiment. Was the name based on something that people use? Do you have some kind of high level description of Container explaining the rationale, ideas and approach. The core idea is to split iteration from containment; I think a horrible design mistake in the Smalltalk-80 collections. I thought about that when I read the Iterator. I would not say that this is horrible. I think that for default case this is not that bad. Now when you want to navigate a large graph and control the navigation then you should have an iterator. I'm curious to see what you will get. Do you have some kind of high level description of Container explaining the rationale, ideas and approach. It is one thing to browse the code, but the why/how is important as a guide. Another idea is to split the backing store from the container and allow the composition of collection properties: - uniqueness: yes, no - order: natural, custom, indexed - access: indexed, keyed - structure: hashed, treed, linked - weakness: yes, no (not even touched yet) Yes I as to see that because I often want uniqueOrdered (not clear what you do when you add twice the same) It seems that although you introduce iterators, they are less annoying then expected, which is great. The goal here is to be as lazy as possible and avoid unnecessary loops over the collections. Also iterators avoid the explosion of selector combinations like (#do: and #withIndexDo: and #withIndexCollect:, etc) by utilizing objects. 
Also, a certain level of backwards compatibility seems to be present, looking at message names. I like the existing names, but then this is not a goal. Also I am not sure if methods like #add: and #at:put: should really return the argument. :) If I would know :) it is somewhat convenient, because you can do things like: newItem := coll at: index ifAbsentPut: [ MyItem new ]. and other such forms... Stef -- Best regards, Igor Stasenko.
Re: [Pharo-project] About linkedlist
what is the diff between arrayList and vectorList ? ArrayList is double ended (very much like OrderedCollection), VectorList is single ended. I was just experimenting to see if it makes a difference. ok. I like to see such kind of experiment. Was the name based on something that people use?

No, it is a randomly chosen and meaningless name. I guess in the end there will only be one, named ArrayList, but I am not sure which implementation is better.

Just a note to the VM developers: the primitive Array#replaceFrom:to:with:startingAt: is absolutely central to the implementation of efficient collections. While it works well if you copy from one array to another, it is broken if the source and destination array are the same and the areas overlap. Would it be possible to fix this? Even memmove(void*, void*, size_t) in C can handle this situation correctly.

Do you have some kind of high level description of Container explaining the rationale, ideas and approach. The core idea is to split iteration from containment; I think a horrible design mistake in the Smalltalk-80 collections. I thought about that when I read the Iterator. I would not say that this is horrible. I think that for default case this is not that bad. Now when you want to navigate a large graph and control the navigation then you should have an iterator.

I think it is pretty horrible. I can think of at least 4 meaningful ways to iterate over a simple array (for other collections there are even more ways):

1. forward: #do:, #collect:, #select:, #detect:, #inject:into:, ...
2. forward with index: #keysAndValuesDo:, #collectWithIndex:; everything else is not possible
3. backward: #reverseDo:; everything else is not possible
4. backward with index: not possible

If you split the iterator into a separate object, suddenly all of this becomes trivial. A #collect: remains a #collect:, no matter whether you are going forward or backward, with or without index.

Lukas

-- Lukas Renggli www.lukas-renggli.ch
Re: [Pharo-project] About linkedlist
Lukas Renggli wrote: what is the diff between arrayList and vectorList ? ArrayList is double ended (very much like OrderedCollection), VectorList is single ended. I was just experimenting to see if it makes a difference. ok. I like to see such kind of experiment. Was the name based on something that people use? No, it is a randomly chosen and meaningless name. I guess in the end there will only be one named ArrayList, but I am not sure which implementation is better.

I wasn't sure what double-ended versus single-ended meant, and found some answer at the obvious place [1], which mentions Ada.Containers.Vectors is a dynamic array implementation, which seems consistent with C++ [2]. While looking around I happened to bump into [3], which was too much for me but may be of interest to anyone playing with data structures. Skimming it shows it discussing the efficiency of data structures in functional languages, together with lazy variants of imperative-language data structures. Would such apply to Smalltalk, or is the object-oriented style of Smalltalk considered imperative rather than functional?

[1] http://en.wikipedia.org/wiki/Double-ended_queue
[2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29
[3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
Re: [Pharo-project] About linkedlist
On 3 February 2012 15:44, Ben Coman b...@openinworld.com wrote: Lukas Renggli wrote: what is the diff between arrayList and vectorList ? ArrayList is double ended (very much like OrderedCollection), VectorList is single ended. I was just experimenting to see if it makes a difference. ok. I like to see such kind of experiment. Was the name based on something that people use? No, it is a randomly chosen and meaningless name. I guess in the end there will only be one named ArrayList, but I am not sure which implementation is better. I wasn't sure what double-ended versus single-ended meant, and found some answer at the obvious place [1], which mentions Ada.Containers.Vectors is a dynamic array implementation, which seems consistent with C++ [2]. While looking around I happened to bump into [3], which was too much for me but may be of interest to anyone playing with data structures. Skimming this shows it discussing the efficiency of data structures in functional languages together with lazy variants of imperative-language data structures. Would such apply to Smalltalk, or is the object-oriented style of Smalltalk considered imperative rather than functional?

The standard tool set - OrderedCollection, Array, etc. - is not functional at all - (myArray at: 1 put: 2) mutates myArray - so I'd put them firmly on the imperative side of the fence. There's nothing stopping one from writing functional data structures - Levente Uzonyi has written some persistent data structures, as have I. (Persistent means you get new versions of a data structure; if you hang onto those old versions you can roll back your changes.) http://www.squeaksource.com/Nutcracker/ is my own play area for these sorts of structures - PersistentUnionFind looks like a functional data structure while internally making massive use of side effects. http://www.lshift.net/blog/2011/12/31/translating-a-persistent-union-find-from-ml-to-smalltalk has references to the original papers I used for my translation.
Okasaki's book is really good reading, if a bit advanced. It also has a bunch of stuff beyond just showing functional data structures. (Note that the red-black tree implementation is _incomplete_ - he leaves deleting nodes as an exercise for the reader. (See http://matt.might.net/articles/red-black-delete/)) frank [1] http://en.wikipedia.org/wiki/Double-ended_queue [2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29 [3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
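Frank's notion of persistence - every edit returns a new version, and old versions stay usable so you can roll back - can be sketched in a few lines. The following is an illustrative Python sketch, not the actual PersistentCollection or PersistentUnionFind API; the class and method names are invented for the example, and Python indexes from 0 where Smalltalk indexes from 1:

```python
# A minimal fully persistent array via copy-on-write: at_put never
# mutates the receiver, it returns a NEW version. You can edit any
# older version (full persistence), not just the newest one.
class PersistentArray:
    def __init__(self, items):
        self._items = tuple(items)   # immutable backing store

    def at(self, index):
        return self._items[index]

    def at_put(self, index, value):
        # Copy, modify the copy, wrap it as a new version.
        items = list(self._items)
        items[index] = value
        return PersistentArray(items)

    def to_list(self):
        return list(self._items)

p = PersistentArray([1, 2, 3])
p1 = p.at_put(0, 4)    # new version derived from p
p2 = p1.at_put(1, 5)   # new version derived from p1
p3 = p.at_put(2, 0)    # edit the OLD version p: full persistence
print(p.to_list(), p1.to_list(), p2.to_list(), p3.to_list())
# -> [1, 2, 3] [4, 2, 3] [4, 5, 3] [1, 2, 0]
```

Copy-on-write is the bluntest way to get persistence; real implementations (like the union-find in the blog post above) share structure between versions instead of copying everything.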
Re: [Pharo-project] About linkedlist
2012/2/3 Lukas Renggli reng...@gmail.com: what is the diff between arrayList and vectorList ? ArrayList is double ended (very much like OrderedCollection), VectorList is single ended. I was just experimenting to see if it makes a difference. ok. I like to see such kind of experiment. Was the name based on something that people use? No, it is a randomly chosen and meaningless name. I guess in the end there will only be one named ArrayList, but I am not sure which implementation is better. Just a note to the VM developers: the primitive Array#replaceFrom:to:with:startingAt: is absolutely central to the implementation of efficient collections. While it works well if you copy from one array to another, it is broken if the source and destination arrays are the same and the areas overlap. Would it be possible to fix this? Even memmove(void*, void*, size_t) in C handles this situation correctly.

+1, I already asked for this one; it would also be useful for Xtreams, and would remove the need for some preconditions: http://permalink.gmane.org/gmane.comp.lang.smalltalk.squeak.general/152057. Nicolas
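The failure mode Lukas describes can be illustrated outside Smalltalk. The sketch below (Python, purely illustrative; both function names are invented) shows why a naive forward copy breaks on overlap while a memmove-style copy, which snapshots the source first, does not:

```python
# A naive forward element-by-element copy clobbers source elements
# before they are read when source and destination overlap.
def replace_naive(arr, start, stop, src, src_start):
    for i in range(start, stop):
        arr[i] = src[src_start + i - start]

# A memmove-style replacement: snapshot the source range first, so
# overlapping regions behave as if copied from a separate array.
def replace_safe(arr, start, stop, src, src_start):
    snapshot = src[src_start:src_start + (stop - start)]
    arr[start:stop] = snapshot

a = [1, 2, 3, 4, 5]
replace_naive(a, 1, 5, a, 0)   # shift right by one, overlapping
print(a)  # -> [1, 1, 1, 1, 1]: the first element smeared everywhere

b = [1, 2, 3, 4, 5]
replace_safe(b, 1, 5, b, 0)
print(b)  # -> [1, 1, 2, 3, 4]: the intended shift
```

This is exactly the guarantee C's memmove gives for overlapping regions (and memcpy explicitly does not).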
Re: [Pharo-project] About linkedlist
On Fri, Feb 3, 2012 at 6:18 AM, Lukas Renggli reng...@gmail.com wrote: what is the diff between arrayList and vectorList ? ArrayList is double ended (very much like OrderedCollection), VectorList is single ended. I was just experimenting to see if it makes a difference. ok. I like to see such kind of experiment. Was the name based on something that people use? No, it is a randomly chosen and meaningless name. I guess in the end there will only be one named ArrayList, but I am not sure which implementation is better. Just a note to the VM developers: the primitive Array#replaceFrom:to:with:startingAt: is absolutely central to the implementation of efficient collections. While it works well if you copy from one array to another, it is broken if the source and destination arrays are the same and the areas overlap. Would it be possible to fix this? Even memmove(void*, void*, size_t) in C handles this situation correctly.

The problems are that the primitive is defined to be destructive, and code could be depending on the destructive update behaviour. Would it be feasible to use a new primitive that did the right thing in the case of overlap, or do you definitely want to change primitive 105?

Do you have some kind of high-level description of Container explaining the rationale, ideas and approach? The core idea is to split iteration from containment; not doing so was, I think, a horrible design mistake in the Smalltalk-80 collections. I thought about that when I read the Iterator. I would not say that this is horrible. I think that for the default case it is not that bad. Now when you want to navigate a large graph and control the navigation, then you should have an iterator. I think it is pretty horrible. I can think of at least 4 meaningful ways to iterate over a simple array (for other collections there are even more ways): 1. forward: #do:, #collect:, #select:, #detect:, #inject:into:, ... 2. forward with index: #keysAndValuesDo:, #collectWithIndex:; everything else is not possible 3. backward: #reverseDo:; everything else is not possible 4. backward with index: not possible. If you split the iterator into a separate object, suddenly all of this becomes trivial. A #collect: remains a #collect: no matter if you are going forward or backward, or with or without index. Lukas -- Lukas Renggli www.lukas-renggli.ch -- best, Eliot
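Lukas's point about separating the iterator from the container can be sketched concretely. The following Python sketch (not the Container API; ArrayIterator and its method names are invented for the example) shows how one traversal object makes all four combinations - forward/backward, with/without index - fall out of a single #collect:-style operation:

```python
# An external iterator: the traversal order lives in the iterator,
# so #do:, #keysAndValuesDo: and #collect: each exist only once and
# work in either direction.
class ArrayIterator:
    def __init__(self, items, reverse=False):
        self._items = items
        self._reverse = reverse

    def _indices(self):
        n = len(self._items)
        return range(n - 1, -1, -1) if self._reverse else range(n)

    def do(self, fn):                      # ~ #do: / #reverseDo:
        for i in self._indices():
            fn(self._items[i])

    def keys_and_values_do(self, fn):      # ~ #keysAndValuesDo:
        for i in self._indices():
            fn(i, self._items[i])

    def collect(self, fn):                 # ~ #collect:
        return [fn(self._items[i]) for i in self._indices()]

items = ['a', 'b', 'c']
fwd = ArrayIterator(items)
bwd = ArrayIterator(items, reverse=True)
print(fwd.collect(str.upper))  # -> ['A', 'B', 'C']
print(bwd.collect(str.upper))  # -> ['C', 'B', 'A']
pairs = []
bwd.keys_and_values_do(lambda i, v: pairs.append((i, v)))
print(pairs)  # -> [(2, 'c'), (1, 'b'), (0, 'a')]: "backward with index"
```

The "backward with index" case, impossible with the Smalltalk-80 protocol Lukas lists, needs no extra method here: it is just the backward iterator composed with keys_and_values_do.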
Re: [Pharo-project] About linkedlist
I wasn't sure what double-ended versus single-ended meant, and found some answer at the obvious place [1], which mentions Ada.Containers.Vectors is a dynamic array implementation, which seems consistent with C++ [2]. While looking around I happened to bump into [3], which was too much for me but may be of interest to anyone playing with data structures. Skimming this shows it discussing the efficiency of data structures in functional languages together with lazy variants of imperative-language data structures. Would such apply to Smalltalk, or is the object-oriented style of Smalltalk considered imperative rather than functional? The standard tool set - OrderedCollection, Array, etc. - is not functional at all - (myArray at: 1 put: 2) mutates myArray - so I'd put them firmly on the imperative side of the fence. There's nothing stopping one from writing functional data structures - Levente Uzonyi has written some persistent data structures, as have I. (Persistent means you get new versions of a data structure; if you hang onto those old versions you can roll back your changes.)

Can you explain "persistent" a bit more? Tx

http://www.squeaksource.com/Nutcracker/ is my own play area for these sorts of structures - PersistentUnionFind looks like a functional data structure while internally making massive use of side effects. http://www.lshift.net/blog/2011/12/31/translating-a-persistent-union-find-from-ml-to-smalltalk has references to the original papers I used for my translation. Okasaki's book is really good reading, if a bit advanced. It also has a bunch of stuff beyond just showing functional data structures. (Note that the red-black tree implementation is _incomplete_ - he leaves deleting nodes as an exercise for the reader. See http://matt.might.net/articles/red-black-delete/) frank [1] http://en.wikipedia.org/wiki/Double-ended_queue [2] http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29 [3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
Re: [Pharo-project] About linkedlist
Lukas, maybe you should send that to the VM mailing list. If needed we can ask Igor :) (even if he is busy right now). Stef

On Feb 3, 2012, at 5:35 PM, Nicolas Cellier wrote: Just a note to the VM developers: the primitive Array#replaceFrom:to:with:startingAt: is absolutely central to the implementation of efficient collections. While it works well if you copy from one array to another, it is broken if the source and destination arrays are the same and the areas overlap. Would it be possible to fix this? Even memmove(void*, void*, size_t) in C handles this situation correctly.
Re: [Pharo-project] About linkedlist
On 3 February 2012 21:42, Stéphane Ducasse stephane.duca...@inria.fr wrote: Lukas, maybe you should send that to the VM mailing list. If needed we can ask Igor :) (even if he is busy right now).

Implementing it won't be a big deal. I'm just trying to understand what destructive vs non-destructive behavior means for replaceFrom:to:with:startingAt:. Is it right, in my understanding, that you always want to meet the following criterion: receiver replaceFrom: a to: b with: replacement startingAt: c. a to: b do: [:i | self assert: (receiver at: i) == (replacement at: c + i - 1) ] ? (irrespective, of course, of whether receiver == replacement or not)

Stef On Feb 3, 2012, at 5:35 PM, Nicolas Cellier wrote: Just a note to the VM developers: the primitive Array#replaceFrom:to:with:startingAt: is absolutely central to the implementation of efficient collections. While it works well if you copy from one array to another, it is broken if the source and destination arrays are the same and the areas overlap. Would it be possible to fix this? Even memmove(void*, void*, size_t) in C handles this situation correctly. -- Best regards, Igor Stasenko.
Re: [Pharo-project] About linkedlist
Oh, wait... the test should be like the following: rCopy := replacement copy. receiver replaceFrom: a to: b with: replacement startingAt: c. a to: b do: [:i | self assert: (receiver at: i) == (rCopy at: c + i - 1) ] -- Best regards, Igor Stasenko.
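Igor's corrected criterion - compare against a snapshot of the replacement taken before the call, so the check is well-defined even when receiver and replacement are the same array - can be restated as a runnable sketch. This is illustrative Python (0-indexed, and replace_overlap_safe is an invented name), not the primitive itself:

```python
# An overlap-safe replacement that satisfies Igor's snapshot test:
# after the call, receiver[i] equals what the replacement held at the
# corresponding position BEFORE the call.
def replace_overlap_safe(receiver, start, stop, replacement, src_start):
    # Snapshot first (memmove-style), so self-replacement is harmless.
    snapshot = replacement[src_start:src_start + (stop - start)]
    receiver[start:stop] = snapshot

arr = [10, 20, 30, 40]
r_copy = list(arr)                       # rCopy := replacement copy
replace_overlap_safe(arr, 1, 4, arr, 0)  # overlapping self-replacement
for i in range(1, 4):
    # (receiver at: i) = (rCopy at: c + i - 1), shifted to 0-indexing
    assert arr[i] == r_copy[i - 1]
print(arr)  # -> [10, 10, 20, 30]
```

A destructive forward copy would fail this assertion for the overlapping case, which is precisely the bug Lukas reported.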
Re: [Pharo-project] About linkedlist
Btw, I doubt that there's code relying on the destructive behavior of that prim, because using such destructive behavior for one's own benefit requires a bit of thought effort :) If you need such an erasing replacement (can't even find a proper term for it), it is more natural for the developer to write it by hand instead of using a primitive with elusive (and perhaps unintended) behavior. On the other hand, making a new prim is better, because then the image side can do workarounds if it is not there. And I think we don't even need to implement that prim twice: what we can do is fix the implementation of the old one and then put it in an additional slot of the primitiveTable. -- Best regards, Igor Stasenko.
[Pharo-project] About linkedlist
Hi, I was thinking that LinkedList is really not meant to be used by anything other than the scheduler (remember the problem we got when changing == into something else). So I was wondering if it would not be simpler/safer to also design a general-purpose, fast, robust LinkedList? Stef