Re: Clojure vectors

2009-07-13 Thread Kevin Downey

the sequence functions operate on sequences. if you pass in something
that is not a sequence, like a vector, they call seq on it internally.
so what you get back from filter or map is a sequence. conj has
consistent behavior across types, you just get a different type out of
map/filter/etc then what goes in.

On Mon, Jul 13, 2009 at 8:15 AM, Jan Rychter wrote:
>
> I've been trying to implement stacks using vectors in Clojure. In case
> you wonder why, it's because I need stacks with a fast item count
> operation. While experimenting I discovered some vector properties which
> were unexpected.
>
> conj adds at the end of the vector as expected:
>
> user> (conj [1 2 3] 4)
> [1 2 3 4]
>
> However, after reading "Programming Clojure" (page 114) I expected to be
> able to use sequence functions, such as drop-last, without my vectors
> changing representation. It seems this is not the case, as after calling
> drop-last (or indeed any of the seq functions) conj no longer adds at
> the end:
>
> user> (conj (drop-last 1 [1 2 3]) 4)
> (4 1 2)
>
> I don't know if it also has performance implications (does it turn my
> vector into a list?). So it turns out that pretty much the only
> operations I can use on my vectors are subvec and pop:
>
> user> (conj (subvec [1 2 3] 0 2) 4)
> [1 2 4]
> user> (conj (pop [1 2 3]) 4)
> [1 2 4]
> user>
>
> I didn't expect this to happen. Most tutorials (and Stuart Halloway's
> book) emphasize the generality of the sequence functions, only
> mentioning that subvec is faster for vectors. I think the fact that they
> also change subsequent behavior of conj is fundamental and should be
> printed in bold type.
>
> Am I missing something here?
>
> And while we're on the subject -- any hints for implementing a stack
> with a fast item count? (apart from a list with a separate counter)
>
> --J.
>
> >
>



-- 
And what is good, Phaedrus,
And what is not good—
Need we ask anyone to tell us these things?

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-13 Thread Mark Engelberg

On Mon, Jul 13, 2009 at 8:15 AM, Jan Rychter wrote:
> And while we're on the subject -- any hints for implementing a stack
> with a fast item count? (apart from a list with a separate counter)

Using conj and pop on vectors for stack operations should work just
fine.  Don't use subvec though; nested subvecs will really start to
slow things down.

Under some circumstances, lists cache their count, and therefore a
list implementation may be just as fast, but I can't remember the
exceptions off-hand, so a vector offers more of a promise of the
behavior you want.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-13 Thread Chouser

On Mon, Jul 13, 2009 at 11:15 AM, Jan Rychter wrote:
>
> However, after reading "Programming Clojure" (page 114) I expected to be
> able to use sequence functions, such as drop-last, without my vectors
> changing representation. It seems this is not the case, as after calling
> drop-last (or indeed any of the seq functions) conj no longer adds at
> the end:
>
> user> (conj (drop-last 1 [1 2 3]) 4)
> (4 1 2)

That's correct -- seq functions return seqs, not vectors.

> I don't know if it also has performance implications (does it turn my
> vector into a list?).

drop-last returns a lazy seq, which is why conj'ing onto it
adds items to the left.  Also, lazy-seqs don't know their
size -- counting is a linear time operation.

user=> (class (drop-last 1 [1 2 3]))
clojure.lang.LazySeq
user=> (counted? (drop-last 1 [1 2 3]))
false

> So it turns out that pretty much the only
> operations I can use on my vectors are subvec and pop:
>
> user> (conj (subvec [1 2 3] 0 2) 4)
> [1 2 4]
> user> (conj (pop [1 2 3]) 4)
> [1 2 4]
> user>

As listed here, you can also use 'assoc' and 'replace':
http://clojure.org/data_structures#Vectors

> And while we're on the subject -- any hints for implementing a stack
> with a fast item count? (apart from a list with a separate counter)

Well, you already found 'conj' and 'pop' on vector, and the
link above also lists 'peek' -- is that insufficient?
Vectors can return their size in constant time:

user=> (peek [1 2 3])
3
user=> (counted? [1 2 3])
true

If you don't want the partial-copying that vectors do on
conj an pop, Clojure's normal list type PersistentList also
knows its size and supports 'conj', 'pop', and 'peek':

user=> (class '(3 2 1))
clojure.lang.PersistentList
user=> (conj '(3 2 1) 4)
(4 3 2 1)
user=> (peek '(3 2 1))
3
user=> (pop '(3 2 1))
(2 1)

--Chouser

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-13 Thread Rob



On Jul 13, 11:15 am, Jan Rychter  wrote:
> I've been trying to implement stacks using vectors in Clojure. In case
> you wonder why, it's because I need stacks with a fast item count
> operation.

If that's the only reason, you don't have to use vectors.  The
following page says that 'count' for lists is O(1)

http://clojure.org/data_structures

Rob


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-13 Thread stephaner

Hi

Instead of:
(conj (drop-last 1 [1 2 3]) 4)

You could use into []:

(conj (into [] (drop-last 1 [1 2 3])) 4)
[1 2 4]

Stephane _/)



On Jul 13, 11:15 am, Jan Rychter  wrote:
> I've been trying to implement stacks using vectors in Clojure. In case
> you wonder why, it's because I need stacks with a fast item count
> operation. While experimenting I discovered some vector properties which
> were unexpected.
>
> conj adds at the end of the vector as expected:
>
> user> (conj [1 2 3] 4)
> [1 2 3 4]
>
> However, after reading "Programming Clojure" (page 114) I expected to be
> able to use sequence functions, such as drop-last, without my vectors
> changing representation. It seems this is not the case, as after calling
> drop-last (or indeed any of the seq functions) conj no longer adds at
> the end:
>
> user> (conj (drop-last 1 [1 2 3]) 4)
> (4 1 2)
>
> I don't know if it also has performance implications (does it turn my
> vector into a list?). So it turns out that pretty much the only
> operations I can use on my vectors are subvec and pop:
>
> user> (conj (subvec [1 2 3] 0 2) 4)
> [1 2 4]
> user> (conj (pop [1 2 3]) 4)
> [1 2 4]
> user>
>
> I didn't expect this to happen. Most tutorials (and Stuart Halloway's
> book) emphasize the generality of the sequence functions, only
> mentioning that subvec is faster for vectors. I think the fact that they
> also change subsequent behavior of conj is fundamental and should be
> printed in bold type.
>
> Am I missing something here?
>
> And while we're on the subject -- any hints for implementing a stack
> with a fast item count? (apart from a list with a separate counter)
>
> --J.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-15 Thread Jan Rychter

Thanks to everyone who replied in this thread. I am impressed by the
signal-to-noise ration on this list and by the quality of replies. I
will try to reply to everyone in a single post, trying to summarize.

> On Jul 13, 11:15 am, Jan Rychter  wrote:
>> I've been trying to implement stacks using vectors in Clojure. In case
>> you wonder why, it's because I need stacks with a fast item count
>> operation.

Rob  writes:
> If that's the only reason, you don't have to use vectors.  The
> following page says that 'count' for lists is O(1)

Indeed -- I missed that, and my experiments seemed to indicate that
calling (count) really slows things down. It turns out it was because my
lists were turning into seqs behind my back.

Mark Engelberg  writes:
> Using conj and pop on vectors for stack operations should work just
> fine.  Don't use subvec though; nested subvecs will really start to
> slow things down.

I went ahead and implemented my stacks using both lists and vectors,
then did some careful benchmarking. It turns out there is no discernible
performance difference.

I am not sure what you mean about subvecs, though -- I currently use
them for dropn and rot operations, and I don't know how to avoid using
them:

(defmacro stack-dropn [stack n]
  `(let [stack# ~stack]
 (subvec stack# 0 (- (stack-count ~stack#) ~n

(defmacro stack-rot [stack]
  `(let [stack# ~stack]
 (if (> (stack-count stack#) 1)
   (apply conj (vector (peek stack#)) (subvec stack# 0 (dec (count 
stack#
   stack#)))

(incidentally, if anyone can suggest a better rot implementation, I
would be very grateful -- rot "rotates" the stack, putting the last
element of the vector first)

So, to summarize -- my real performance problem is that my lists were
becoming lazy seqs without me knowing. It turns out you REALLY REALLY
have to know the difference between a list and a seq.

And, since performance of lists and vectors in my case is identical, I
don't know which one to choose now for my stacks?

--J.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-15 Thread Mark Engelberg

On Wed, Jul 15, 2009 at 3:51 AM, Jan Rychter wrote:
> I am not sure what you mean about subvecs, though -- I currently use
> them for dropn and rot operations, and I don't know how to avoid using
> them:

The problem with subvec is that it doesn't really create a "new"
vector, it just adds a level of indirection off of the original
vector.  So for example, if you look up, say, index 2 in the subvec,
it knows to go look up index 5 in the original vec.  If you create a
subvec of a subvec of a subvec, etc. pretty soon every lookup results
in a chain of forty lookups until it gets to the original vector, and
your performance grinds to a halt.

>
> (defmacro stack-dropn [stack n]
>  `(let [stack# ~stack]
>     (subvec stack# 0 (- (stack-count ~stack#) ~n
>

You really don't want to use subvec here.  Just right a function that
pops the stack n times using pop.

> (defmacro stack-rot [stack]
>  `(let [stack# ~stack]
>     (if (> (stack-count stack#) 1)
>       (apply conj (vector (peek stack#)) (subvec stack# 0 (dec (count 
> stack#
>       stack#)))
>
> (incidentally, if anyone can suggest a better rot implementation, I
> would be very grateful -- rot "rotates" the stack, putting the last
> element of the vector first)

Subvec doesn't hurt you so much here, because you're immediately
turning around and using conj to rebuild a whole new vector, so you're
not going to run into a problem with nested subvecs.  Still, you're
probably slightly better off using pop, rather than the subvec.
Another variation (probably similar performance, but maybe easier to
read?) would be to replace the whole line with (vec (cons (peek
stack#) (pop stack#))).

It looks like stack-rot is going to be the bottleneck in your app
since it requires traversing the whole vector to build the new one,
but I think the list-based implementation would be a bit worse, so I
think your choice to use vectors here is sound.

For stack-dropn and stack-rot to both be fast, I think that you really
want to use a deque.  Unfortunately, Clojure doesn't have a built-in
deque, and I don't see one in contrib, but you could look at
clojure.lang.PersistentQueue and read Okasaki's Functional Data
Structures to draw some inspiration.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-15 Thread Mark Engelberg

On Wed, Jul 15, 2009 at 9:58 AM, Mark Engelberg wrote:
> The problem with subvec is that it doesn't really create a "new"
> vector, it just adds a level of indirection off of the original
> vector.  So for example, if you look up, say, index 2 in the subvec,
> it knows to go look up index 5 in the original vec.  If you create a
> subvec of a subvec of a subvec, etc. pretty soon every lookup results
> in a chain of forty lookups until it gets to the original vector, and
> your performance grinds to a halt.

I take this back.  I realized that I was repeating an assertion I have
heard on this list, without confirming it for myself.  Looking at the
code for subvector, it looks like it indeed does the intelligent thing
and a subvec of a subvec points directly at the new start and end
bounds for the original vector.  So there really shouldn't be a
performance penalty for nested subvecs.  So your use of subvec in
stack-dropn is fine.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-15 Thread Chouser

On Wed, Jul 15, 2009 at 6:00 PM, Mark Engelberg wrote:
>
> On Wed, Jul 15, 2009 at 9:58 AM, Mark Engelberg 
> wrote:
>> The problem with subvec is that it doesn't really create a "new"
>> vector, it just adds a level of indirection off of the original
>> vector.  So for example, if you look up, say, index 2 in the subvec,
>> it knows to go look up index 5 in the original vec.  If you create a
>> subvec of a subvec of a subvec, etc. pretty soon every lookup results
>> in a chain of forty lookups until it gets to the original vector, and
>> your performance grinds to a halt.
>
> I take this back.  I realized that I was repeating an assertion I have
> heard on this list, without confirming it for myself.  Looking at the
> code for subvector, it looks like it indeed does the intelligent thing
> and a subvec of a subvec points directly at the new start and end
> bounds for the original vector.

Your original statement was true prior to Feb 28.  Thanks for checking!

--Chouser

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-15 Thread John Harrop
On Wed, Jul 15, 2009 at 12:58 PM, Mark Engelberg
wrote:
>
> It looks like stack-rot is going to be the bottleneck in your app
> since it requires traversing the whole vector to build the new one,
> but I think the list-based implementation would be a bit worse, so I
> think your choice to use vectors here is sound.
>
> For stack-dropn and stack-rot to both be fast, I think that you really
> want to use a deque.  Unfortunately, Clojure doesn't have a built-in
> deque, and I don't see one in contrib, but you could look at
> clojure.lang.PersistentQueue and read Okasaki's Functional Data
> Structures to draw some inspiration.


What's needed here is a ring buffer. It shouldn't be hard to implement one
atop a pair of a clojure vec, a true-count, and a start-offset. Rotation is
then very cheap: replace v with (assoc v (- (+ start-offset true-count)
(count v)) (get v start-offset)) and start-offset with (let [s-o (inc
start-offset)] (if (= (count v) s-o) 0 s-o)). Dropn is similarly: true-count
becomes (- n true-count) and start-offset (mod (+ start-offset n) (count
v)). Indexing is cheap: (get v (mod (+ start-offset (mod index true-count))
(count v))) and peek is even cheaper: (get v start-offset). (Maybe add code
to handle empty, that is, (= 0 true-count)). The tricky thing is push. If (=
true-count (count v)) the vector needs to be copied to a larger vector. You
might want to double the size, leaving the unused elements nil: v becomes
(vec (concat (drop start-offset v) (take start-offset
v) [new-element] (repeat true-count nil))), true-count is inc'd as normal
for a push, and start-offset reset to zero. Doubling the size whenever the
vector must grow causes the cost of copying the vector to be asymptotically
constant per push, instead of linear in the average size of the vector at
the time of a push.

(Ironically, the vector probably has this behavior under the hood, like
java.util.ArrayList, but we need it again because  (vec (concat (drop
start-offset v) (take start-offset v) [new-element] (repeat true-count
nil))) has linear time-complexity in vector length.)

You'd actually want to use a structmap for the above if you want to have
multiple ring buffers, with struct keys :count, :offset, and :vector or
similarly. Depending on the application, you might want to make the values
for these keys be refs and have the ring buffer operations use dosync,
though you could instead use a single ref per instance of the structmap.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Clojure vectors

2009-07-16 Thread Daniel

On Thu, Jul 16, 2009 at 1:52 AM, John Harrop wrote:
> On Wed, Jul 15, 2009 at 12:58 PM, Mark Engelberg 
> wrote:
>>
>> It looks like stack-rot is going to be the bottleneck in your app
>> since it requires traversing the whole vector to build the new one,
>> but I think the list-based implementation would be a bit worse, so I
>> think your choice to use vectors here is sound.
>>
>> For stack-dropn and stack-rot to both be fast, I think that you really
>> want to use a deque.  Unfortunately, Clojure doesn't have a built-in
>> deque, and I don't see one in contrib, but you could look at
>> clojure.lang.PersistentQueue and read Okasaki's Functional Data
>> Structures to draw some inspiration.
>
> What's needed here is a ring buffer. It shouldn't be hard to implement one
> atop a pair of a clojure vec, a true-count, and a start-offset. Rotation is
> then very cheap: replace v with (assoc v (- (+ start-offset true-count)
> (count v)) (get v start-offset)) and start-offset with (let [s-o (inc
> start-offset)] (if (= (count v) s-o) 0 s-o)). Dropn is similarly: true-count
> becomes (- n true-count) and start-offset (mod (+ start-offset n) (count
> v)). Indexing is cheap: (get v (mod (+ start-offset (mod index true-count))
> (count v))) and peek is even cheaper: (get v start-offset). (Maybe add code
> to handle empty, that is, (= 0 true-count)). The tricky thing is push. If (=
> true-count (count v)) the vector needs to be copied to a larger vector. You
> might want to double the size, leaving the unused elements nil: v becomes
> (vec (concat (drop start-offset v) (take start-offset
> v) [new-element] (repeat true-count nil))), true-count is inc'd as normal
> for a push, and start-offset reset to zero. Doubling the size whenever the
> vector must grow causes the cost of copying the vector to be asymptotically
> constant per push, instead of linear in the average size of the vector at
> the time of a push.
> (Ironically, the vector probably has this behavior under the hood, like
> java.util.ArrayList, but we need it again because  (vec (concat (drop
> start-offset v) (take start-offset v) [new-element] (repeat true-count
> nil))) has linear time-complexity in vector length.)
> You'd actually want to use a structmap for the above if you want to have
> multiple ring buffers, with struct keys :count, :offset, and :vector or
> similarly. Depending on the application, you might want to make the values
> for these keys be refs and have the ring buffer operations use dosync,
> though you could instead use a single ref per instance of the structmap.
>

FWIW since Java 1.6 there's a java.util.Deque (implemented as
RingBuffer) interface with java.util.ArrayDeque implementation, and a
java.util.concurrent.BlockingDeque interface with
java.util.concurrent.LinkedBlockingDeque implementation.

With the Deque interface, you can rotate both directions pretty easily
and the rest of the stack like methods are pretty straightforward:

(import '(java.util ArrayDeque))

(let [deque (ArrayDeque. '(1 2 3 4))]
(println deque)
; rot forward
(.addLast deque (.removeFirst deque))
(println deque)
; rot backward
(.addFirst deque (.removeLast deque))
(println deque))

But then again, it's a Java collection and not a Clojure one.

Cheers,
Daniel

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---