[racket-users] Re: Persistent vectors: a generic collections proof-of-concept

2015-05-29 Thread Alexey Cherkaev
It looks great! Thanks for putting time and effort into this.

I think Racket in general will benefit from better support of immutable 
collections: it has already made the departure from Scheme by making cons-cells 
immutable. So, this is next logical step.

Maybe this belongs a bit more to your previous thread, but it can also relate 
here:
I see you are making generic sequence interface to collections. Clojure from 
1.7 seems to move more towards reducebles rather than sequable. I have 
played with this idea for Racket a bit and it seems to work quite well (I 
called them streams but in effect they are reducebles):

https://mobiusengineering.wordpress.com/2015/01/11/stateful-streams-in-scheme/



On Thursday, May 28, 2015 at 9:59:52 AM UTC+2, Alexis King wrote:
 As a followup to my last thread regarding my generic collections library, I 
 have now created a package that uses it to define an entirely new data 
 structure. I've implemented Clojure's 32-way bitmapped tries to create a 
 persistent vector implementation. If you're interested in trying it out, it's 
 available as a package under the name ‘alexis-pvector’.
 
 Since my collections library only supports immutable collections, Racket's 
 built-in immutable vectors aren't very well-suited to my model. My 
 implementation of `conj` for Racket vectors is not exactly efficient:
 
 (define (conj vec item)
   (vector-immutable-vector
(vector-append vec (vector item
 
 With persistent vectors, the majority of conj operations are truly O(1), and 
 a small portion (about 3%) of them are O(log_32 n), so the performance 
 overhead should be negligible.
 
 Some initial profiling has proved promising:
 
 (time (void (extend #()   (in-range 10
 (time (void (extend (pvector) (in-range 10
 
 ;; output:
 ;; cpu time: 41116 real time: 41336 gc time: 6176
 ;; cpu time: 124 real time: 124 gc time: 3
 
 (define (random-set seq)
   (define l (length seq))
   (for/fold ([seq seq])
 ([i (in-range 10)])
 (set-nth seq (random l) (random
 
 (time (void (random-set (vector-immutable-vector (make-vector 1)
 (time (void (random-set (extend (pvector) (take 1 (in-naturals))
 
 ;; output:
 ;; cpu time: 3674 real time: 3677 gc time: 552
 ;; cpu time: 194 real time: 193 gc time: 8
 
 Admittedly, these tests (especially the first one) are fairly biased towards 
 persistent vectors, but even so, they cover two of the most common vector 
 operations, so I think they're still relevant.
 
 Some rudimentary documentation is available, but the vast majority of the 
 vectors' interface is through the methods provided by alexis/collection. Try 
 them out! Let me know if you find any bugs or if you have any performance 
 considerations.
 
 As a final note, there are still some improvements that definitely need to be 
 made. Currently vectors print as #pvector, which is not very helpful, so 
 implementing a custom write procedure is high on my todo list. Also, this is 
 not an implementation of RRB vectors, which are an improved version of this 
 implementation but are more complex, so I have yet to look into how I could 
 adapt this to use that algorithm.
 
 Let me know what you think!
 
 Alexis

-- 
You received this message because you are subscribed to the Google Groups 
Racket Users group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Persistent vectors: a generic collections proof-of-concept

2015-05-29 Thread Alexey Cherkaev
My lazy-sequence implementation was exactly following SICP: while
theoretically pure and good, might not be the best in practice, so I
believe your implementation can be much better.

Major performance boost in stateful implementation compared to
lazy-sequence with delay and force comes from removing essentially
cancelling each other delays and forces. Haskell has stream-fusion that
implements this idea differently and with the help from compiler, but
creates absolutely seamless interface to Haskell-lists — they’ve done very
clever work.

The disadvantage of stateful streams is their complete opaqueness, as at
every stage all they are is just a promise to do the computation.
​


Best regards,
Alexey

On 29 May 2015 at 12:14, Alexis King lexi.lam...@gmail.com wrote:

  Hi, the full code is attached (I hope Google Groups will preserve it...).

 Thank you for this! There is absolutely a performance gap, and I'll
 definitely look over it and see if I can figure out exactly why (I think a
 well-built sequence-based model should have comparable speed). I did
 implement an equivalent test to the one in your blog post, which I've
 included at the end of this email. While the stateful implementation does
 indeed perform noticeably better, the results I got with my lazy
 implementation are not nearly as dire as you seem to have described.

 Specifically, the version operating on 100,000 elements and 1,000,000
 elements yielded the following times, respectively:

 ; cpu time: 774 real time: 774 gc time: 195
 ; cpu time: 7029 real time: 7045 gc time: 1814

 I did not encounter any out-of-memory problems in either test.

 The test I used is as follows:

 #lang racket

 (require alexis/collection)

 (define (sum seq)
   (foldl + 0 seq))

 (define (test-sequence n foo)
   (compose
sum
(curry take n)
(curry map foo)
(curry filter odd?)))

 (time
  ((test-sequence
100
(lambda (x)
  (* x
 (sin (* pi (/ x 12)))
 (+ 1 (exp (- (/ x 100)))
   (in-naturals)))



-- 
You received this message because you are subscribed to the Google Groups 
Racket Users group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Persistent vectors: a generic collections proof-of-concept

2015-05-29 Thread Alexey Cherkaev
Hi, the full code is attached (I hope Google Groups will preserve it...).


Best regards,
Alexey

On 29 May 2015 at 11:51, Alexis King lexi.lam...@gmail.com wrote:

  Maybe this belongs a bit more to your previous thread, but it can also
 relate here:
  I see you are making generic sequence interface to collections. Clojure
 from 1.7 seems to move more towards reducebles rather than sequable. I
 have played with this idea for Racket a bit and it seems to work quite well
 (I called them streams but in effect they are reducebles):
 
 
 https://mobiusengineering.wordpress.com/2015/01/11/stateful-streams-in-scheme/

 It is my understanding that Clojure's new reducebles interface is not
 intended to replace the existing sequences interface but rather provides an
 alternative, specifically intended to enable parallel folds, trading
 laziness for parallelization. Arbitrary parallel computation is a bit more
 difficult on the Racket VM than on the JVM currently, so I'm not sure if
 that sort of thing would be practical in Racket, but either way, I don't
 believe this is intended to be a general-purpose solution at all.

 Anyway, I'll take a look at them in more detail to see precisely what
 their purpose is. I took a look at the blog post you linked, and I admit
 I'm skeptical that such a model is actually any improvement. Composable
 operations can be a derived concept even with the lazy sequence model, akin
 to Clojure's transducers. I'd also posit that having a simple base model
 for sequences is reasonable rather than imposing a much more complex model
 with possible performance benefits.

 On the other hand, I'd be quite interested in trying your implementation
 to test the performance differences for myself. Do you have that code in a
 single, self-contained snippet I could take a look at? The blog post itself
 seems to be missing a few bits and pieces, notably iterate-list and
 stream-of-integers.

-- 
You received this message because you are subscribed to the Google Groups 
Racket Users group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


new-stream.rkt
Description: Binary data


Re: [racket-users] Re: Persistent vectors: a generic collections proof-of-concept

2015-05-29 Thread Alexis King
 Hi, the full code is attached (I hope Google Groups will preserve it...).

Thank you for this! There is absolutely a performance gap, and I'll definitely 
look over it and see if I can figure out exactly why (I think a well-built 
sequence-based model should have comparable speed). I did implement an 
equivalent test to the one in your blog post, which I've included at the end of 
this email. While the stateful implementation does indeed perform noticeably 
better, the results I got with my lazy implementation are not nearly as dire as 
you seem to have described.

Specifically, the version operating on 100,000 elements and 1,000,000 elements 
yielded the following times, respectively:

; cpu time: 774 real time: 774 gc time: 195
; cpu time: 7029 real time: 7045 gc time: 1814

I did not encounter any out-of-memory problems in either test.

The test I used is as follows:

#lang racket

(require alexis/collection)

(define (sum seq)
  (foldl + 0 seq))

(define (test-sequence n foo)
  (compose
   sum
   (curry take n)
   (curry map foo)
   (curry filter odd?)))

(time
 ((test-sequence
   100
   (lambda (x)
 (* x
(sin (* pi (/ x 12)))
(+ 1 (exp (- (/ x 100)))
  (in-naturals)))

-- 
You received this message because you are subscribed to the Google Groups 
Racket Users group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: Persistent vectors: a generic collections proof-of-concept

2015-05-29 Thread Alexis King
 Maybe this belongs a bit more to your previous thread, but it can also relate 
 here:
 I see you are making generic sequence interface to collections. Clojure from 
 1.7 seems to move more towards reducebles rather than sequable. I have 
 played with this idea for Racket a bit and it seems to work quite well (I 
 called them streams but in effect they are reducebles):
 
 https://mobiusengineering.wordpress.com/2015/01/11/stateful-streams-in-scheme/

It is my understanding that Clojure's new reducebles interface is not 
intended to replace the existing sequences interface but rather provides an 
alternative, specifically intended to enable parallel folds, trading laziness 
for parallelization. Arbitrary parallel computation is a bit more difficult on 
the Racket VM than on the JVM currently, so I'm not sure if that sort of thing 
would be practical in Racket, but either way, I don't believe this is intended 
to be a general-purpose solution at all.

Anyway, I'll take a look at them in more detail to see precisely what their 
purpose is. I took a look at the blog post you linked, and I admit I'm 
skeptical that such a model is actually any improvement. Composable operations 
can be a derived concept even with the lazy sequence model, akin to Clojure's 
transducers. I'd also posit that having a simple base model for sequences is 
reasonable rather than imposing a much more complex model with possible 
performance benefits.

On the other hand, I'd be quite interested in trying your implementation to 
test the performance differences for myself. Do you have that code in a single, 
self-contained snippet I could take a look at? The blog post itself seems to be 
missing a few bits and pieces, notably iterate-list and stream-of-integers.

-- 
You received this message because you are subscribed to the Google Groups 
Racket Users group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.