[racket-users] Re: Persistent vectors: a generic collections proof-of-concept
It looks great! Thanks for putting time and effort into this. I think Racket in general will benefit from better support for immutable collections: it has already departed from Scheme by making cons cells immutable, so this is the next logical step.

Maybe this belongs a bit more in your previous thread, but it can also relate here: I see you are making a generic sequence interface for collections. Clojure, as of 1.7, seems to be moving more towards reducibles than seqables. I have played with this idea for Racket a bit and it seems to work quite well (I called them streams, but in effect they are reducibles): https://mobiusengineering.wordpress.com/2015/01/11/stateful-streams-in-scheme/

On Thursday, May 28, 2015 at 9:59:52 AM UTC+2, Alexis King wrote:

As a follow-up to my last thread regarding my generic collections library, I have now created a package that uses it to define an entirely new data structure. I've implemented Clojure's 32-way bitmapped tries to create a persistent vector implementation. If you're interested in trying it out, it's available as a package under the name ‘alexis-pvector’.

Since my collections library only supports immutable collections, Racket's built-in immutable vectors aren't very well suited to my model. My implementation of `conj` for Racket vectors is not exactly efficient:

  (define (conj vec item)
    (vector->immutable-vector (vector-append vec (vector item))))

With persistent vectors, the majority of conj operations are truly O(1), and a small portion (about 3%) of them are O(log_32 n), so the performance overhead should be negligible.
Some initial profiling has proved promising:

  (time (void (extend #() (in-range 10))))
  (time (void (extend (pvector) (in-range 10))))
  ;; output:
  ;; cpu time: 41116 real time: 41336 gc time: 6176
  ;; cpu time: 124 real time: 124 gc time: 3

  (define (random-set seq)
    (define l (length seq))
    (for/fold ([seq seq])
              ([i (in-range 10)])
      (set-nth seq (random l) (random))))

  (time (void (random-set (vector->immutable-vector (make-vector 1)))))
  (time (void (random-set (extend (pvector) (take 1 (in-naturals))))))
  ;; output:
  ;; cpu time: 3674 real time: 3677 gc time: 552
  ;; cpu time: 194 real time: 193 gc time: 8

Admittedly, these tests (especially the first one) are fairly biased towards persistent vectors, but even so, they cover two of the most common vector operations, so I think they're still relevant.

Some rudimentary documentation is available, but the vast majority of the vectors' interface comes from the methods provided by alexis/collection. Try them out! Let me know if you find any bugs or if you have any performance concerns.

As a final note, there are still some improvements that definitely need to be made. Currently the vectors print as #<pvector>, which is not very helpful, so implementing a custom write procedure is high on my to-do list. Also, this is not an implementation of RRB vectors, an improved but more complex version of this data structure, so I have yet to look into how I could adapt this code to use that algorithm.

Let me know what you think!
Alexis

--
You received this message because you are subscribed to the Google Groups Racket Users group. To unsubscribe from this group and stop receiving emails from it, send an email to racket-users+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
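The O(log_32 n) bound for operations like set-nth comes from path copying in the trie: an update copies only the nodes on the root-to-leaf path and shares every other subtree with the old version. Here is a toy sketch of that idea over plain nested immutable vectors; this is not the alexis-pvector code, and `tree-set`, `t0`, and `t1` are names invented for illustration:

```racket
;; Path copying over a tree of immutable vectors: only the nodes along
;; the root-to-leaf path are copied; all other subtrees are shared.
;; (Illustrative sketch only -- not the alexis-pvector implementation.)
(define (tree-set node path x)
  (if (null? path)
      x
      (let ([copy (vector-copy node)]) ; fresh copy of this one node only
        (vector-set! copy (car path)
                     (tree-set (vector-ref node (car path)) (cdr path) x))
        (vector->immutable-vector copy))))

(define t0 (vector-immutable (vector-immutable 'a 'b)
                             (vector-immutable 'c 'd)))
(define t1 (tree-set t0 '(1 0) 'C))  ; functionally update one leaf

(vector-ref (vector-ref t1 1) 0)          ; => 'C
(eq? (vector-ref t0 0) (vector-ref t1 0)) ; => #t, the #(a b) leaf is shared
```

In a real 32-way trie each node has 32 children, so updating a vector of n elements copies only about log_32 n nodes, while everything else is shared between versions.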
Re: [racket-users] Re: Persistent vectors: a generic collections proof-of-concept
My lazy-sequence implementation followed SICP exactly: while theoretically pure and elegant, it might not be the best in practice, so I believe your implementation can be much better. The major performance boost in the stateful implementation, compared to a lazy sequence built on delay and force, comes from removing delays and forces that essentially cancel each other out. Haskell's stream fusion implements the same idea differently, with help from the compiler, and creates an absolutely seamless interface to Haskell lists; they've done very clever work. The disadvantage of stateful streams is their complete opaqueness: at every stage, all they are is a promise to do the computation.

Best regards,
Alexey

On 29 May 2015 at 12:14, Alexis King lexi.lam...@gmail.com wrote:
> Thank you for this! There is absolutely a performance gap, and I'll
> definitely look it over and see if I can figure out exactly why. [...]
> The test I used is as follows: [...]
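For reference, the SICP-style delay/force shape Alexey mentions looks something like the following minimal sketch (my own toy code, not his implementation; `scons`, `snats`, and friends are invented names). Every cell of every intermediate stage allocates a promise that a later stage immediately forces, which is exactly the overhead a stateful or fused representation avoids:

```racket
;; Minimal SICP-style lazy stream: a pair of a value and a promise.
(define-syntax-rule (scons x s) (cons x (delay s)))
(define (scar s) (car s))
(define (scdr s) (force (cdr s)))   ; each step forces the next promise

(define (snats n) (scons n (snats (add1 n))))
(define (smap f s) (scons (f (scar s)) (smap f (scdr s))))
(define (sfilter p s)
  (if (p (scar s))
      (scons (scar s) (sfilter p (scdr s)))
      (sfilter p (scdr s))))
(define (stake n s)
  (if (zero? n) '() (cons (scar s) (stake (sub1 n) (scdr s)))))

(define (square n) (* n n))

(stake 3 (smap square (sfilter odd? (snats 1)))) ; => '(1 9 25)
```

Note how the `delay` in `smap` is forced almost immediately by the `scdr` in `stake`; chains of such delay/force pairs are the cancelling operations described above.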
Re: [racket-users] Re: Persistent vectors: a generic collections proof-of-concept
Hi, the full code is attached (I hope Google Groups will preserve it...).

Best regards,
Alexey

On 29 May 2015 at 11:51, Alexis King lexi.lam...@gmail.com wrote:
> [...] I'd be quite interested in trying your implementation to test the
> performance differences for myself. Do you have that code in a single,
> self-contained snippet I could take a look at? The blog post itself seems
> to be missing a few bits and pieces, notably iterate-list and
> stream-of-integers.

[attachment: new-stream.rkt]
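The attached new-stream.rkt is not reproduced in the thread itself, but the basic flavor of a stateful stream, a producer that is nothing more than a promise to do the computation, consumed by a fold, can be sketched as follows. This is hypothetical code of mine, not Alexey's; the names `ints-from` and `gfold` are invented:

```racket
;; A stateful producer: each call returns the next element.
(define (ints-from n)
  (lambda ()
    (begin0 n (set! n (add1 n)))))

;; Consume n elements from a producer with a left fold.
(define (gfold f init next n)
  (for/fold ([acc init]) ([_ (in-range n)])
    (f acc (next))))

(gfold + 0 (ints-from 1) 5) ; => 15
```

Unlike the delay/force version, no promises are allocated along the way; the trade-off, as noted above, is that the producer is opaque and single-use.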
Re: [racket-users] Re: Persistent vectors: a generic collections proof-of-concept
> Hi, the full code is attached (I hope Google Groups will preserve it...).

Thank you for this! There is absolutely a performance gap, and I'll definitely look it over and see if I can figure out exactly why (I think a well-built sequence-based model should have comparable speed). I did implement an equivalent test to the one in your blog post, which I've included at the end of this email. While the stateful implementation does indeed perform noticeably better, the results I got with my lazy implementation are not nearly as dire as you describe. Specifically, the versions operating on 100,000 elements and 1,000,000 elements yielded the following times, respectively:

  ; cpu time: 774 real time: 774 gc time: 195
  ; cpu time: 7029 real time: 7045 gc time: 1814

I did not encounter any out-of-memory problems in either test. The test I used is as follows:

  #lang racket
  (require alexis/collection)

  (define (sum seq)
    (foldl + 0 seq))

  (define (test-sequence n foo)
    (compose sum
             (curry take n)
             (curry map foo)
             (curry filter odd?)))

  (time ((test-sequence 100000
                        (lambda (x)
                          (* x
                             (sin (* pi (/ x 12)))
                             (+ 1 (exp (- (/ x 100)))))))
         (in-naturals)))
Re: [racket-users] Re: Persistent vectors: a generic collections proof-of-concept
> Maybe this belongs a bit more in your previous thread, but it can also
> relate here: I see you are making a generic sequence interface for
> collections. Clojure, as of 1.7, seems to be moving more towards
> reducibles than seqables. I have played with this idea for Racket a bit
> and it seems to work quite well (I called them streams, but in effect
> they are reducibles):
> https://mobiusengineering.wordpress.com/2015/01/11/stateful-streams-in-scheme/

It is my understanding that Clojure's new reducers interface is not intended to replace the existing sequence interface but rather provides an alternative, specifically intended to enable parallel folds, trading laziness for parallelization. Arbitrary parallel computation is currently a bit more difficult on the Racket VM than on the JVM, so I'm not sure that sort of thing would be practical in Racket, but either way, I don't believe it is intended as a general-purpose solution at all. Anyway, I'll take a look at them in more detail to see precisely what their purpose is.

I took a look at the blog post you linked, and I admit I'm skeptical that such a model is actually an improvement. Composable operations can be a derived concept even within the lazy sequence model, akin to Clojure's transducers. I'd also posit that having a simple base model for sequences is preferable to imposing a much more complex model with possible performance benefits.

On the other hand, I'd be quite interested in trying your implementation to test the performance differences for myself. Do you have that code in a single, self-contained snippet I could take a look at? The blog post itself seems to be missing a few bits and pieces, notably iterate-list and stream-of-integers.
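The "composable operations as a derived concept" idea Alexis mentions, akin to Clojure's transducers, can be sketched independently of any particular sequence representation. This is my own illustrative code, with invented names `mapping`, `filtering`, and `transduce`:

```racket
;; A transducer-style operation transforms one reducing function into
;; another; operations therefore compose with plain `compose`, and no
;; intermediate sequences are ever built.
(define ((mapping f) rf)
  (lambda (acc x) (rf acc (f x))))

(define ((filtering pred) rf)
  (lambda (acc x) (if (pred x) (rf acc x) acc)))

(define (transduce xform rf init xs)
  (define step (xform rf))
  (foldl (lambda (x acc) (step acc x)) init xs))

(define (square n) (* n n))

(transduce (compose (filtering odd?) (mapping square)) + 0 '(1 2 3 4 5))
;; => 35, i.e. (+ (square 1) (square 3) (square 5))
```

Because the composition happens on reducing functions rather than on sequences, this has the same fusion property that Alexey's stateful streams and Haskell's stream fusion aim for, while staying compatible with an eager or lazy base model.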