Re: Trouble using r/fold to optimise a reduce

2016-04-03 Thread Divyansh Prakash
Thanks a lot for taking the time out to explain this stuff in detail! 
Will go through your solution shortly.

On Sunday, April 3, 2016 at 12:02:48 PM UTC+5:30, Francis Avila wrote:
>
> I had some fun with playing around with faster solutions. 
> https://gist.github.com/favila/0573e3f644dea252bdaaed5be9d1519f
>
> The biggest speedup comes from avoiding set creation in expanded-range 
> (i.e., the function that produces the collection of affected coordinates) 
> and ensuring that the ops run on the accumulating set of on-lights using 
> transients.  clojure.set/* functions require both items be sets and does 
> not use transients internally, so it was much slower.
>
> Another big speedup comes from encoding the light coordinates more 
> efficiently. You can encode a light as a number (in my case, a long, with 
> high bits the x coordinate and low bits the y coordinate) instead of a 
> vector. This creates fewer objects which are easier to hash.
>
> Finally, I tried an approach which doesn't use sets, but instead naively 
> creates a 1000x1000 array of booleans and mutates it in place with every 
> op. This is the fastest approach: 4 seconds on a 2010-era i3! I'm sure a 
> proper matrix library (e.g. core.matrix) could do even better.
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trouble using r/fold to optimise a reduce

2016-04-02 Thread Francis Avila
I had some fun with playing around with faster 
solutions. https://gist.github.com/favila/0573e3f644dea252bdaaed5be9d1519f

The biggest speedup comes from avoiding set creation in expanded-range 
(i.e., the function that produces the collection of affected coordinates) 
and ensuring that the ops run on the accumulating set of on-lights using 
transients.  clojure.set/* functions require both items be sets and does 
not use transients internally, so it was much slower.

Another big speedup comes from encoding the light coordinates more 
efficiently. You can encode a light as a number (in my case, a long, with 
high bits the x coordinate and low bits the y coordinate) instead of a 
vector. This creates fewer objects which are easier to hash.

Finally, I tried an approach which doesn't use sets, but instead naively 
creates a 1000x1000 array of booleans and mutates it in place with every 
op. This is the fastest approach: 4 seconds on a 2010-era i3! I'm sure a 
proper matrix library (e.g. core.matrix) could do even better.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trouble using r/fold to optimise a reduce

2016-04-02 Thread Francis Avila


Even stranger - the parallel version seems to at least produce an output 
> (not sure how correct) if run for the first 50 commands instead of 300.
>

I forgot to mention: the reason why r/fold sometimes seems to still work is 
because there is only one chunk (i.e., no parallelism), so the combine 
function is never called.

This is also why r/fold worked when you used a lazy-sequence of commands 
instead of a vector--combine was never called because there was no 
parallelism.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trouble using r/fold to optimise a reduce

2016-04-02 Thread Francis Avila
You misunderstand how r/fold works.

My understanding is that reducers preserve the order of operation, so it 
> should give the correct answer.


This is not quite true. There are two fundamental principles behind 
reducers. The first is that the process of reduction, stripped of all 
non-essentials, is "apply this function individually to every item in a 
collection, accumulating the result". Notice that there is no mention of 
order.

The second principle is, when reduction has been thus "reduced" to its 
essentials, it is possible for a *collection* to control how reductions are 
done over itself. Whether a reduction process has a particular order is a 
non-essential property that is provided or not-provided by the collection 
itself. The IReduceInit protocol is purposefully agnostic to order. Vectors 
implement it in an ordered way, but hash maps have no meaningful order.

So that's IReduceInit. The coll-fold protocol takes advantage of delegating 
the reduction implementation to the collection by allowing collections to 
provide a potentially-parallel reduction strategy. How this works is that 
the collection divides itself up into chunks, runs a normal reduction over 
each chunk using the same init value, then applies a "combine" function to 
combine the chunks. In the case of vectors, the chunks are combined in 
order, so the operation you are performing does not need to be commutative. 
However, the combine function still needs to be *associative*, because each 
individual sub-reduction is not initialized with the result of the previous 
reduction.


Even stranger - the parallel version seems to at least produce an output 
> (not sure how correct) if run for the first 50 commands instead of 300.
>

The problem here is misunderstanding the "combine" step of fold. You supply 
the same function apply-cmd for both reducing and combining. However, the 
reducing function is called with a set of "on" lights and a command, but 
the combine function is called with two sets of on lights (i.e., it is 
called with the result of two reductions). So in the combine phase, r/fold 
is calling apply-cmd something like this: (apply-cmd #{[1 1]} #{[2 2]}), 
which is not what it expects, hence your strange exception about Longs.

Anyway, the combine function must be associative, but this advent problem 
is *not* associative, because you cannot rearrange the order in which the 
toggle functions are run. Each toggle function must always have the final 
state of every previous line done. You could compute the commands *in 
between* the toggle commands in parallel (essentially coalescing them into 
a single "turn on" command), but you still need to run the whole sequence 
of toggle commands, from start to finish, in order.

Here is a simple example to illustrate:

; Our commands. :toggle simplified to illustrate the point.
(def cmds [[:on #{1 2}] [:toggle 2] [:on #{3}]])
;=> #'advent.a6/cmds
; Our reduction function, with initializing arity
(defn apply-cmd
  ([] #{})
  ([on [cmd arg]]
   (case cmd
 :on (into on arg)
 :toggle (if (contains? on arg)
   (disj on arg)
   (conj on arg)
;=> #'advent.a6/apply-cmd
;; The correct answer, for reference
(r/reduce apply-cmd cmds)
;=> #{1 3}




Now let's perform the steps an r/fold would perform. We have three commands 
and will do two chunks, but we will divide the chunks differently. Here is 
what happens when we split between the second and third command:


; Chunk one
(r/reduce apply-cmd (subvec cmds 0 2))
;=> #{1}
; Chunk two
(r/reduce apply-cmd (subvec cmds 2))
;=> #{3}
; Combine
(into *1 *2)
;=> #{1 3}
; Looks ok...


Now lets try splitting between the first and second command:

; Chunk one
(r/reduce apply-cmd (subvec cmds 0 1))
;=> #{1 2}
; Chunk two
(r/reduce apply-cmd (subvec cmds 1))
;=> #{3 2}
;; Combine
(into *1 *2)
;=> #{1 3 2}
; WRONG ANSWER!!


Notice *how you split the commands up* affects the final answer, which 
means this algorithm can't be safely implemented with r/fold!

On Saturday, April 2, 2016 at 6:45:16 PM UTC-5, Divyansh Prakash wrote:
>
> Just verified - it works and gives the correct answer for shorter 
> collections. 
> No performance boost though. And fails (?) on larger collections.
> Not sure what's happening. The foldvec implementation is also pretty hard 
> to understand.
>
> On Sunday, April 3, 2016 at 2:38:26 AM UTC+5:30, Francis Avila wrote:
>>
>> Your input collection to r/fold is provided by cmds-from-input which 
>> returns a lazy-seq (from map) which is not a parallel-foldable type. You 
>> can try mapv instead: vectors are parallel-foldable. (Note only 
>> PersistentVector and PersistentHashMap have useful coll-fold 
>> implementations: all other objects (including sets) fall back on normal 
>> reduction: 
>> https://github.com/clojure/clojure/blob/d5708425995e8c83157ad49007ec2f8f43d8eac8/src/clj/clojure/core/reducers.clj#L347-L367
>> )
>>
>> Additionally, I'm not sure how this algorithm could be parallel

Re: Trouble using r/fold to optimise a reduce

2016-04-02 Thread Divyansh Prakash
Updated code here 
.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trouble using r/fold to optimise a reduce

2016-04-02 Thread Divyansh Prakash
Just verified - it works and gives the correct answer for shorter 
collections. 
No performance boost though. And fails (?) on larger collections.
Not sure what's happening. The foldvec implementation is also pretty hard 
to understand.

On Sunday, April 3, 2016 at 2:38:26 AM UTC+5:30, Francis Avila wrote:
>
> Your input collection to r/fold is provided by cmds-from-input which 
> returns a lazy-seq (from map) which is not a parallel-foldable type. You 
> can try mapv instead: vectors are parallel-foldable. (Note only 
> PersistentVector and PersistentHashMap have useful coll-fold 
> implementations: all other objects (including sets) fall back on normal 
> reduction: 
> https://github.com/clojure/clojure/blob/d5708425995e8c83157ad49007ec2f8f43d8eac8/src/clj/clojure/core/reducers.clj#L347-L367
> )
>
> Additionally, I'm not sure how this algorithm could be parallelized 
> because the order in which you apply the toggle operation matters! I 
> suspect if you make the mapv change I suggest you will get different final 
> answers.
>
>
>
>
>
>
> On Saturday, April 2, 2016 at 3:24:47 PM UTC-5, Divyansh Prakash wrote:
>>
>> Hi! 
>> I'm solving the problem described here . 
>> I've got the solution 
>> , 
>> but it takes ~50 s to compute.
>> I tried optimising it by replacing the main reduce operation with r/fold, 
>> but it doesn't seem to have any effect on the performance for whatever n 
>> (batch size) I use.
>> Any suggestions?
>>
>> Note: I'm using a MacBook Pro with 8 cores. JDK 7. Clojure 1.8.
>>
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trouble using r/fold to optimise a reduce

2016-04-02 Thread Divyansh Prakash
Even stranger - the parallel version seems to at least produce an output 
(not sure how correct) if run for the first 50 commands instead of 300.

On Sunday, April 3, 2016 at 2:38:26 AM UTC+5:30, Francis Avila wrote:
>
> Your input collection to r/fold is provided by cmds-from-input which 
> returns a lazy-seq (from map) which is not a parallel-foldable type. You 
> can try mapv instead: vectors are parallel-foldable. (Note only 
> PersistentVector and PersistentHashMap have useful coll-fold 
> implementations: all other objects (including sets) fall back on normal 
> reduction: 
> https://github.com/clojure/clojure/blob/d5708425995e8c83157ad49007ec2f8f43d8eac8/src/clj/clojure/core/reducers.clj#L347-L367
> )
>
> Additionally, I'm not sure how this algorithm could be parallelized 
> because the order in which you apply the toggle operation matters! I 
> suspect if you make the mapv change I suggest you will get different final 
> answers.
>
>
>
>
>
>
> On Saturday, April 2, 2016 at 3:24:47 PM UTC-5, Divyansh Prakash wrote:
>>
>> Hi! 
>> I'm solving the problem described here . 
>> I've got the solution 
>> , 
>> but it takes ~50 s to compute.
>> I tried optimising it by replacing the main reduce operation with r/fold, 
>> but it doesn't seem to have any effect on the performance for whatever n 
>> (batch size) I use.
>> Any suggestions?
>>
>> Note: I'm using a MacBook Pro with 8 cores. JDK 7. Clojure 1.8.
>>
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trouble using r/fold to optimise a reduce

2016-04-02 Thread Divyansh Prakash
Thanks a lot, Francis!
I've made the changes you suggest - replaced map with mapv and modified 
apply-cmd to take/return a vector instead of a set (it converts to set 
internally).
My understanding is that reducers preserve the order of operation, so it 
should give the correct answer.
Instead, the simple version now takes ~120 s to run, whereas the r/fold 
version fails with a "UnsupportedOperationException: nth not supported on 
this type: Long" somewhere inside reducers/foldvec/fn. Strange.


On Sunday, April 3, 2016 at 2:38:26 AM UTC+5:30, Francis Avila wrote:
>
> Your input collection to r/fold is provided by cmds-from-input which 
> returns a lazy-seq (from map) which is not a parallel-foldable type. You 
> can try mapv instead: vectors are parallel-foldable. (Note only 
> PersistentVector and PersistentHashMap have useful coll-fold 
> implementations: all other objects (including sets) fall back on normal 
> reduction: 
> https://github.com/clojure/clojure/blob/d5708425995e8c83157ad49007ec2f8f43d8eac8/src/clj/clojure/core/reducers.clj#L347-L367
> )
>
> Additionally, I'm not sure how this algorithm could be parallelized 
> because the order in which you apply the toggle operation matters! I 
> suspect if you make the mapv change I suggest you will get different final 
> answers.
>
>
>
>
>
>
> On Saturday, April 2, 2016 at 3:24:47 PM UTC-5, Divyansh Prakash wrote:
>>
>> Hi! 
>> I'm solving the problem described here . 
>> I've got the solution 
>> , 
>> but it takes ~50 s to compute.
>> I tried optimising it by replacing the main reduce operation with r/fold, 
>> but it doesn't seem to have any effect on the performance for whatever n 
>> (batch size) I use.
>> Any suggestions?
>>
>> Note: I'm using a MacBook Pro with 8 cores. JDK 7. Clojure 1.8.
>>
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trouble using r/fold to optimise a reduce

2016-04-02 Thread Francis Avila
Your input collection to r/fold is provided by cmds-from-input which 
returns a lazy-seq (from map) which is not a parallel-foldable type. You 
can try mapv instead: vectors are parallel-foldable. (Note only 
PersistentVector and PersistentHashMap have useful coll-fold 
implementations: all other objects (including sets) fall back on normal 
reduction: 
https://github.com/clojure/clojure/blob/d5708425995e8c83157ad49007ec2f8f43d8eac8/src/clj/clojure/core/reducers.clj#L347-L367)

Additionally, I'm not sure how this algorithm could be parallelized because 
the order in which you apply the toggle operation matters! I suspect if you 
make the mapv change I suggest you will get different final answers.






On Saturday, April 2, 2016 at 3:24:47 PM UTC-5, Divyansh Prakash wrote:
>
> Hi! 
> I'm solving the problem described here . 
> I've got the solution 
> , 
> but it takes ~50 s to compute.
> I tried optimising it by replacing the main reduce operation with r/fold, 
> but it doesn't seem to have any effect on the performance for whatever n 
> (batch size) I use.
> Any suggestions?
>
> Note: I'm using a MacBook Pro with 8 cores. JDK 7. Clojure 1.8.
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.