Re: I need help tracking down a performance problem.
No change. I think made a small script that spends a large amount of time in java.lang.reflect.Array.setInt too: (set! *warn-on-reflection* true) (time (dotimes [_ 1] (let [#^ints arr (int-array 200)] (dotimes [i 200] (aset-int arr i i) So maybe I should try to see what I can do about other functions. On Mar 31, 11:04 pm, David Nolen wrote: > 15.1% 0 + 1711 java.lang.reflect.Array.setInt > Is definitely pointing at the aset-int as being the time gobbler, I think > the expression in the macro should be this > > (aset-int (ints arr#) i# (int (~mask-fn (. buf# (~get-fn) > > to be extra safe. > > On Tue, Mar 31, 2009 at 10:00 PM, Vincent Foley wrote: > > > I tried it just now; it made no difference. Nevertheless, thank you > > for you help and time! > > > On Mar 31, 9:38 pm, David Nolen wrote: > > > Did you try > > > (int (mask8 (. buf__2572__auto__ (get > > > > ? > > > > Your macro should like this: > > > > (defmacro make-reader > > > [get-fn mask-fn] > > > `(fn [#^ByteBuffer buf# len#] > > > (if (= len# 1) > > > (~mask-fn (. buf# (~get-fn))) > > > (let [#^"[I" arr# (int-array len#)] > > > (dotimes [i# len#] > > > (aset-int arr# i# (int (~mask-fn (. buf# (~get-fn)) > > > arr# > > > > On Tue, Mar 31, 2009 at 9:09 PM, Vincent Foley wrote: > > > > > I tried surrounding the call to the (. buf# (get)) method and putting > > > > the coercion directly inside the mask8 and mask16 functions. Neither > > > > worked. I want to mention at this point that I have *warn-on- > > > > reflection* set to true for the little script that uses the library > > > > and it doesn't report any call to methods that it can't resolve. > > > > > Here's the complete -Xprof output, if it helps. > > > > > Flat profile of 176.10 secs (11351 total ticks): main > > > > > Interpreted + native Method > > > > 4.5% 511 + 0 java.lang.Integer.hashCode > > > > 1.4% 160 + 0 java.lang.Integer.intValue > > > > 0.8% 91 + 0 starcraft.replay.unpack > > > > $decode_command_block__94.invoke > > > > 0.7% 80 + 0 clojure.lang.Numbers.int_array > > > > 0.2% 25 + 0 clojure.lang.PersistentVector.pushTail > > > > 0.1% 15 + 2 java.lang.ClassLoader.defineClass1 > > > > 0.1% 16 + 0 > > > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.esi28 > > > > 0.1% 4 + 11 clojure.core__init.load > > > > 0.1% 10 + 0 clojure.lang.PersistentVector.cons > > > > 0.1% 8 + 0 starcraft.replay.actions$fn__71.invoke > > > > 0.1% 8 + 0 > > > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackSection > > > > 0.1% 0 + 7 java.lang.reflect.Array.setInt > > > > 0.1% 7 + 0 clojure.lang.PersistentHashMap > > > > $BitmapIndexedNode.create > > > > 0.1% 7 + 0 clojure.lang.RestFn.invoke > > > > 0.1% 7 + 0 clojure.lang.RestFn.invoke > > > > 0.1% 7 + 0 starcraft.replay.unpack > > > > $decode_commands__99.invoke > > > > 0.1% 7 + 0 starcraft.replay.parse > > > > $parse_buffer__53$fn__56.invoke > > > > 0.1% 6 + 0 clojure.lang.AFn.applyToHelper > > > > 0.1% 6 + 0 clojure.lang.PersistentArrayMap.assoc > > > > 0.1% 6 + 0 clojure.lang.PersistentHashMap > > > > $BitmapIndexedNode.assoc > > > > 0.0% 0 + 5 java.lang.reflect.Array.newArray > > > > 0.0% 0 + 5 java.lang.Class.forName0 > > > > 0.0% 0 + 5 java.util.zip.Inflater.inflateBytes > > > > 0.0% 5 + 0 java.lang.AbstractStringBuilder. > > > > 0.0% 5 + 0 java.util.Arrays.copyOfRange > > > > 10.9% 1157 + 76 Total interpreted (including elided) > > > > > Compiled + native Method > > > > 10.4% 1183 + 1 starcraft.replay.parse$fn__23$fn__49.invoke > > > > 10.0% 1123 + 17 starcraft.replay.unpack > > > > $decode_command_block__94.invoke > > > > 9.2% 1043 + 0 clojure.core$next__3096.invoke > > > > 8.9% 1014 + 0 starcraft.replay.parse > > > > $parse_buffer__53$fn__56.invoke > > > > 5.5% 626 + 0 clojure.lang.PersistentArrayMap.assoc > > > > 4.3% 474 + 17 clojure.lang.PersistentArrayMap.assoc > > > > 4.1% 464 + 7 clojure.lang.RestFn.invoke > > > > 2.9% 333 + 0 clojure.lang.Cons.next > > > > 2.5% 288 + 0 clojure.lang.RT.seq > > > > 2.4% 269 + 0 clojure.lang.AFn.applyToHelper > > > > 2.2% 249 + 0 > > > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackRepChunk > > > > 1.8% 202 + 0 clojure.core$seq__3112.invoke > > > > 1.6% 174 + 3 clojure.lang.RestFn.applyTo > > > > 1.3% 140 + 2 clojure.lang.APersistentMap.cons > > > > 1.2% 130 + 1 clojure.core$spread__3225.invoke > > > > 1.1% 127 + 0 clojure.lang.PersistentStructMap.valAt > > > > 0.8% 93 + 0 clojure.co
Re: I need help tracking down a performance problem.
As as a side note, putting type hints into the body of the macro expansion doesn't do anything. These will be discarded. You will need to coerce types via other methods, investigating how ints is implemented in core.clj is a good place to start. On Tue, Mar 31, 2009 at 11:04 PM, David Nolen wrote: > 15.1% 0 + 1711java.lang.reflect.Array.setInt > Is definitely pointing at the aset-int as being the time gobbler, I think > the expression in the macro should be this > > (aset-int (ints arr#) i# (int (~mask-fn (. buf# (~get-fn) > > to be extra safe. > > On Tue, Mar 31, 2009 at 10:00 PM, Vincent Foley wrote: > >> >> I tried it just now; it made no difference. Nevertheless, thank you >> for you help and time! >> >> On Mar 31, 9:38 pm, David Nolen wrote: >> > Did you try >> > (int (mask8 (. buf__2572__auto__ (get >> > >> > ? >> > >> > Your macro should like this: >> > >> > (defmacro make-reader >> > [get-fn mask-fn] >> > `(fn [#^ByteBuffer buf# len#] >> > (if (= len# 1) >> >(~mask-fn (. buf# (~get-fn))) >> >(let [#^"[I" arr# (int-array len#)] >> > (dotimes [i# len#] >> >(aset-int arr# i# (int (~mask-fn (. buf# (~get-fn)) >> > arr# >> > >> > On Tue, Mar 31, 2009 at 9:09 PM, Vincent Foley >> wrote: >> > >> > > I tried surrounding the call to the (. buf# (get)) method and putting >> > > the coercion directly inside the mask8 and mask16 functions. Neither >> > > worked. I want to mention at this point that I have *warn-on- >> > > reflection* set to true for the little script that uses the library >> > > and it doesn't report any call to methods that it can't resolve. >> > >> > > Here's the complete -Xprof output, if it helps. >> > >> > > Flat profile of 176.10 secs (11351 total ticks): main >> > >> > > Interpreted + native Method >> > > 4.5% 511 + 0java.lang.Integer.hashCode >> > > 1.4% 160 + 0java.lang.Integer.intValue >> > > 0.8%91 + 0starcraft.replay.unpack >> > > $decode_command_block__94.invoke >> > > 0.7%80 + 0clojure.lang.Numbers.int_array >> > > 0.2%25 + 0clojure.lang.PersistentVector.pushTail >> > > 0.1%15 + 2java.lang.ClassLoader.defineClass1 >> > > 0.1%16 + 0 >> > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.esi28 >> > > 0.1% 4 +11clojure.core__init.load >> > > 0.1%10 + 0clojure.lang.PersistentVector.cons >> > > 0.1% 8 + 0starcraft.replay.actions$fn__71.invoke >> > > 0.1% 8 + 0 >> > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackSection >> > > 0.1% 0 + 7java.lang.reflect.Array.setInt >> > > 0.1% 7 + 0clojure.lang.PersistentHashMap >> > > $BitmapIndexedNode.create >> > > 0.1% 7 + 0clojure.lang.RestFn.invoke >> > > 0.1% 7 + 0clojure.lang.RestFn.invoke >> > > 0.1% 7 + 0starcraft.replay.unpack >> > > $decode_commands__99.invoke >> > > 0.1% 7 + 0starcraft.replay.parse >> > > $parse_buffer__53$fn__56.invoke >> > > 0.1% 6 + 0clojure.lang.AFn.applyToHelper >> > > 0.1% 6 + 0clojure.lang.PersistentArrayMap.assoc >> > > 0.1% 6 + 0clojure.lang.PersistentHashMap >> > > $BitmapIndexedNode.assoc >> > > 0.0% 0 + 5java.lang.reflect.Array.newArray >> > > 0.0% 0 + 5java.lang.Class.forName0 >> > > 0.0% 0 + 5java.util.zip.Inflater.inflateBytes >> > > 0.0% 5 + 0java.lang.AbstractStringBuilder. >> > > 0.0% 5 + 0java.util.Arrays.copyOfRange >> > > 10.9% 1157 +76Total interpreted (including elided) >> > >> > > Compiled + native Method >> > > 10.4% 1183 + 1starcraft.replay.parse$fn__23$fn__49.invoke >> > > 10.0% 1123 +17starcraft.replay.unpack >> > > $decode_command_block__94.invoke >> > > 9.2% 1043 + 0clojure.core$next__3096.invoke >> > > 8.9% 1014 + 0starcraft.replay.parse >> > > $parse_buffer__53$fn__56.invoke >> > > 5.5% 626 + 0clojure.lang.PersistentArrayMap.assoc >> > > 4.3% 474 +17clojure.lang.PersistentArrayMap.assoc >> > > 4.1% 464 + 7clojure.lang.RestFn.invoke >> > > 2.9% 333 + 0clojure.lang.Cons.next >> > > 2.5% 288 + 0clojure.lang.RT.seq >> > > 2.4% 269 + 0clojure.lang.AFn.applyToHelper >> > > 2.2% 249 + 0 >> > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackRepChunk >> > > 1.8% 202 + 0clojure.core$seq__3112.invoke >> > > 1.6% 174 + 3clojure.lang.RestFn.applyTo >> > > 1.3% 140 + 2clojure.lang.APersistentMap.cons >> > > 1.2% 130 + 1clojure.core$spread__3225.invoke >> > > 1.1% 127 + 0clojure.lang.PersistentStructMap.valAt >> > > 0.8%93 + 0clojure.core$reduce__3304.invoke >> > > 0.6%66 + 2starcraft.replay.unpack >> > > $decode_commands__99.invoke >> > >
Re: I need help tracking down a performance problem.
15.1% 0 + 1711java.lang.reflect.Array.setInt Is definitely pointing at the aset-int as being the time gobbler, I think the expression in the macro should be this (aset-int (ints arr#) i# (int (~mask-fn (. buf# (~get-fn) to be extra safe. On Tue, Mar 31, 2009 at 10:00 PM, Vincent Foley wrote: > > I tried it just now; it made no difference. Nevertheless, thank you > for you help and time! > > On Mar 31, 9:38 pm, David Nolen wrote: > > Did you try > > (int (mask8 (. buf__2572__auto__ (get > > > > ? > > > > Your macro should like this: > > > > (defmacro make-reader > > [get-fn mask-fn] > > `(fn [#^ByteBuffer buf# len#] > > (if (= len# 1) > >(~mask-fn (. buf# (~get-fn))) > >(let [#^"[I" arr# (int-array len#)] > > (dotimes [i# len#] > >(aset-int arr# i# (int (~mask-fn (. buf# (~get-fn)) > > arr# > > > > On Tue, Mar 31, 2009 at 9:09 PM, Vincent Foley wrote: > > > > > I tried surrounding the call to the (. buf# (get)) method and putting > > > the coercion directly inside the mask8 and mask16 functions. Neither > > > worked. I want to mention at this point that I have *warn-on- > > > reflection* set to true for the little script that uses the library > > > and it doesn't report any call to methods that it can't resolve. > > > > > Here's the complete -Xprof output, if it helps. > > > > > Flat profile of 176.10 secs (11351 total ticks): main > > > > > Interpreted + native Method > > > 4.5% 511 + 0java.lang.Integer.hashCode > > > 1.4% 160 + 0java.lang.Integer.intValue > > > 0.8%91 + 0starcraft.replay.unpack > > > $decode_command_block__94.invoke > > > 0.7%80 + 0clojure.lang.Numbers.int_array > > > 0.2%25 + 0clojure.lang.PersistentVector.pushTail > > > 0.1%15 + 2java.lang.ClassLoader.defineClass1 > > > 0.1%16 + 0 > > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.esi28 > > > 0.1% 4 +11clojure.core__init.load > > > 0.1%10 + 0clojure.lang.PersistentVector.cons > > > 0.1% 8 + 0starcraft.replay.actions$fn__71.invoke > > > 0.1% 8 + 0 > > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackSection > > > 0.1% 0 + 7java.lang.reflect.Array.setInt > > > 0.1% 7 + 0clojure.lang.PersistentHashMap > > > $BitmapIndexedNode.create > > > 0.1% 7 + 0clojure.lang.RestFn.invoke > > > 0.1% 7 + 0clojure.lang.RestFn.invoke > > > 0.1% 7 + 0starcraft.replay.unpack > > > $decode_commands__99.invoke > > > 0.1% 7 + 0starcraft.replay.parse > > > $parse_buffer__53$fn__56.invoke > > > 0.1% 6 + 0clojure.lang.AFn.applyToHelper > > > 0.1% 6 + 0clojure.lang.PersistentArrayMap.assoc > > > 0.1% 6 + 0clojure.lang.PersistentHashMap > > > $BitmapIndexedNode.assoc > > > 0.0% 0 + 5java.lang.reflect.Array.newArray > > > 0.0% 0 + 5java.lang.Class.forName0 > > > 0.0% 0 + 5java.util.zip.Inflater.inflateBytes > > > 0.0% 5 + 0java.lang.AbstractStringBuilder. > > > 0.0% 5 + 0java.util.Arrays.copyOfRange > > > 10.9% 1157 +76Total interpreted (including elided) > > > > > Compiled + native Method > > > 10.4% 1183 + 1starcraft.replay.parse$fn__23$fn__49.invoke > > > 10.0% 1123 +17starcraft.replay.unpack > > > $decode_command_block__94.invoke > > > 9.2% 1043 + 0clojure.core$next__3096.invoke > > > 8.9% 1014 + 0starcraft.replay.parse > > > $parse_buffer__53$fn__56.invoke > > > 5.5% 626 + 0clojure.lang.PersistentArrayMap.assoc > > > 4.3% 474 +17clojure.lang.PersistentArrayMap.assoc > > > 4.1% 464 + 7clojure.lang.RestFn.invoke > > > 2.9% 333 + 0clojure.lang.Cons.next > > > 2.5% 288 + 0clojure.lang.RT.seq > > > 2.4% 269 + 0clojure.lang.AFn.applyToHelper > > > 2.2% 249 + 0 > > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackRepChunk > > > 1.8% 202 + 0clojure.core$seq__3112.invoke > > > 1.6% 174 + 3clojure.lang.RestFn.applyTo > > > 1.3% 140 + 2clojure.lang.APersistentMap.cons > > > 1.2% 130 + 1clojure.core$spread__3225.invoke > > > 1.1% 127 + 0clojure.lang.PersistentStructMap.valAt > > > 0.8%93 + 0clojure.core$reduce__3304.invoke > > > 0.6%66 + 2starcraft.replay.unpack > > > $decode_commands__99.invoke > > > 0.6%63 + 0clojure.lang.PersistentArrayMap.valAt > > > 0.1%13 + 1clojure.core$conj__3100.invoke > > > 0.1% 9 + 0clojure.lang.APersistentMap.invoke > > > 0.1% 3 + 6starcraft.replay.parse > > > $fn__23$read_shorts__37.invoke > > > 0.1% 8 + 0clojure.core$nthnext__4405.invoke > > > 0.1% 0 + 7clojure.lang.ArraySeq.next > > > 0.0%
Re: I need help tracking down a performance problem.
I tried it just now; it made no difference. Nevertheless, thank you for you help and time! On Mar 31, 9:38 pm, David Nolen wrote: > Did you try > (int (mask8 (. buf__2572__auto__ (get > > ? > > Your macro should like this: > > (defmacro make-reader > [get-fn mask-fn] > `(fn [#^ByteBuffer buf# len#] > (if (= len# 1) > (~mask-fn (. buf# (~get-fn))) > (let [#^"[I" arr# (int-array len#)] > (dotimes [i# len#] > (aset-int arr# i# (int (~mask-fn (. buf# (~get-fn)) > arr# > > On Tue, Mar 31, 2009 at 9:09 PM, Vincent Foley wrote: > > > I tried surrounding the call to the (. buf# (get)) method and putting > > the coercion directly inside the mask8 and mask16 functions. Neither > > worked. I want to mention at this point that I have *warn-on- > > reflection* set to true for the little script that uses the library > > and it doesn't report any call to methods that it can't resolve. > > > Here's the complete -Xprof output, if it helps. > > > Flat profile of 176.10 secs (11351 total ticks): main > > > Interpreted + native Method > > 4.5% 511 + 0 java.lang.Integer.hashCode > > 1.4% 160 + 0 java.lang.Integer.intValue > > 0.8% 91 + 0 starcraft.replay.unpack > > $decode_command_block__94.invoke > > 0.7% 80 + 0 clojure.lang.Numbers.int_array > > 0.2% 25 + 0 clojure.lang.PersistentVector.pushTail > > 0.1% 15 + 2 java.lang.ClassLoader.defineClass1 > > 0.1% 16 + 0 > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.esi28 > > 0.1% 4 + 11 clojure.core__init.load > > 0.1% 10 + 0 clojure.lang.PersistentVector.cons > > 0.1% 8 + 0 starcraft.replay.actions$fn__71.invoke > > 0.1% 8 + 0 > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackSection > > 0.1% 0 + 7 java.lang.reflect.Array.setInt > > 0.1% 7 + 0 clojure.lang.PersistentHashMap > > $BitmapIndexedNode.create > > 0.1% 7 + 0 clojure.lang.RestFn.invoke > > 0.1% 7 + 0 clojure.lang.RestFn.invoke > > 0.1% 7 + 0 starcraft.replay.unpack > > $decode_commands__99.invoke > > 0.1% 7 + 0 starcraft.replay.parse > > $parse_buffer__53$fn__56.invoke > > 0.1% 6 + 0 clojure.lang.AFn.applyToHelper > > 0.1% 6 + 0 clojure.lang.PersistentArrayMap.assoc > > 0.1% 6 + 0 clojure.lang.PersistentHashMap > > $BitmapIndexedNode.assoc > > 0.0% 0 + 5 java.lang.reflect.Array.newArray > > 0.0% 0 + 5 java.lang.Class.forName0 > > 0.0% 0 + 5 java.util.zip.Inflater.inflateBytes > > 0.0% 5 + 0 java.lang.AbstractStringBuilder. > > 0.0% 5 + 0 java.util.Arrays.copyOfRange > > 10.9% 1157 + 76 Total interpreted (including elided) > > > Compiled + native Method > > 10.4% 1183 + 1 starcraft.replay.parse$fn__23$fn__49.invoke > > 10.0% 1123 + 17 starcraft.replay.unpack > > $decode_command_block__94.invoke > > 9.2% 1043 + 0 clojure.core$next__3096.invoke > > 8.9% 1014 + 0 starcraft.replay.parse > > $parse_buffer__53$fn__56.invoke > > 5.5% 626 + 0 clojure.lang.PersistentArrayMap.assoc > > 4.3% 474 + 17 clojure.lang.PersistentArrayMap.assoc > > 4.1% 464 + 7 clojure.lang.RestFn.invoke > > 2.9% 333 + 0 clojure.lang.Cons.next > > 2.5% 288 + 0 clojure.lang.RT.seq > > 2.4% 269 + 0 clojure.lang.AFn.applyToHelper > > 2.2% 249 + 0 > > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackRepChunk > > 1.8% 202 + 0 clojure.core$seq__3112.invoke > > 1.6% 174 + 3 clojure.lang.RestFn.applyTo > > 1.3% 140 + 2 clojure.lang.APersistentMap.cons > > 1.2% 130 + 1 clojure.core$spread__3225.invoke > > 1.1% 127 + 0 clojure.lang.PersistentStructMap.valAt > > 0.8% 93 + 0 clojure.core$reduce__3304.invoke > > 0.6% 66 + 2 starcraft.replay.unpack > > $decode_commands__99.invoke > > 0.6% 63 + 0 clojure.lang.PersistentArrayMap.valAt > > 0.1% 13 + 1 clojure.core$conj__3100.invoke > > 0.1% 9 + 0 clojure.lang.APersistentMap.invoke > > 0.1% 3 + 6 starcraft.replay.parse > > $fn__23$read_shorts__37.invoke > > 0.1% 8 + 0 clojure.core$nthnext__4405.invoke > > 0.1% 0 + 7 clojure.lang.ArraySeq.next > > 0.0% 0 + 5 clojure.lang.APersistentVector.assoc > > 72.3% 8126 + 76 Total compiled (including elided) > > > Stub + native Method > > 15.1% 0 + 1711 java.lang.reflect.Array.setInt > > 1.2% 0 + 135 java.lang.System.arraycopy > > 0.3% 0 + 31 java.lang.reflect.Array.set > > 0.1% 0 + 15 java.io.FileInputStream.readBytes > > 0.1% 0 + 13 java.lang.reflect.Array.get > > 0.1% 0 + 7 java.lang.Object.getClass > > 0.0%
Re: I need help tracking down a performance problem.
Did you try (int (mask8 (. buf__2572__auto__ (get ? Your macro should like this: (defmacro make-reader [get-fn mask-fn] `(fn [#^ByteBuffer buf# len#] (if (= len# 1) (~mask-fn (. buf# (~get-fn))) (let [#^"[I" arr# (int-array len#)] (dotimes [i# len#] (aset-int arr# i# (int (~mask-fn (. buf# (~get-fn)) arr# On Tue, Mar 31, 2009 at 9:09 PM, Vincent Foley wrote: > > I tried surrounding the call to the (. buf# (get)) method and putting > the coercion directly inside the mask8 and mask16 functions. Neither > worked. I want to mention at this point that I have *warn-on- > reflection* set to true for the little script that uses the library > and it doesn't report any call to methods that it can't resolve. > > Here's the complete -Xprof output, if it helps. > > > Flat profile of 176.10 secs (11351 total ticks): main > > Interpreted + native Method > 4.5% 511 + 0java.lang.Integer.hashCode > 1.4% 160 + 0java.lang.Integer.intValue > 0.8%91 + 0starcraft.replay.unpack > $decode_command_block__94.invoke > 0.7%80 + 0clojure.lang.Numbers.int_array > 0.2%25 + 0clojure.lang.PersistentVector.pushTail > 0.1%15 + 2java.lang.ClassLoader.defineClass1 > 0.1%16 + 0 > hu.belicza.andras.bwhf.control.BinReplayUnpacker.esi28 > 0.1% 4 +11clojure.core__init.load > 0.1%10 + 0clojure.lang.PersistentVector.cons > 0.1% 8 + 0starcraft.replay.actions$fn__71.invoke > 0.1% 8 + 0 > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackSection > 0.1% 0 + 7java.lang.reflect.Array.setInt > 0.1% 7 + 0clojure.lang.PersistentHashMap > $BitmapIndexedNode.create > 0.1% 7 + 0clojure.lang.RestFn.invoke > 0.1% 7 + 0clojure.lang.RestFn.invoke > 0.1% 7 + 0starcraft.replay.unpack > $decode_commands__99.invoke > 0.1% 7 + 0starcraft.replay.parse > $parse_buffer__53$fn__56.invoke > 0.1% 6 + 0clojure.lang.AFn.applyToHelper > 0.1% 6 + 0clojure.lang.PersistentArrayMap.assoc > 0.1% 6 + 0clojure.lang.PersistentHashMap > $BitmapIndexedNode.assoc > 0.0% 0 + 5java.lang.reflect.Array.newArray > 0.0% 0 + 5java.lang.Class.forName0 > 0.0% 0 + 5java.util.zip.Inflater.inflateBytes > 0.0% 5 + 0java.lang.AbstractStringBuilder. > 0.0% 5 + 0java.util.Arrays.copyOfRange > 10.9% 1157 +76Total interpreted (including elided) > > Compiled + native Method > 10.4% 1183 + 1starcraft.replay.parse$fn__23$fn__49.invoke > 10.0% 1123 +17starcraft.replay.unpack > $decode_command_block__94.invoke > 9.2% 1043 + 0clojure.core$next__3096.invoke > 8.9% 1014 + 0starcraft.replay.parse > $parse_buffer__53$fn__56.invoke > 5.5% 626 + 0clojure.lang.PersistentArrayMap.assoc > 4.3% 474 +17clojure.lang.PersistentArrayMap.assoc > 4.1% 464 + 7clojure.lang.RestFn.invoke > 2.9% 333 + 0clojure.lang.Cons.next > 2.5% 288 + 0clojure.lang.RT.seq > 2.4% 269 + 0clojure.lang.AFn.applyToHelper > 2.2% 249 + 0 > hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackRepChunk > 1.8% 202 + 0clojure.core$seq__3112.invoke > 1.6% 174 + 3clojure.lang.RestFn.applyTo > 1.3% 140 + 2clojure.lang.APersistentMap.cons > 1.2% 130 + 1clojure.core$spread__3225.invoke > 1.1% 127 + 0clojure.lang.PersistentStructMap.valAt > 0.8%93 + 0clojure.core$reduce__3304.invoke > 0.6%66 + 2starcraft.replay.unpack > $decode_commands__99.invoke > 0.6%63 + 0clojure.lang.PersistentArrayMap.valAt > 0.1%13 + 1clojure.core$conj__3100.invoke > 0.1% 9 + 0clojure.lang.APersistentMap.invoke > 0.1% 3 + 6starcraft.replay.parse > $fn__23$read_shorts__37.invoke > 0.1% 8 + 0clojure.core$nthnext__4405.invoke > 0.1% 0 + 7clojure.lang.ArraySeq.next > 0.0% 0 + 5clojure.lang.APersistentVector.assoc > 72.3% 8126 +76Total compiled (including elided) > > Stub + native Method > 15.1% 0 + 1711java.lang.reflect.Array.setInt > 1.2% 0 + 135java.lang.System.arraycopy > 0.3% 0 +31java.lang.reflect.Array.set > 0.1% 0 +15java.io.FileInputStream.readBytes > 0.1% 0 +13java.lang.reflect.Array.get > 0.1% 0 + 7java.lang.Object.getClass > 0.0% 0 + 1java.lang.Thread.currentThread > 16.9% 0 + 1913Total stub > > Thread-local ticks: > 0.0% 1 Class loader > 0.0% 2 Unknown: no last frame > > > Flat profile of 0.01 secs (1 total ticks): DestroyJavaVM > > Thread-local ticks: > 100.0% 1 Blocked (of
Re: I need help tracking down a performance problem.
I tried surrounding the call to the (. buf# (get)) method and putting the coercion directly inside the mask8 and mask16 functions. Neither worked. I want to mention at this point that I have *warn-on- reflection* set to true for the little script that uses the library and it doesn't report any call to methods that it can't resolve. Here's the complete -Xprof output, if it helps. Flat profile of 176.10 secs (11351 total ticks): main Interpreted + native Method 4.5% 511 + 0java.lang.Integer.hashCode 1.4% 160 + 0java.lang.Integer.intValue 0.8%91 + 0starcraft.replay.unpack $decode_command_block__94.invoke 0.7%80 + 0clojure.lang.Numbers.int_array 0.2%25 + 0clojure.lang.PersistentVector.pushTail 0.1%15 + 2java.lang.ClassLoader.defineClass1 0.1%16 + 0 hu.belicza.andras.bwhf.control.BinReplayUnpacker.esi28 0.1% 4 +11clojure.core__init.load 0.1%10 + 0clojure.lang.PersistentVector.cons 0.1% 8 + 0starcraft.replay.actions$fn__71.invoke 0.1% 8 + 0 hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackSection 0.1% 0 + 7java.lang.reflect.Array.setInt 0.1% 7 + 0clojure.lang.PersistentHashMap $BitmapIndexedNode.create 0.1% 7 + 0clojure.lang.RestFn.invoke 0.1% 7 + 0clojure.lang.RestFn.invoke 0.1% 7 + 0starcraft.replay.unpack $decode_commands__99.invoke 0.1% 7 + 0starcraft.replay.parse $parse_buffer__53$fn__56.invoke 0.1% 6 + 0clojure.lang.AFn.applyToHelper 0.1% 6 + 0clojure.lang.PersistentArrayMap.assoc 0.1% 6 + 0clojure.lang.PersistentHashMap $BitmapIndexedNode.assoc 0.0% 0 + 5java.lang.reflect.Array.newArray 0.0% 0 + 5java.lang.Class.forName0 0.0% 0 + 5java.util.zip.Inflater.inflateBytes 0.0% 5 + 0java.lang.AbstractStringBuilder. 0.0% 5 + 0java.util.Arrays.copyOfRange 10.9% 1157 +76Total interpreted (including elided) Compiled + native Method 10.4% 1183 + 1starcraft.replay.parse$fn__23$fn__49.invoke 10.0% 1123 +17starcraft.replay.unpack $decode_command_block__94.invoke 9.2% 1043 + 0clojure.core$next__3096.invoke 8.9% 1014 + 0starcraft.replay.parse $parse_buffer__53$fn__56.invoke 5.5% 626 + 0clojure.lang.PersistentArrayMap.assoc 4.3% 474 +17clojure.lang.PersistentArrayMap.assoc 4.1% 464 + 7clojure.lang.RestFn.invoke 2.9% 333 + 0clojure.lang.Cons.next 2.5% 288 + 0clojure.lang.RT.seq 2.4% 269 + 0clojure.lang.AFn.applyToHelper 2.2% 249 + 0 hu.belicza.andras.bwhf.control.BinReplayUnpacker.unpackRepChunk 1.8% 202 + 0clojure.core$seq__3112.invoke 1.6% 174 + 3clojure.lang.RestFn.applyTo 1.3% 140 + 2clojure.lang.APersistentMap.cons 1.2% 130 + 1clojure.core$spread__3225.invoke 1.1% 127 + 0clojure.lang.PersistentStructMap.valAt 0.8%93 + 0clojure.core$reduce__3304.invoke 0.6%66 + 2starcraft.replay.unpack $decode_commands__99.invoke 0.6%63 + 0clojure.lang.PersistentArrayMap.valAt 0.1%13 + 1clojure.core$conj__3100.invoke 0.1% 9 + 0clojure.lang.APersistentMap.invoke 0.1% 3 + 6starcraft.replay.parse $fn__23$read_shorts__37.invoke 0.1% 8 + 0clojure.core$nthnext__4405.invoke 0.1% 0 + 7clojure.lang.ArraySeq.next 0.0% 0 + 5clojure.lang.APersistentVector.assoc 72.3% 8126 +76Total compiled (including elided) Stub + native Method 15.1% 0 + 1711java.lang.reflect.Array.setInt 1.2% 0 + 135java.lang.System.arraycopy 0.3% 0 +31java.lang.reflect.Array.set 0.1% 0 +15java.io.FileInputStream.readBytes 0.1% 0 +13java.lang.reflect.Array.get 0.1% 0 + 7java.lang.Object.getClass 0.0% 0 + 1java.lang.Thread.currentThread 16.9% 0 + 1913Total stub Thread-local ticks: 0.0% 1 Class loader 0.0% 2 Unknown: no last frame Flat profile of 0.01 secs (1 total ticks): DestroyJavaVM Thread-local ticks: 100.0% 1 Blocked (of total) Global summary of 176.12 seconds: 100.0% 11603 Received ticks 2.1% 246 Received GC ticks 4.3% 495 Compilation 0.0% 2 Other VM operations 0.0% 1 Class loader 0.0% 2 Unknown code 176.257 secs On Mar 31, 8:57 pm, David Nolen wrote: > Thanks to cl-format: > > (fn > [buf__2572__auto__ len__2573__auto__] > (if (= len__2573__auto__ 1) > (mask8 (. buf__2572__auto__ (get))) > (let [arr__2574__auto__ (int-array len__2573__auto__)] > (
Re: I need help tracking down a performance problem.
Thanks to cl-format: (fn [buf__2572__auto__ len__2573__auto__] (if (= len__2573__auto__ 1) (mask8 (. buf__2572__auto__ (get))) (let [arr__2574__auto__ (int-array len__2573__auto__)] (dotimes [i__2575__auto__ len__2573__auto__] (aset-int arr__2574__auto__ i__2575__auto__ (mask8 (. buf__2572__auto__ (get) arr__2574__auto__))) This is the expansion for (make-reader get mask8), where were you attempting putting the int coercion to to the mask-fn? David On Tue, Mar 31, 2009 at 11:38 AM, Vincent Foley wrote: > > I tried using aset-int and I tried using int to coerce the result of > mask-fn, the input argument to mask-fn and few other things, but none > of that seems to make a difference so far. Mind you, this is an > aspect of Clojure that I find a little confusing, so I'm just putting > int calls here and there and looking at what happens. > > On Mar 31, 10:46 am, Christophe Grand wrote: > > Did you try to coerce the result of (~mask-fn ...) with int? > > (or use aset-int as suggested by David) > > > > On Tue, Mar 31, 2009 at 4:17 PM, Vincent Foley wrote: > > > > > No, but in my defense I did not know such a function existed :) I'll > > > give it a whirl and report back! > > > > > On Mar 31, 9:57 am, David Nolen wrote: > > > > Did you try using aset-int instead of aset? > > > > > > On Tue, Mar 31, 2009 at 8:25 AM, Vincent Foley > wrote: > > > > > > > For those interested, I managed to improve the performance of my > > > > > original program from 2 minutes 40 seconds to decode 1000+ files > down > > > > > to 2 minutes. I'm still far from my goal, but it's an improvement, > > > > > especially since the code is shorter and (IMO) cleaner. You can > see > > > > > it here: > > > > > > http://bitbucket.org/gnuvince/clj-starcraft/src/tip/src/starcraft/rep. > > > .. > > > > > > > And here's another question, running the program with -Xprof shows > > > > > that nearly 20% of my execution time is spent calling > > > > > java.lang.reflect.Array.set. Is there something wrong with the way > I > > > > > type hint my array in make-reader? > > > > > > > Thanks, > > > > > > > Vincent. > > > > > > > On Mar 19, 8:12 pm, Vincent Foley wrote: > > > > > > Hello, > > > > > > > > For the past few days, I've been trying, unsuccessfully, to make > an > > > > > > application I wrote faster. A Java program that performs, more > or > > > > > > less, the same task takes 12 seconds (on my machine) to parse > 1000 > > > > > > files; my Clojure program takes nearly 3 minutes. This more than > an > > > > > > order of magnitude slower! Using the profiling tools available > with > > > > > > the JVM, I quickly determined which function was the costliest. > I > > > > > > copied it into a simple script file to profile it in isolation. > I > > > > > > have made the script and the profile results (long!) available at > > > this > > > > > > URL:http://gist.github.com/82136 > > > > > > > > I'm finding the results puzzling: is dereferencing a var *that* > > > > > > expensive? Can anyone tell me if they see something > fundamentally > > > > > > wrong with my approach that would explain this abysmal > performance? > > > > > > > > Thank you, > > > > > > > > Vincent. > > > > > > > > P.S.: I am using Sun's JVM 1.6.0_10 as shipped in Ubuntu Ibex. > My > > > > > > machine is an Athlon 64 X2 4200+ with 3 GB of RAM. > > > > > > > --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
On Tue, Mar 31, 2009 at 1:24 PM, Aaron Cohen wrote: > I'm sorry if I missed you mentioning it, but have you tried running > your code with (set! *warn-on-reflection* true) in effect? > Ugh, I should have looked at your code before I sent that. There it is on line 1. ;) -- Aaron --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
I'm sorry if I missed you mentioning it, but have you tried running your code with (set! *warn-on-reflection* true) in effect? -- Aaron --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
I tried using aset-int and I tried using int to coerce the result of mask-fn, the input argument to mask-fn and few other things, but none of that seems to make a difference so far. Mind you, this is an aspect of Clojure that I find a little confusing, so I'm just putting int calls here and there and looking at what happens. On Mar 31, 10:46 am, Christophe Grand wrote: > Did you try to coerce the result of (~mask-fn ...) with int? > (or use aset-int as suggested by David) > > On Tue, Mar 31, 2009 at 4:17 PM, Vincent Foley wrote: > > > No, but in my defense I did not know such a function existed :) I'll > > give it a whirl and report back! > > > On Mar 31, 9:57 am, David Nolen wrote: > > > Did you try using aset-int instead of aset? > > > > On Tue, Mar 31, 2009 at 8:25 AM, Vincent Foley wrote: > > > > > For those interested, I managed to improve the performance of my > > > > original program from 2 minutes 40 seconds to decode 1000+ files down > > > > to 2 minutes. I'm still far from my goal, but it's an improvement, > > > > especially since the code is shorter and (IMO) cleaner. You can see > > > > it here: > > > >http://bitbucket.org/gnuvince/clj-starcraft/src/tip/src/starcraft/rep. > > .. > > > > > And here's another question, running the program with -Xprof shows > > > > that nearly 20% of my execution time is spent calling > > > > java.lang.reflect.Array.set. Is there something wrong with the way I > > > > type hint my array in make-reader? > > > > > Thanks, > > > > > Vincent. > > > > > On Mar 19, 8:12 pm, Vincent Foley wrote: > > > > > Hello, > > > > > > For the past few days, I've been trying, unsuccessfully, to make an > > > > > application I wrote faster. A Java program that performs, more or > > > > > less, the same task takes 12 seconds (on my machine) to parse 1000 > > > > > files; my Clojure program takes nearly 3 minutes. This more than an > > > > > order of magnitude slower! Using the profiling tools available with > > > > > the JVM, I quickly determined which function was the costliest. I > > > > > copied it into a simple script file to profile it in isolation. I > > > > > have made the script and the profile results (long!) available at > > this > > > > > URL:http://gist.github.com/82136 > > > > > > I'm finding the results puzzling: is dereferencing a var *that* > > > > > expensive? Can anyone tell me if they see something fundamentally > > > > > wrong with my approach that would explain this abysmal performance? > > > > > > Thank you, > > > > > > Vincent. > > > > > > P.S.: I am using Sun's JVM 1.6.0_10 as shipped in Ubuntu Ibex. My > > > > > machine is an Athlon 64 X2 4200+ with 3 GB of RAM. > > --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Did you try to coerce the result of (~mask-fn ...) with int? (or use aset-int as suggested by David) On Tue, Mar 31, 2009 at 4:17 PM, Vincent Foley wrote: > > No, but in my defense I did not know such a function existed :) I'll > give it a whirl and report back! > > On Mar 31, 9:57 am, David Nolen wrote: > > Did you try using aset-int instead of aset? > > > > On Tue, Mar 31, 2009 at 8:25 AM, Vincent Foley wrote: > > > > > For those interested, I managed to improve the performance of my > > > original program from 2 minutes 40 seconds to decode 1000+ files down > > > to 2 minutes. I'm still far from my goal, but it's an improvement, > > > especially since the code is shorter and (IMO) cleaner. You can see > > > it here: > > >http://bitbucket.org/gnuvince/clj-starcraft/src/tip/src/starcraft/rep. > .. > > > > > And here's another question, running the program with -Xprof shows > > > that nearly 20% of my execution time is spent calling > > > java.lang.reflect.Array.set. Is there something wrong with the way I > > > type hint my array in make-reader? > > > > > Thanks, > > > > > Vincent. > > > > > On Mar 19, 8:12 pm, Vincent Foley wrote: > > > > Hello, > > > > > > For the past few days, I've been trying, unsuccessfully, to make an > > > > application I wrote faster. A Java program that performs, more or > > > > less, the same task takes 12 seconds (on my machine) to parse 1000 > > > > files; my Clojure program takes nearly 3 minutes. This more than an > > > > order of magnitude slower! Using the profiling tools available with > > > > the JVM, I quickly determined which function was the costliest. I > > > > copied it into a simple script file to profile it in isolation. I > > > > have made the script and the profile results (long!) available at > this > > > > URL:http://gist.github.com/82136 > > > > > > I'm finding the results puzzling: is dereferencing a var *that* > > > > expensive? Can anyone tell me if they see something fundamentally > > > > wrong with my approach that would explain this abysmal performance? > > > > > > Thank you, > > > > > > Vincent. > > > > > > P.S.: I am using Sun's JVM 1.6.0_10 as shipped in Ubuntu Ibex. My > > > > machine is an Athlon 64 X2 4200+ with 3 GB of RAM. > > > > > > > --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
No, but in my defense I did not know such a function existed :) I'll give it a whirl and report back! On Mar 31, 9:57 am, David Nolen wrote: > Did you try using aset-int instead of aset? > > On Tue, Mar 31, 2009 at 8:25 AM, Vincent Foley wrote: > > > For those interested, I managed to improve the performance of my > > original program from 2 minutes 40 seconds to decode 1000+ files down > > to 2 minutes. I'm still far from my goal, but it's an improvement, > > especially since the code is shorter and (IMO) cleaner. You can see > > it here: > >http://bitbucket.org/gnuvince/clj-starcraft/src/tip/src/starcraft/rep... > > > And here's another question, running the program with -Xprof shows > > that nearly 20% of my execution time is spent calling > > java.lang.reflect.Array.set. Is there something wrong with the way I > > type hint my array in make-reader? > > > Thanks, > > > Vincent. > > > On Mar 19, 8:12 pm, Vincent Foley wrote: > > > Hello, > > > > For the past few days, I've been trying, unsuccessfully, to make an > > > application I wrote faster. A Java program that performs, more or > > > less, the same task takes 12 seconds (on my machine) to parse 1000 > > > files; my Clojure program takes nearly 3 minutes. This more than an > > > order of magnitude slower! Using the profiling tools available with > > > the JVM, I quickly determined which function was the costliest. I > > > copied it into a simple script file to profile it in isolation. I > > > have made the script and the profile results (long!) available at this > > > URL:http://gist.github.com/82136 > > > > I'm finding the results puzzling: is dereferencing a var *that* > > > expensive? Can anyone tell me if they see something fundamentally > > > wrong with my approach that would explain this abysmal performance? > > > > Thank you, > > > > Vincent. > > > > P.S.: I am using Sun's JVM 1.6.0_10 as shipped in Ubuntu Ibex. My > > > machine is an Athlon 64 X2 4200+ with 3 GB of RAM. > > --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Did you try using aset-int instead of aset? On Tue, Mar 31, 2009 at 8:25 AM, Vincent Foley wrote: > > For those interested, I managed to improve the performance of my > original program from 2 minutes 40 seconds to decode 1000+ files down > to 2 minutes. I'm still far from my goal, but it's an improvement, > especially since the code is shorter and (IMO) cleaner. You can see > it here: > http://bitbucket.org/gnuvince/clj-starcraft/src/tip/src/starcraft/replay/parse.clj > > And here's another question, running the program with -Xprof shows > that nearly 20% of my execution time is spent calling > java.lang.reflect.Array.set. Is there something wrong with the way I > type hint my array in make-reader? > > Thanks, > > Vincent. > > On Mar 19, 8:12 pm, Vincent Foley wrote: > > Hello, > > > > For the past few days, I've been trying, unsuccessfully, to make an > > application I wrote faster. A Java program that performs, more or > > less, the same task takes 12 seconds (on my machine) to parse 1000 > > files; my Clojure program takes nearly 3 minutes. This more than an > > order of magnitude slower! Using the profiling tools available with > > the JVM, I quickly determined which function was the costliest. I > > copied it into a simple script file to profile it in isolation. I > > have made the script and the profile results (long!) available at this > > URL:http://gist.github.com/82136 > > > > I'm finding the results puzzling: is dereferencing a var *that* > > expensive? Can anyone tell me if they see something fundamentally > > wrong with my approach that would explain this abysmal performance? > > > > Thank you, > > > > Vincent. > > > > P.S.: I am using Sun's JVM 1.6.0_10 as shipped in Ubuntu Ibex. My > > machine is an Athlon 64 X2 4200+ with 3 GB of RAM. > > > --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
For those interested, I managed to improve the performance of my original program from 2 minutes 40 seconds to decode 1000+ files down to 2 minutes. I'm still far from my goal, but it's an improvement, especially since the code is shorter and (IMO) cleaner. You can see it here: http://bitbucket.org/gnuvince/clj-starcraft/src/tip/src/starcraft/replay/parse.clj And here's another question, running the program with -Xprof shows that nearly 20% of my execution time is spent calling java.lang.reflect.Array.set. Is there something wrong with the way I type hint my array in make-reader? Thanks, Vincent. On Mar 19, 8:12 pm, Vincent Foley wrote: > Hello, > > For the past few days, I've been trying, unsuccessfully, to make an > application I wrote faster. A Java program that performs, more or > less, the same task takes 12 seconds (on my machine) to parse 1000 > files; my Clojure program takes nearly 3 minutes. This more than an > order of magnitude slower! Using the profiling tools available with > the JVM, I quickly determined which function was the costliest. I > copied it into a simple script file to profile it in isolation. I > have made the script and the profile results (long!) available at this > URL:http://gist.github.com/82136 > > I'm finding the results puzzling: is dereferencing a var *that* > expensive? Can anyone tell me if they see something fundamentally > wrong with my approach that would explain this abysmal performance? > > Thank you, > > Vincent. > > P.S.: I am using Sun's JVM 1.6.0_10 as shipped in Ubuntu Ibex. My > machine is an Athlon 64 X2 4200+ with 3 GB of RAM. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Bit-syntax (Was: Re: I need help tracking down a performance problem.)
On Mon, 23 Mar 2009 19:41:07 -0700 Mark Engelberg wrote: > > On Mon, Mar 23, 2009 at 7:31 PM, Vincent Foley > wrote: > > More generally, is it possible that I'm just doing this whole thing > > wrong? That using vectors to represent binary fields and records > > in a declarative is just a bad idea and that I should try and > > explore lower- level alternatives? > > > > > This reminds me... I would LOVE to see in Clojure something like > Erlang's bit-packing/destructuring syntax for working with bit > manipulation in a high-level way. > > http://www.erlang.se/euc/00/bit_syntax.html > I'm actually casually hacking on something like this right now, as I want to attempt to port a radar processing code I wrote into Clojure. (It's currently implemented in modern C++ with a functional-ish style) The WSR-88D file format is *extremely* bizarre and convoluted and requires a lot of bit-fiddling. Half or more of the current code is dedicated to converting this format into something sane. I actually hate that part of my C++ implementation the most because it's all hand-written unpacking and endianness-fixing of byte arrays which seems impossible to clean up any further. -So- .. What I actually have so far is a macro that lets you basically write Erlang bit-syntax in S-exps and it will expand into a vector that you can use as the binding list of a let form. The IP header unpacking example in Erlang's documentation looks something like this: (let (bits dgram [ip-ver 4] [hdr-len 4] svc-type [tot-len 16] [id 16] [flags 3] [frag-off 13] ttl proto [hdr-cksum 16] [src-ip 32] [dst-ip 32] [rest-dgram :binary]) 'stuff) Not shown: floats, signedness, and endianness (defaults are all the same as Erlang). It's "okay", but it looks unnatural and I don't really like doing it that way very much. The source and destination names are backwards (dgram is being destructured into ip-ver, hdr-len and so on), and the syntax hijacks the whole let form, because I don't think you can write a macro that automatically splices into its parent form. So I had been thinking of composing an email about what the likelihood of seeing extensibility API for destructuring would be. The idea being that the UI for the bit-syntax library would allow for this custom destructuring to be pervasive, look a little more conventional, and let you mix it with regular binding forms (instead of having to use an explicit let that is completely hijacked by the bit-syntax). You could of course write something that would unpack bits into a vector, and then destructure the vector on the left-hand side, but that separates the names which will be bound (on the left) from their field specifiers (passed to the macro on the right). I found this difficult to read when the specifier list gets large, so I implemented it this way so the bound names are next to their specifications (just like Erlang's). Details: the macro expands into a bunch of sequential let bindings that progressively tease apart the blob using some helper functions. I haven't actually implemented the functions that do this teasing yet because that requires me to go digging in Java API docs to see what they provide for dealing with bit-bashing, and I haven't had the inclination to do that just yet :) So, is there any chance of extensible destructuring? What would such an API look like? I have thought about it a lot, but the minutiae of doing this generally enough to be useful, but simple enough to be implementable are probably beyond my grasp. -Kyle --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
On Mon, Mar 23, 2009 at 7:31 PM, Vincent Foley wrote: > More generally, is it possible that I'm just doing this whole thing > wrong? That using vectors to represent binary fields and records in a > declarative is just a bad idea and that I should try and explore lower- > level alternatives? > This reminds me... I would LOVE to see in Clojure something like Erlang's bit-packing/destructuring syntax for working with bit manipulation in a high-level way. http://www.erlang.se/euc/00/bit_syntax.html --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
I can try that. More generally, is it possible that I'm just doing this whole thing wrong? That using vectors to represent binary fields and records in a declarative is just a bad idea and that I should try and explore lower- level alternatives? Vincent On Mar 23, 3:35 pm, Christophe Grand wrote: > Hi Vincent! > > Vincent Foley a écrit : > > > Using the new versions of null-string and read-field-aux that you gave > > me, in my real application, the execution time went from 160 seconds > > to 150 seconds. As for using macros, I wrote one for the example > > program, but I realized that it wouldn't work in my application, > > because I sometimes use (apply parse-buffer buf field-vector). > > I looked at your program, you can use macros there too. > > Given compile-fields a function that outputs the code of a closure > taking a ByteBuffer as only argument then you can rewrite action as a > macro like this: > > (defmacro action > [name & v-forms] > {:name name > :fields (compile-fields v-forms)}) and replace (apply parse-buffer > buf fields) with (fields buf). > > Do you think it's feasible? > > -- > Professional:http://cgrand.net/(fr) > On Clojure:http://clj-me.blogspot.com/(en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Hi Vincent! Vincent Foley a écrit : > Using the new versions of null-string and read-field-aux that you gave > me, in my real application, the execution time went from 160 seconds > to 150 seconds. As for using macros, I wrote one for the example > program, but I realized that it wouldn't work in my application, > because I sometimes use (apply parse-buffer buf field-vector). > I looked at your program, you can use macros there too. Given compile-fields a function that outputs the code of a closure taking a ByteBuffer as only argument then you can rewrite action as a macro like this: (defmacro action [name & v-forms] {:name name :fields (compile-fields v-forms)}) and replace (apply parse-buffer buf fields) with (fields buf). Do you think it's feasible? -- Professional: http://cgrand.net/ (fr) On Clojure: http://clj-me.blogspot.com/ (en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Using the new versions of null-string and read-field-aux that you gave me, in my real application, the execution time went from 160 seconds to 150 seconds. As for using macros, I wrote one for the example program, but I realized that it wouldn't work in my application, because I sometimes use (apply parse-buffer buf field-vector). Thanks for the assistance, Vincent. On Mar 22, 4:33 pm, Vincent Foley wrote: > How would I do that? Make a macro that expands into a map literal > with the appropriate calls to .get, .getShort and .getInt? > > On Mar 22, 4:20 pm, Christophe Grand wrote: > > > Vincent Foley a écrit :> The code is available at this > > URL:http://code.google.com/p/bwhf/ > > > (look at the hu.becliza.andras.bwhf.control.[BinRepParser, > > > BinReplayUnpacker] files) > > > > It is definitely not as dynamic, which helps quite a lot, but I wanted > > > to have something high level and declarative. > > > If you want to go high-level and declarative you can certainly implement > > this as a macro which will compile your declaration into fast static code. > > > -- > > Professional:http://cgrand.net/(fr) > > On Clojure:http://clj-me.blogspot.com/(en) > > --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
How would I do that? Make a macro that expands into a map literal with the appropriate calls to .get, .getShort and .getInt? On Mar 22, 4:20 pm, Christophe Grand wrote: > Vincent Foley a écrit :> The code is available at this > URL:http://code.google.com/p/bwhf/ > > (look at the hu.becliza.andras.bwhf.control.[BinRepParser, > > BinReplayUnpacker] files) > > > It is definitely not as dynamic, which helps quite a lot, but I wanted > > to have something high level and declarative. > > If you want to go high-level and declarative you can certainly implement > this as a macro which will compile your declaration into fast static code. > > -- > Professional:http://cgrand.net/(fr) > On Clojure:http://clj-me.blogspot.com/(en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Vincent Foley a écrit : > The code is available at this URL: http://code.google.com/p/bwhf/ > (look at the hu.becliza.andras.bwhf.control.[BinRepParser, > BinReplayUnpacker] files) > > It is definitely not as dynamic, which helps quite a lot, but I wanted > to have something high level and declarative. > If you want to go high-level and declarative you can certainly implement this as a macro which will compile your declaration into fast static code. -- Professional: http://cgrand.net/ (fr) On Clojure: http://clj-me.blogspot.com/ (en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
The code is available at this URL: http://code.google.com/p/bwhf/ (look at the hu.becliza.andras.bwhf.control.[BinRepParser, BinReplayUnpacker] files) It is definitely not as dynamic, which helps quite a lot, but I wanted to have something high level and declarative. Thank you for your help, I'll look into the improvements you suggested. Vincent. On Mar 21, 9:40 am, Christophe Grand wrote: > Hello, > > Is the java code online somewhere to compare? (Since your code is an > interpreter for byte-fields specs there's a lot of stuff going to read a > single byte, I wonder if the java code you're benchmarking against is > that dynamic) > > Right now, null-string and read-field-aux are the most time consuming > functions. > > Creating a cons for each byte read is too much overhead, I rewrote > null-string like this (yes it's lower level): > (defn null-string > "Read a nul-terminated string. Stop at \\0 or at length n > whichever comes first." > [#^ByteBuffer buf n] > (let [arr #^"[B" (make-array Byte/TYPE n) > sb (StringBuilder.)] > (.get buf arr) > (areduce arr i append true > (when append > (let [b (aget arr i)] > (when-not (zero? (int b)) > (.append sb (char b > > Concerning read-field-aux I rewrote it like that: > (def read-field-aux > (let [getters {Byte get-byte > Short get-short > Integer get-integer}] > (fn [#^ByteBuffer buf n type] > (let [f (getters type)] > (if (== (int n) 1) > (f buf) > (let [arr (make-array Object n)] > (dotimes [i n] > (aset arr i (f buf))) > (vec arr))) > > closing over getters (rather than rebuilding it inside the closure) > yielded nearly as much as going "lower level". > > With these two changes, it's the dispatch fn that now dominates. > > Vincent Foley a écrit : > > > > > Here:http://gist.github.com/82352 > > > I have posted memory and cpu profiling figures. > > > On Mar 20, 6:56 am, Christophe Grand wrote: > > >> Hello Vincent, > > >> Vincent Foley a écrit : > > >>> Hello, > > >>> For the past few days, I've been trying, unsuccessfully, to make an > >>> application I wrote faster. A Java program that performs, more or > >>> less, the same task takes 12 seconds (on my machine) to parse 1000 > >>> files; my Clojure program takes nearly 3 minutes. This more than an > >>> order of magnitude slower! Using the profiling tools available with > >>> the JVM, I quickly determined which function was the costliest. I > >>> copied it into a simple script file to profile it in isolation. I > >>> have made the script and the profile results (long!) available at this > >>> URL:http://gist.github.com/82136 > > >> On my box nearly one half of the total time is sepent in the building of > >> arr (and by looking at your profiling data a good chunk of these traces > >> are related to this map call). > >> Better to get it out of the loop. > > >> (let [arr (into-array Byte/TYPE (map byte [7 ; 1 Byte > >> 3 0 97 0 98 0 99 ; 3 Shorts > >> 0 0 100 100 ; 1 Integer > >> 65 66 67 68 69 70 71 72 73 74 > >> ; 10 String > >> 0 0 0 0 0 0 0 0 0 0 ; 10 ignored > >> ])) > >> buf (ByteBuffer/wrap arr)] > >> (time > >> (dotimes [_ 1] > >> (.position buf 0) > >> (run buf > > >> Can you give the profile results for this code? > > >> Christophe > > >> -- > >> Professional:http://cgrand.net/(fr) > >> On Clojure:http://clj-me.blogspot.com/(en) > > -- > Professional:http://cgrand.net/(fr) > On Clojure:http://clj-me.blogspot.com/(en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Christophe Grand a écrit : > With these two changes, it's the dispatch fn that now dominates. > Correction: it's the dispatching, not the computation of the dispatch value that dominates. -- Professional: http://cgrand.net/ (fr) On Clojure: http://clj-me.blogspot.com/ (en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Hello, Is the java code online somewhere to compare? (Since your code is an interpreter for byte-fields specs there's a lot of stuff going to read a single byte, I wonder if the java code you're benchmarking against is that dynamic) Right now, null-string and read-field-aux are the most time consuming functions. Creating a cons for each byte read is too much overhead, I rewrote null-string like this (yes it's lower level): (defn null-string "Read a nul-terminated string. Stop at \\0 or at length n whichever comes first." [#^ByteBuffer buf n] (let [arr #^"[B" (make-array Byte/TYPE n) sb (StringBuilder.)] (.get buf arr) (areduce arr i append true (when append (let [b (aget arr i)] (when-not (zero? (int b)) (.append sb (char b Concerning read-field-aux I rewrote it like that: (def read-field-aux (let [getters {Byte get-byte Short get-short Integer get-integer}] (fn [#^ByteBuffer buf n type] (let [f (getters type)] (if (== (int n) 1) (f buf) (let [arr (make-array Object n)] (dotimes [i n] (aset arr i (f buf))) (vec arr))) closing over getters (rather than rebuilding it inside the closure) yielded nearly as much as going "lower level". With these two changes, it's the dispatch fn that now dominates. Vincent Foley a écrit : > Here: http://gist.github.com/82352 > > I have posted memory and cpu profiling figures. > > On Mar 20, 6:56 am, Christophe Grand wrote: > >> Hello Vincent, >> >> Vincent Foley a écrit : >> >> >>> Hello, >>> >>> For the past few days, I've been trying, unsuccessfully, to make an >>> application I wrote faster. A Java program that performs, more or >>> less, the same task takes 12 seconds (on my machine) to parse 1000 >>> files; my Clojure program takes nearly 3 minutes. This more than an >>> order of magnitude slower! Using the profiling tools available with >>> the JVM, I quickly determined which function was the costliest. I >>> copied it into a simple script file to profile it in isolation. I >>> have made the script and the profile results (long!) available at this >>> URL:http://gist.github.com/82136 >>> >> On my box nearly one half of the total time is sepent in the building of >> arr (and by looking at your profiling data a good chunk of these traces >> are related to this map call). >> Better to get it out of the loop. >> >> (let [arr (into-array Byte/TYPE (map byte [7 ; 1 Byte >>3 0 97 0 98 0 99 ; 3 Shorts >>0 0 100 100 ; 1 Integer >>65 66 67 68 69 70 71 72 73 74 >> ; 10 String >>0 0 0 0 0 0 0 0 0 0 ; 10 ignored >>])) >> buf (ByteBuffer/wrap arr)] >> (time >> (dotimes [_ 1] >> (.position buf 0) >> (run buf >> >> Can you give the profile results for this code? >> >> Christophe >> >> -- >> Professional:http://cgrand.net/(fr) >> On Clojure:http://clj-me.blogspot.com/(en) >> > > > > -- Professional: http://cgrand.net/ (fr) On Clojure: http://clj-me.blogspot.com/ (en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Here: http://gist.github.com/82352 I have posted memory and cpu profiling figures. On Mar 20, 6:56 am, Christophe Grand wrote: > Hello Vincent, > > Vincent Foley a écrit : > > > Hello, > > > For the past few days, I've been trying, unsuccessfully, to make an > > application I wrote faster. A Java program that performs, more or > > less, the same task takes 12 seconds (on my machine) to parse 1000 > > files; my Clojure program takes nearly 3 minutes. This more than an > > order of magnitude slower! Using the profiling tools available with > > the JVM, I quickly determined which function was the costliest. I > > copied it into a simple script file to profile it in isolation. I > > have made the script and the profile results (long!) available at this > > URL:http://gist.github.com/82136 > > On my box nearly one half of the total time is sepent in the building of > arr (and by looking at your profiling data a good chunk of these traces > are related to this map call). > Better to get it out of the loop. > > (let [arr (into-array Byte/TYPE (map byte [7 ; 1 Byte > 3 0 97 0 98 0 99 ; 3 Shorts > 0 0 100 100 ; 1 Integer > 65 66 67 68 69 70 71 72 73 74 > ; 10 String > 0 0 0 0 0 0 0 0 0 0 ; 10 ignored > ])) > buf (ByteBuffer/wrap arr)] > (time > (dotimes [_ 1] > (.position buf 0) > (run buf > > Can you give the profile results for this code? > > Christophe > > -- > Professional:http://cgrand.net/(fr) > On Clojure:http://clj-me.blogspot.com/(en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Hello Vincent, Vincent Foley a écrit : > Hello, > > For the past few days, I've been trying, unsuccessfully, to make an > application I wrote faster. A Java program that performs, more or > less, the same task takes 12 seconds (on my machine) to parse 1000 > files; my Clojure program takes nearly 3 minutes. This more than an > order of magnitude slower! Using the profiling tools available with > the JVM, I quickly determined which function was the costliest. I > copied it into a simple script file to profile it in isolation. I > have made the script and the profile results (long!) available at this > URL: http://gist.github.com/82136 > On my box nearly one half of the total time is sepent in the building of arr (and by looking at your profiling data a good chunk of these traces are related to this map call). Better to get it out of the loop. (let [arr (into-array Byte/TYPE (map byte [7 ; 1 Byte 3 0 97 0 98 0 99 ; 3 Shorts 0 0 100 100 ; 1 Integer 65 66 67 68 69 70 71 72 73 74 ; 10 String 0 0 0 0 0 0 0 0 0 0 ; 10 ignored ])) buf (ByteBuffer/wrap arr)] (time (dotimes [_ 1] (.position buf 0) (run buf Can you give the profile results for this code? Christophe -- Professional: http://cgrand.net/ (fr) On Clojure: http://clj-me.blogspot.com/ (en) --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---
Re: I need help tracking down a performance problem.
Hi Mr Foley, Running your code on my machine give the following result: (load-file "slow.clj") "Elapsed time: 1736.034068 msecs" nil user> (load-file "slow.clj") "Elapsed time: 678.850779 msecs" nil I'm on Ubuntu 8.10 on a Hp dv6700 with 2gig RAM. Stephane On Mar 19, 8:12 pm, Vincent Foley wrote: > Hello, > > For the past few days, I've been trying, unsuccessfully, to make an > application I wrote faster. A Java program that performs, more or > less, the same task takes 12 seconds (on my machine) to parse 1000 > files; my Clojure program takes nearly 3 minutes. This more than an > order of magnitude slower! Using the profiling tools available with > the JVM, I quickly determined which function was the costliest. I > copied it into a simple script file to profile it in isolation. I > have made the script and the profile results (long!) available at this > URL:http://gist.github.com/82136 > > I'm finding the results puzzling: is dereferencing a var *that* > expensive? Can anyone tell me if they see something fundamentally > wrong with my approach that would explain this abysmal performance? > > Thank you, > > Vincent. > > P.S.: I am using Sun's JVM 1.6.0_10 as shipped in Ubuntu Ibex. My > machine is an Athlon 64 X2 4200+ with 3 GB of RAM. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~--~~~~--~~--~--~---