Very interesting thread!

I'm having a similar problem, and Michael's approach got me a small speedup
(around 15%). When I try Mike's approach on my real data, it fails with:
IllegalArgumentException Too many arguments to struct constructor
clojure.lang.PersistentStructMap.construct (PersistentStructMap.java:77)
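
One possible cause, offered as a guess rather than a diagnosis: `struct` throws
this exception when it is given more values than the struct has keys, whereas
`zipmap` silently drops the extra values. Since the headers are cut off at the
first empty cell, any row with more cells than there are headers would trigger
it. A minimal sketch that trims each row to the header count (the name
`read-to-structs-safe` is hypothetical, not from the thread):

```clojure
(defn read-to-structs-safe [rows]
  (let [headers (->> rows
                     first
                     (take-while (complement #{""}))
                     (map keyword))
        ;; Assumption: extra trailing cells can be discarded, as zipmap would.
        n (count headers)
        s (apply create-struct headers)]
    (for [row (rest rows)]
      ;; Pass at most n values so struct never sees more values than keys.
      (apply struct s (take n row)))))

;; A row with one cell more than the two non-empty headers:
(= (read-to-structs-safe [["a" "b" ""] ["1" "2" "3"]])
   [{:a "1" :b "2"}])
```

Rows shorter than the header row are not a problem either way, since `struct`
fills the missing keys with nil.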

On Sunday, 12 October 2014 03:51:46 UTC+2, Mike Fikes wrote:
>
> Hey Michael,
>
> Since your eval solution essentially "cookie-cutters" out maps, each with 
> the same keys, as fast as it can, I was playing around with what would 
> happen if you used struct maps, and I cobbled together something that appears 
> to run twice as fast as the eval approach:
>
> (defn read-to-structs [rows]
>   (let [headers (->>
>                   rows
>                   first
>                   (take-while (complement #{""}))
>                   (map keyword))
>         s (apply create-struct headers)]
>     (for [row (rest rows)]
>       (apply struct s row))))
>
>
> Here are comparative timings:
>
> (time-fn read-to-maps)
> "Elapsed time: 4871.02175 msecs"
> => nil
> (time-fn read-to-maps-partial)
> "Elapsed time: 4814.730643 msecs"
> => nil
> (time-fn read-to-maps-fn)
> "Elapsed time: 4815.230087 msecs"
> => nil
> (time-fn read-to-maps-eval)
> "Elapsed time: 2466.048578 msecs"
> => nil
> (time-fn read-to-structs)
> "Elapsed time: 1273.462618 msecs"
>
> I didn't test it too much, but it passed this:
>
> (= (read-to-maps csv-fix) (read-to-structs csv-fix))
> => true
>
> On Friday, October 10, 2014 4:21:01 PM UTC-4, Michael Blume wrote:
>>
>> https://github.com/MichaelBlume/eval-speed
>>
>> eval-speed.core=> (time-fn read-to-maps)
>> "Elapsed time: 5551.011069 msecs"
>> nil
>> eval-speed.core=> (time-fn read-to-maps-fn)
>> "Elapsed time: 5587.256991 msecs"
>> nil
>> eval-speed.core=> (time-fn read-to-maps-partial)
>> "Elapsed time: 5606.649172 msecs"
>> nil
>> eval-speed.core=> (time-fn read-to-maps-eval)
>> "Elapsed time: 2627.521592 msecs"
>> nil
>>
>> Ben, I'd still like to understand exactly what work the CPU is doing in 
>> the uneval'd version that it's skipping in the eval'd version. It seems 
>> like in the generated bytecode there's going to be *some* concept of 
>> iterating through the row in either case, if only as part of the 
>> destructuring process.
>>
>>
>> On Friday, October 10, 2014 1:07:08 PM UTC-7, Ben wrote:
>>>
>>> I believe it's because the `mapper` function is just creating and 
>>> returning a map literal. The "mapper" function in the evaled version is 
>>> something like this:
>>>
>>> user> (def names '[n1 n2 n3 n4])
>>> #'user/names
>>> user> (def headers '[h1 h2 h3 h4])
>>> #'user/headers
>>> user> `(fn [[~@names]] ~(zipmap headers names))
>>> (clojure.core/fn [[n1 n2 n3 n4]] {h4 n4, h3 n3, h2 n2, h1 n1})
>>> ;; just a map literal, whose keys are already known.
>>>
>>> Whereas in the first version, zipmap has to be called, iterating over 
>>> headers and names each time.
>>>
>>> On Fri, Oct 10, 2014 at 1:04 PM, Sean Corfield <se...@corfield.org> 
>>> wrote:
>>>
>>>> It may be more to do with the difference between `for` and `map`. How 
>>>> do these versions compare in your benchmark:
>>>>
>>>> (defn read-to-maps-partial [rows]
>>>>   (let [headers (->>
>>>>                   rows
>>>>                   first
>>>>                   (take-while (complement #{""}))
>>>>                   (map keyword))]
>>>>     (map (partial zipmap headers) (rest rows))))
>>>>
>>>> (defn read-to-maps-fn [rows]
>>>>   (let [headers (->>
>>>>                   rows
>>>>                   first
>>>>                   (take-while (complement #{""}))
>>>>                   (map keyword))
>>>>         mapper (fn [row] (zipmap headers row))]
>>>>     (map mapper (rest rows))))
>>>>
>>>> Sean
>>>>
>>>> On Oct 10, 2014, at 11:42 AM, Michael Blume <blume...@gmail.com> wrote:
>>>> > So I'm reading a bunch of rows from a huge csv file and marshalling 
>>>> > those rows into maps using the first row as keys. I wrote the function two 
>>>> > ways: https://gist.github.com/MichaelBlume/c67d22df0ff9c225d956 and 
>>>> > the version with eval is twice as fast and I'm kind of curious about why. 
>>>> > Presumably the eval'd function still implicitly contains a list of keys, 
>>>> > it's still implicitly treating each row as a seq and walking it, so I'm 
>>>> > wondering what the seq-destructuring and the map literal are doing under 
>>>> > the hood that's faster.
>>>>
>>>>
>>>
>>>
>>> -- 
>>> Ben Wolfson
>>> "Human kind has used its intelligence to vary the flavour of drinks, 
>>> which may be sweet, aromatic, fermented or spirit-based. ... Family and 
>>> social life also offer numerous other occasions to consume drinks for 
>>> pleasure." [Larousse, "Drink" entry]
>>>
>>>  
