Hey Michael,
Since your eval solution essentially "cookie-cutters" out maps, each with
the same keys, as fast as it can, I was playing around with what would
happen if you used struct maps instead, and I cobbled together something
that appears to run about twice as fast as the eval approach:
(defn read-to-structs [rows]
  (let [headers (->>
                  rows
                  first
                  (take-while (complement #{""}))
                  (map keyword))
        s (apply create-struct headers)]
    (for [row (rest rows)]
      (apply struct s row))))
Here are comparative timings:
(time-fn read-to-maps)
"Elapsed time: 4871.02175 msecs"
=> nil
(time-fn read-to-maps-partial)
"Elapsed time: 4814.730643 msecs"
=> nil
(time-fn read-to-maps-fn)
"Elapsed time: 4815.230087 msecs"
=> nil
(time-fn read-to-maps-eval)
"Elapsed time: 2466.048578 msecs"
=> nil
(time-fn read-to-structs)
"Elapsed time: 1273.462618 msecs"
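One caveat on reading these numbers: `for` and `map` both return lazy seqs, so a timing only means something if the harness forces every row. The sketch below assumes `time-fn` does roughly this; the real helper lives in the repo and may differ, and `sample-rows` and `time-fn-sketch` are made-up names for illustration only.

```clojure
;; Hypothetical stand-in for the real csv data: a header row plus 1000 data rows.
(def sample-rows
  (cons ["a" "b" "c"]
        (repeat 1000 ["1" "2" "3"])))

;; Assumed shape of a timing helper (NOT the repo's actual time-fn).
;; dorun walks the whole lazy seq so every row is really built, and it
;; returns nil, which would match the "=> nil" after each timing.
(defn time-fn-sketch [f]
  (time (dorun (f sample-rows))))

;; The zipmap-based reader quoted further down in this thread.
(defn read-to-maps-partial [rows]
  (let [headers (->> rows
                     first
                     (take-while (complement #{""}))
                     (map keyword))]
    (map (partial zipmap headers) (rest rows))))

(time-fn-sketch read-to-maps-partial)
```

Without the `dorun`, `time` would only measure the construction of the lazy seq, not the per-row work being compared.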
I didn't test it too much, but it passed this:
(= (read-to-maps csv-fix) (read-to-structs csv-fix))
=> true
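That check only covers rows with exactly as many fields as there are non-empty headers. As far as I can tell, `struct` refuses a row with more values than the struct has keys and fills missing values with nil, whereas `zipmap` quietly truncates in both cases. A rough sketch of a more defensive variant, assuming trailing extra fields should simply be dropped (`ragged-rows`, `header-keys`, `zipmap-rows` and `read-to-structs-trimmed` are names invented for this example):

```clojure
;; Hypothetical data: the header row has a trailing empty cell, so each data
;; row carries one more field than there are usable headers.
(def ragged-rows
  [["a" "b" ""]
   ["1" "2" "x"]
   ["3" "4" "y"]])

;; Same header extraction as in read-to-structs.
(defn header-keys [rows]
  (->> rows
       first
       (take-while (complement #{""}))
       (map keyword)))

;; Stand-in for the zipmap-based readers in this thread.
(defn zipmap-rows [rows]
  (map (partial zipmap (header-keys rows)) (rest rows)))

(defn read-to-structs-trimmed [rows]
  (let [headers (header-keys rows)
        n       (count headers)
        s       (apply create-struct headers)]
    (for [row (rest rows)]
      ;; Drop any fields beyond the last header before building the struct.
      (apply struct s (take n row)))))

;; Compare the two readers on the ragged input.
(= (zipmap-rows ragged-rows) (read-to-structs-trimmed ragged-rows))
```

Short rows are a separate question this sketch does not settle: the struct version would carry the missing keys with nil values, so it would no longer compare equal to the zipmap result.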
On Friday, October 10, 2014 4:21:01 PM UTC-4, Michael Blume wrote:
>
> https://github.com/MichaelBlume/eval-speed
>
> eval-speed.core=> (time-fn read-to-maps)
> "Elapsed time: 5551.011069 msecs"
> nil
> eval-speed.core=> (time-fn read-to-maps-fn)
> "Elapsed time: 5587.256991 msecs"
> nil
> eval-speed.core=> (time-fn read-to-maps-partial)
> "Elapsed time: 5606.649172 msecs"
> nil
> eval-speed.core=> (time-fn read-to-maps-eval)
> "Elapsed time: 2627.521592 msecs"
> nil
>
> Ben, I'd still like to understand exactly what work the CPU is doing in
> the uneval'd version that it's skipping in the eval'd version. It seems
> like in the generated bytecode there's going to be *some* concept of
> iterating through the row in either case, if only as part of the
> destructuring process.
>
>
> On Friday, October 10, 2014 1:07:08 PM UTC-7, Ben wrote:
>>
>> I believe it's because the `mapper` function is just creating and
>> returning a map literal. The "mapper" function in the evaled version is
>> something like this:
>>
>> user> (def names '[n1 n2 n3 n4])
>> #'user/names
>> user> (def headers '[h1 h2 h3 h4])
>> #'user/headers
>> user> `(fn [[~@names]] ~(zipmap headers names))
>> (clojure.core/fn [[n1 n2 n3 n4]] {h4 n4, h3 n3, h2 n2, h1 n1})
>> ;; just a map literal, whose keys are already known.
>>
>> Whereas in the first version, zipmap has to be called, iterating over
>> headers and names each time.
>>
>> On Fri, Oct 10, 2014 at 1:04 PM, Sean Corfield <[email protected]>
>> wrote:
>>
>>> It may be more to do with the difference between `for` and `map`. How do
>>> these versions compare in your benchmark:
>>>
>>> (defn read-to-maps-partial [rows]
>>>   (let [headers (->>
>>>                   rows
>>>                   first
>>>                   (take-while (complement #{""}))
>>>                   (map keyword))]
>>>     (map (partial zipmap headers) (rest rows))))
>>>
>>> (defn read-to-maps-fn [rows]
>>>   (let [headers (->>
>>>                   rows
>>>                   first
>>>                   (take-while (complement #{""}))
>>>                   (map keyword))
>>>         mapper (fn [row] (zipmap headers row))]
>>>     (map mapper (rest rows))))
>>>
>>> Sean
>>>
>>> On Oct 10, 2014, at 11:42 AM, Michael Blume <[email protected]> wrote:
>>> > So I'm reading a bunch of rows from a huge csv file and marshalling
>>> > those rows into maps using the first row as keys. I wrote the function two
>>> > ways: https://gist.github.com/MichaelBlume/c67d22df0ff9c225d956 and the
>>> > version with eval is twice as fast and I'm kind of curious about why.
>>> > Presumably the eval'd function still implicitly contains a list of keys,
>>> > it's still implicitly treating each row as a seq and walking it, so I'm
>>> > wondering what the seq-destructuring and the map literal are doing under
>>> > the hood that's faster.
>>>
>>>
>>
>>
>> --
>> Ben Wolfson
>> "Human kind has used its intelligence to vary the flavour of drinks,
>> which may be sweet, aromatic, fermented or spirit-based. ... Family and
>> social life also offer numerous other occasions to consume drinks for
>> pleasure." [Larousse, "Drink" entry]
>>
>>
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en