On my laptop (Mac) the biggest difference here has nothing to do with buffering 
in slurp. It is whether you use System/in (fast) or *in* (slow). The latter is 
a LineNumberingPushbackReader.

Can you check and confirm? When I slurp System/in it is more than twice as fast 
as slurping *in*.
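
A minimal sketch for reproducing that comparison without piping anything into stdin. It assumes an in-memory sample in place of real input: a ByteArrayInputStream stands in for System/in, and a LineNumberingPushbackReader built by hand stands in for *in*. The names sample, slurp-stream and slurp-lnpr are made up for illustration, and the timings will depend on the Clojure version and JVM.

```clojure
(import '(java.io ByteArrayInputStream StringReader)
        '(clojure.lang LineNumberingPushbackReader))

;; Hypothetical test data: 100000 short lines (1.1M chars) held in memory.
(def sample (apply str (repeat 100000 "some words\n")))

(defn slurp-stream
  "Slurps from a plain InputStream, the same kind of object as System/in."
  []
  (slurp (ByteArrayInputStream. (.getBytes sample "UTF-8"))))

(defn slurp-lnpr
  "Slurps through a LineNumberingPushbackReader, the class *in* is bound to."
  []
  (slurp (LineNumberingPushbackReader. (StringReader. sample))))

;; Both calls return the same text; only the elapsed times should differ.
(time (count (slurp-stream)))
(time (count (slurp-lnpr)))
```

Running each form several times before comparing gives the JIT a chance to warm up.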

I believe the next-biggest perf issue is how StringBuilders grow. I suspect 
that appending in 4096-char chunks is making them grow more efficiently.
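
One way to look at the growth cost in isolation, as a rough sketch: append the same 4096-char chunks to a default-sized StringBuilder and to one pre-sized to the final length, and time both. The helper name fill and the chunk count are assumptions for illustration, not part of slurp.

```clojure
(defn fill
  "Appends `chunks` copies of a 4096-char buffer to sb; returns the final length."
  [^StringBuilder sb chunks]
  (let [^chars buf (char-array 4096 \x)]
    (dotimes [_ chunks]
      (.append sb buf 0 4096))
    (.length sb)))

;; Hypothetical size: 2500 chunks of 4096 chars, about 10M chars in total.
(def chunks 2500)

;; Default capacity: the builder has to reallocate as it grows.
(time (fill (StringBuilder.) chunks))

;; Pre-sized to the final length: no reallocation during the appends.
(time (fill (StringBuilder. (int (* 4096 chunks))) chunks))
```

If the two timings are close, growth is probably not where the time goes.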

Stu

> Another example. I'm running this on an Ubuntu 10.04 laptop with this
> java:
> 
> java version "1.6.0_18"
> OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-0ubuntu1)
> OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
> 
> and this command line:
> java -Xmx3G -server clojure.main cat2.clj
> 
> 
> (require '[clojure.java.io :as jio])
> 
> (defn- normalize-slurp-opts
>  [opts]
>  (if (string? (first opts))
>    (do
>      (println "WARNING: (slurp f enc) is deprecated, use (slurp f :encoding enc).")
>      [:encoding (first opts)])
>    opts))
> 
> (defn slurp2
>  "Reads the file named by f using the encoding enc into a string
>  and returns it."
>  {:added "1.0"}
>  ([f & opts]
>     (let [opts (normalize-slurp-opts opts)
>           data (StringBuffer.)
>           buffer (char-array 4096)]
>       (with-open [#^java.io.Reader r (apply jio/reader f opts)]
>         (loop [c (.read r buffer)]
>           (if (neg? c)
>             (str data)
>             (do
>               (.append data buffer 0 c)
>               (recur (.read r buffer)))))))))
> 
> (time
> (with-open [f (java.io.FileReader. "words")]
>   (println (count (slurp f)))))
> 
> (time
> (with-open [f (java.io.FileReader. "words")]
>   (println (count (slurp2 f)))))
> 
> I get this output:
> 
> $ java -Xmx3G -server clojure.main cat2.clj
> 279440100
> "Elapsed time: 17094.007487 msecs"
> 279440100
> "Elapsed time: 5233.097287 msecs"
> 
> So at least in my environment there seems to be a big difference
> between slurp2 with an explicit buffer and the core/slurp one, which
> appears to be reading a character at a time from a BufferedReader
> stream.
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
