Hi,

I'm using the following code to split a sequence of strings into
files of approximately split-size bytes each:

(defn split-file
  ([path strs split-size]
     (loop [ss (seq strs), part 0]
       (when-let [more (split-file path ss split-size part)]
         (recur more (inc part)))))
  ([path strs split-size part]
     (with-open [stream (clojure.java.io/output-stream (str path "." part))]
       (loop [size 0, ss strs]
         (when ss
           (if (< split-size size)
             ss
             (let [bs (.getBytes (first ss))]
               (.write stream bs)
               (recur (+ size (alength bs)) (next ss)))))))))
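
To make the splitting rule concrete, here is a small in-memory sketch of
the same boundary logic. The name split-chunks is hypothetical and not part
of my real code, and it builds vectors eagerly, so it only illustrates where
the boundaries fall and is not meant for big files. A part is closed only
once its size has gone past split-size, which is why the files come out
"approximately" that size rather than exactly.

```clojure
(defn split-chunks
  "Hypothetical illustration only: groups strs into vectors using the same
  rule as split-file (keep adding strings until the byte count exceeds
  split-size, then start a new part)."
  [strs split-size]
  (loop [ss (seq strs), chunks []]
    (if-not ss
      chunks
      (let [[chunk more]
            (loop [size 0, ss ss, acc []]
              ;; stop when the input is exhausted or the part is over the limit
              (if (or (nil? ss) (< split-size size))
                [acc ss]
                (let [s (first ss)]
                  (recur (+ size (alength (.getBytes ^String s)))
                         (next ss)
                         (conj acc s)))))]
        (recur more (conj chunks chunk))))))

(split-chunks ["aaaa" "bb" "cccc" "d"] 5)
;; => [["aaaa" "bb"] ["cccc" "d"]]
```

Note that the first part ends up 6 bytes long with a limit of 5, because
the size check happens before each write rather than after it.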

When I call it like:

(with-open [rdr (clojure.java.io/reader "big-file")]
                 (dorun (bw/split-file "foo" (line-seq rdr) (* 300 1024 1024) 0)))

that is, calling the 4-parameter version directly to write only the
first split (about 300 MB, given the split-size above), it runs fine.
However, when I run it like:

(with-open [rdr (clojure.java.io/reader "big-file")]
                 (dorun (bw/split-file "foo" (line-seq rdr) (* 300 1024 1024))))

that is, when I call the 3-parameter version of split-file, it uses
much more memory and runs much slower, even while working on the first
split.  When the code eventually gets to the second split, memory
usage goes back down.

So I'm convinced that:

  ([path strs split-size]
     (loop [ss (seq strs), part 0]
       (when-let [more (split-file path ss split-size part)]
         (recur more (inc part)))))

is holding onto the head of the sequence, but I thought Clojure (I'm
using version 1.3.0) optimizes that away.  Is that not the case here?
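
To spell out what I mean by "holding onto the head", here is a minimal
sketch of my understanding, which may be exactly where I'm going wrong.
The function names are hypothetical. As far as I know, a local that is not
used again is cleared before the call that consumes it, so the already
walked part of a lazy seq can be garbage collected; a local that is still
needed afterwards keeps the whole realized seq reachable.

```clojure
(defn count-only
  "xs is not used after count, so (as I understand locals clearing) the
  local can be cleared and the walked cells become collectable."
  [xs]
  (count xs))

(defn count-then-first
  "xs is needed again after count, so the head stays referenced and every
  realized element remains reachable while count walks the seq."
  [xs]
  [(count xs) (first xs)])

(count-only (map identity (range 1000)))
(count-then-first (map identity (range 1000)))
```

Both calls return the expected values on a small input; the difference I
am asking about only shows up as memory use once the seq is large.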
