The one thing I'm aware of holding on to is a filtered file-seq:

    (def the-files
      (filter #(s/ends-with? (.getName %) ".xml")
              (rest (file-seq (io/file dw-path)))))

There are 7,000+ files, but I'm assuming the elements there are just file references and shouldn't take much space.
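A quick way to check that assumption and see how much heap is actually available is sketched below. It is only a rough sketch: `xml-files` and `heap-mb` are hypothetical helper names, not part of the original code, and the figures from `Runtime` are approximate because they depend on when the GC last ran.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as s])

;; Hypothetical helper: lazy seq of the .xml files under dir (a path string).
;; Each element is a java.io.File, i.e. a path wrapper, not the file contents.
(defn xml-files
  [dir]
  (filter #(s/ends-with? (.getName ^java.io.File %) ".xml")
          (file-seq (io/file dir))))

;; Hypothetical helper: maximum and currently used heap, in whole megabytes.
(defn heap-mb
  []
  (let [rt (Runtime/getRuntime)
        mb #(quot % (* 1024 1024))]
    {:max-mb  (mb (.maxMemory rt))
     :used-mb (mb (- (.totalMemory rt) (.freeMemory rt)))}))
```

Calling `(heap-mb)` before and after `(count (xml-files dw-path))` gives a rough idea of what the file references alone cost, and `:max-mb` shows the heap ceiling to compare against the 1.3 gigs of input.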
The rest of the process is a transducer sequence:

    (def requirement-seq
      (sequence (comp (map xml-zip-from-file)
                      (remove degree-complete?)
                      (map student-and-requirements))
                the-files))

Those functions are admittedly space-inefficient (lots of work with zippers), but they are pure. What comes out the other end is a sequence of Clojure maps. Could holding on to the file references prevent all that processing effluvia from being collected?

The original files add up to 1.3 gigs altogether. I'd expect the gleaned data to be significantly smaller, but I'd better check into how close that's getting to the default heap size.

Best,
Nathan

On Tuesday, August 8, 2017 at 1:20:21 AM UTC-7, Peter Hull wrote:
> On Tuesday, 8 August 2017 06:20:56 UTC+1, Nathan Smutz wrote:
>> Does this message sometimes present because the non-garbage data is
>> getting too big?
>
> Yes, it's when most of your heap is non-garbage, so the GC has to keep
> running but doesn't succeed in freeing much memory each time.
> See <>
>
> You can increase the heap but that might only defer the problem.
>
> As you process all your files, are you holding on to references to objects
> that you don't need any more?

--
You received this message because you are subscribed to the Google Groups "Clojure" group.