@Nathan the top-level (def requirement-seq ..) is probably the thing
holding on to all the objects.  Try removing the def and calling (last
(sequence (comp ..))) instead, and see whether it returns.  The purpose of
a lazy sequence is to allow processing to happen one item or chunk at a
time, but a top-level def keeps a reference to the head, so every realized
element stays reachable.  If there are still problems after that, then
maybe each element is too big, but that top-level def is definitely a
no-no.  I don't think transducers are relevant here; you'd get the same
problem with normal map/remove calls.
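
A minimal sketch of the idea, not your actual code: expensive-parse,
complete? and summarize are hypothetical stand-ins for xml-zip-from-file,
degree-complete? and student-and-requirements, and the input is a range of
numbers rather than files.  The point is only that nothing holds the head
of the sequence while it is being walked.

```clojure
;; Hypothetical stand-ins for the real per-file functions.
(defn expensive-parse [n] {:id n :payload (vec (range 1000))})
(defn complete? [m] (even? (:id m)))
(defn summarize [m] {:id (:id m) :size (count (:payload m))})

;; The transformation itself is small; def-ing it is harmless.
(def xf
  (comp (map expensive-parse)
        (remove complete?)
        (map summarize)))

;; The sequence is created and consumed inside the function, so no var
;; points at its head and already-visited elements can be collected.
(defn last-summary [inputs]
  (last (sequence xf inputs)))

;; If only an aggregate is needed, transduce avoids building a seq at all.
(defn count-incomplete [inputs]
  (transduce xf (completing (fn [n _] (inc n))) 0 inputs))
```

Calling (last-summary the-files) with the real functions swapped in would
be the equivalent of the (last (sequence (comp ..))) test above.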

On Tue, Aug 8, 2017 at 12:19 PM Nathan Smutz <nsm...@gmail.com> wrote:

> The one thing I'm aware of holding on to is a filtered file-seq:
> (def the-files (filter #(s/ends-with? (.getName %) ".xml" ) (rest
> (file-seq (io/file dw-path)))))
> There are 7,000+ files; but I'm assuming the elements there are just
> file-references and shouldn't take much space.
>
> The rest of the process is a transducer sequence:
> (def requirement-seq (sequence
>                          (comp
>                            (map xml-zip-from-file)
>                            (remove degree-complete?)
>                            (map student-and-requirements))
>                          the-files))
>
> Those functions are admittedly space inefficient (lots of work with
> zippers); but are pure.  What comes out the other end is a sequence of
> Clojure maps.  Could holding on to the file references prevent all that
> processing effluvia from being collected?
>
> The original files add up to 1.3 gigs altogether.  I'd expect the gleaned
> data to be significantly smaller; but I'd better check into how close
> that's getting to the default heap-size.
>
> Best,
> Nathan
>
> On Tuesday, August 8, 2017 at 1:20:21 AM UTC-7, Peter Hull wrote:
>>
>>
>> On Tuesday, 8 August 2017 06:20:56 UTC+1, Nathan Smutz wrote:
>>
>>> Does this message sometimes present because the non-garbage data is
>>> getting too big?
>>>
>> Yes, it's when most of your heap is non-garbage, so the GC has to keep
>> running but doesn't succeed in freeing much memory each time.
>> See
>> https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/memleaks002.html
>>
>> You can increase the heap, but that might only defer the problem.
>>
>> As you process all your files, are you holding on to references to
>> objects that you don't need any more?
>>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

