If you have the memory, you could simply increase the maximum heap size (for example, -Xmx2048m). Even if you do that, I would still run jconsole to see what's happening.
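
If attaching jconsole is awkward, roughly the same heap numbers can be read from inside the JVM through the standard java.lang.management API. This is only a minimal sketch: the class name HeapCheck and its two methods are made up for illustration and are not from this thread. From Clojure, the same calls should be reachable through ordinary Java interop.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not from the thread): reports heap occupancy from inside the JVM.
class HeapCheck {

    // Fraction of the maximum heap currently in use (0.0 to 1.0).
    static double usedFraction() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long max = heap.getMax();                     // -1 when the maximum is undefined
        if (max <= 0) {
            max = Runtime.getRuntime().maxMemory();   // fall back to the runtime's own limit
        }
        return (double) heap.getUsed() / (double) max;
    }

    // One-line summary; the maximum reflects whatever -Xmx the JVM was started with.
    static String report() {
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        return String.format("heap used: %.1f%% of %d MB", usedFraction() * 100.0, maxMb);
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

If this is logged every few files and the fraction sits near 100% once the slowdown starts, that would support the idea that the garbage collector is doing most of the work.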

~Adam~

On Fri, Oct 12, 2012 at 9:41 AM, Jim foo.bar <jimpil1...@gmail.com> wrote:

> No, I haven't profiled memory, only CPU, but what you're saying makes
> perfect sense. In every single iteration (each file) I'm 'slurp'-ing the 2
> files (the dictionary and the file to annotate) provided. Would you
> suggest a different GC if that is the case, or should I simply stop
> slurping?
>
> Jim
>
>
> On 12/10/12 15:28, Adam wrote:
>
> Have you tried running jconsole to monitor the memory usage? It sounds
> like maybe you're running out of heap space and you're mainly seeing the
> garbage collector doing its thing vs your actual program.
>
> ~Adam~
>
>
> On Fri, Oct 12, 2012 at 9:23 AM, Jim foo.bar <jimpil1...@gmail.com> wrote:
>
>> Hi all,
>>
>> I finally found an ideal use-case for pmap, however something very
>> strange seems to be happening after roughly 30 minutes of execution!
>>
>> OK, so here is the scenario:
>>
>> I've got 383 raw scientific papers (.txt) in a directory that I'm grouping
>> using 'file-seq', and so I want to pmap a fn on each element of that seq
>> (each document). The fn takes a document and a dictionary and annotates the
>> document with terms found in the dictionary. Basically it uses regex to tag
>> any occurrences of words that exist in the dictionary. When pmapping is
>> finished, I should have a list of (annotated) strings that will be
>> processed serially (doseq) in order to produce a massive file with all
>> these strings separated by a new-line character (this is how most adaptive
>> feature generators expect the data to be).
>>
>> So you can see, this is perfect for pmap, and indeed it seems to be doing
>> extremely well, but only for roughly the first 240 papers! All the CPUs are
>> working hard, but after approximately 30-40 min CPU utilisation and overall
>> performance seem to degrade quite a bit... For some strange reason, 2 of my
>> cores seem to refuse to do any work after these 240 papers, which results in
>> a really really slow process. When I start the process it is going so fast
>> that I cannot even read the output, but as I said, after 30-40 min it is
>> getting unbelievably slow! Had the performance been stable, I reckon I would
>> need less than 60 min in order to annotate all 383 papers, but with the
>> current behaviour I have no choice but to abort and restart it, passing it
>> the leftovers...
>>
>> Any ideas? Are there any issues involved with creating that many futures?
>>
>> Jim
>>
>>
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clojure@googlegroups.com
>> Note that posts from new members are moderated - please be patient with
>> your first post.
>> To unsubscribe from this group, send email to
>> clojure+unsubscr...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en