Hi,

2009/8/10 Tom Emerson <tremer...@gmail.com>
> Hello Clojurians,
>
> I want to process approximately 74K XML files that are stored on disk
> in a series of nested directories, each of which contains up to 1000
> files. For example,
>
>   rootdir
>     0
>       file1.xml
>       file2.xml
>     1
>       file3.xml
>       file4.xml
>
> and so on.
>
> file-seq gives me a convenient way to get a seq of all these files.
> What I would like to do is process elements in this sequence in
> parallel. My first thought was to process the seq with pmap, but this
> is suboptimal because I'm not interested in saving the return value of
> the function called on each file.
>
> Assuming I want bounded parallelism (such as pmap gives you, 2 +
> number of cores) how would you approach this problem in Clojure?
>
> Thanks in advance for your insights,

Two ideas:

1. Why not use pmap anyway, in combination with dorun? dorun ensures
   the whole sequence is consumed without retaining its head, so the
   return values are thrown away as you go.

2. OK, solution 1 still creates a lot of uninteresting seqs, so maybe
   use clojure.parallel/preduce instead? Its signature is
   (preduce f base coll). If your fn is

       (defn process-file [file] ...)

   then something like

       (preduce (fn [_ file] (process-file file)) nil files-seq)

> -tree
>
> --
> Tom Emerson
> tremer...@gmail.com
> http://treerex.blogspot.com/
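
To make the first idea concrete, here is a minimal sketch of pmap combined with dorun over the result of file-seq. The names process-files! and xml-file? are made up for illustration, and process-file stands for whatever per-file function you already have; it is passed in as an argument so the sketch does not assume anything about it.

```clojure
;; Sketch only: process-files! and xml-file? are hypothetical helper names.
(defn xml-file?
  "True for regular files whose name ends in .xml."
  [^java.io.File f]
  (and (.isFile f)
       (.endsWith (.getName f) ".xml")))

(defn process-files!
  "Calls (process-file f) for every XML file under root, in parallel via pmap.
  dorun walks the lazy result and discards each return value, so the head
  of the sequence is never retained. Returns nil."
  [process-file ^String root]
  (->> (file-seq (java.io.File. root))
       (filter xml-file?)   ; skip directories and non-XML files
       (pmap process-file)  ; semi-lazy, runs a bounded number of calls ahead
       (dorun)))            ; force evaluation purely for side effects
```

You would call it as (process-files! process-file "rootdir"). Since pmap runs its work on futures, a standalone script probably wants to call (shutdown-agents) once processing is done so the JVM exits promptly.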