I have a few doubts about how to approach updating values inside pmap, and a
general question about whether pmap is the ideal way to solve the problem. I'd
really appreciate a review of the following code.
file-seqC and file-seqB are slightly modified versions of file-seq. The plan
was to hand each subfolder of the root folder to a concurrent job, count the
files, and output the final total. In my initial tests (file-seqB) I used an
atom that gets updated as soon as each thread completes. It sort of worked, but
the results were inconsistent: close to the expected total, but not always
equal to it. It looked to me as if the last thread sometimes had not completed
in time.
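
For reference, a minimal sketch of how an atom-based variant could be made to
wait for every worker. It rests on two assumptions that cannot be confirmed
from the post alone: that pmap's laziness let the atom be read before all
workers had run, and that binding the result with def (a single namespace-level
var shared by all threads) let one thread count another thread's sequence. The
names total-files, count-into-atom! and count-all! are hypothetical, and the
built-in file-seq stands in for the modified tree-seq.

```clojure
(ns pdir.atom-sketch
  (:require [clojure.java.io :as io]))

;; Hypothetical name: shared running total, updated by every worker.
(def total-files (atom 0))

(defn count-into-atom!
  "Counts the entries under dir (including dir itself, as file-seq does)
  and adds that count to total-files. Returns the count for dir."
  [^java.io.File dir]
  (let [n (count (file-seq dir))]   ; let-bound, so each call has its own value
    (swap! total-files + n)
    n))

(defn count-all!
  "Resets the total, counts every first-level entry of root in parallel,
  and returns the final total."
  [root]
  (reset! total-files 0)
  ;; pmap is lazy: dorun forces every element, so all swap! calls have
  ;; finished before the atom is dereferenced.
  (dorun (pmap count-into-atom! (.listFiles (io/file root))))
  @total-files)
```

Calling (count-all! "/some/root") returns only after every worker has updated
the atom. Directories are counted as entries too, as in the original tree-seq.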
So I came up with another version (file-seqC). Here pmap returns a list with
the file count for each thread, and I then reduce that list to get the final
result. This approach is solid and I'm quite happy to have figured it out, but
I'm curious whether it's possible to implement file-seqB using atoms. I'm still
not sure how I would know that all the entries in the pmap have completed
before the output is passed further. Should this use agents instead, dropping
the idea of pmap completely?
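
The agent alternative asked about here could look roughly like the following
sketch. It is not meant as a recommendation over pmap, only as an illustration
of how await addresses the "has everything completed?" question. The names
count-entries and count-all-with-agents are hypothetical, and using one agent
per first-level entry is an assumption made so that the counts can run
concurrently.

```clojure
(ns pdir.agent-sketch
  (:require [clojure.java.io :as io]))

(defn count-entries
  "Agent action: takes the agent's current state (a directory) and returns
  the number of entries under it, which becomes the agent's new state."
  [^java.io.File dir]
  (count (file-seq dir)))

(defn count-all-with-agents
  "Counts every first-level entry of root on its own agent and returns
  the sum once all agents have finished."
  [root]
  (let [agents (mapv agent (.listFiles (io/file root)))] ; one agent per entry
    (doseq [a agents]
      (send-off a count-entries))   ; send-off, because the action blocks on I/O
    (apply await agents)            ; blocks until every dispatched action has run
    (reduce + (map deref agents))))
```

As with pmap, the agent thread pool keeps the JVM alive for a while after the
work is done unless (shutdown-agents) is called at the end of the program.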

In the example below I've included the file-seqB function for reference.
Thanks,
Kuba



(ns pdir.core
 (:require [clojure.java.io])   ; -main calls clojure.java.io/file
 (:gen-class))


(defn burn []
 (dotimes [i 1000]     ;; Make it slow
   (reduce * (map float (take 1000 (iterate inc i))))))

; Commented out due to problems with the uberjar build:
; Unable to resolve symbol: ttt in this context, compiling:(pdir/core.clj:35:6)
;
;(defn file-seqB
;  "A tree seq on java.io.Files"
;  {:added "1.0"}
;  [dir]
;
;   ;(burn)                 ;; SLOW things down for testing
;
;    (def total (tree-seq
;     (fn [^java.io.File f] (. f (isDirectory)))
;     (fn [^java.io.File d] (seq (. d (listFiles))))
;     dir))
;
;    ; takes the number of the files in the directory and update the atom
;
;     (swap! ttt + (count total))  ; this gives sort of predictable results
;                                  ; but still not accurate
;
;
;    (println "Done: " dir (count total) @ttt)
;
;)

(defn file-seqC
 "Counts the entries in the tree of java.io.Files rooted at dir"
 {:added "1.0"}
 [dir]
 ;; Need to wrap the output of tree-seq in a local variable; otherwise the
 ;; pmap results won't be complete before the (reduce) function kicks in.
 (let [total (tree-seq
              (fn [^java.io.File f] (.isDirectory f))
              (fn [^java.io.File d] (seq (.listFiles d)))
              dir)]

   ;(burn)                 ;; SLOW things down for testing
   (println "Done: " dir (count total))

   (count total)))         ; returns the number of entries found under dir


(defn -main [& args]
 ;; Use let rather than def: def inside a function creates global vars.
 (let [root-path (first args)
       f         (clojure.java.io/file root-path)
       sub-dirs  (seq (.listFiles f))]   ; first level of folders, run concurrently
   (println "Root: " root-path)

   ;(println sub-dirs)

   (->> (pmap file-seqC sub-dirs)        ; list of total files in each folder
        (reduce +)                       ; sum them up
        (println "Total Files:"))        ; print

   ;(shutdown-agents) ; terminates the JVM, which otherwise lingers on the command line
   ))

(-main "/data/temp/kuba/aaa")
