-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3265/
-----------------------------------------------------------
Review request for mahout.


Summary
-------

MAHOUT-922-2: add DistributedCache broadcast for B' files in the AB' job and for R-hat files in the B' job. Broadcast is on by default and is governed by the -br option.

Notes:

Performance: I did not notice a difference between using the distributed cache and opening direct streams, which is understandable since the jobs are CPU-bound.

I did have to add some functionality to the multi-file sequence file iterators to allow specifying multiple files coming from the distributed cache, that is, a set of files that is neither a glob nor a directory. I also fixed some corner-case NPEs there.

Sorry, the Eclipse reformatting for style is a bit different from Sean's original formatting in IntelliJ; it is hard to match it exactly.


This addresses bug MAHOUT-922.
    https://issues.apache.org/jira/browse/MAHOUT-922


Diffs
-----

Diff: https://reviews.apache.org/r/3265/diff


Testing
-------


Thanks,

Dmitriy