Once I run a map-reduce job, I get output in the form of part-r-00000, part-r-00001, ...
In many cases the output is significantly smaller than the original input; the classic word count is one example. In most cases I want to combine the output into a single file, which may well live not on HDFS but on a more accessible file system.

Are there standard libraries or approaches for consolidating reducer output? A second map-reduce job that takes the output directory as its input is an OK start, but it would need a single reducer that writes a real file rather than ordinary reduce output.

--
Steven M. Lewis PhD
4221 105th Ave Ne
Kirkland, WA 98033
206-384-1340 (cell)
Institute for Systems Biology
Seattle WA