[jira] Created: (HADOOP-2237) Streaming command should be able to produce multiple outputs stored as separate DFS data sets

arkady borkovsky (JIRA) Tue, 20 Nov 2007 14:20:03 -0800

Streaming command should be able to produce multiple outputs stored as 
separate DFS data sets
----------------------------------------------------------------------------------------------


                 Key: HADOOP-2237
                 URL: https://issues.apache.org/jira/browse/HADOOP-2237
             Project: Hadoop
          Issue Type: Improvement
            Reporter: arkady borkovsky


Some streaming commands, when run in the map or reduce phase, produce several 
output files as a "side effect".
The names of these output files may be hard-coded or specified on the command 
line.

The streaming infrastructure should allow these files to be copied into DFS. 
For each distinct "output file name", a separate DFS dataset (DFS directory) 
should be created, and the file produced by each individual task should be 
stored there as a part file. By default, the names of these directories may be 
derived from the main "output name".
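
To illustrate the proposed layout, here is a minimal Python sketch of how a 
destination path could be computed for one task's side-output file. The 
function name side_output_path, the "<main output>_<file name>" directory rule, 
and the "part-NNNNN" numbering are assumptions made for illustration only; the 
issue itself only says that directory names may be derived from the main 
output name.

```python
import posixpath


def side_output_path(main_output, side_file_name, task_index):
    """Return a hypothetical DFS destination for one task's side-output file."""
    # Assumption: the dataset directory is named after the main output name
    # plus the base name of the side-output file.
    base = posixpath.basename(side_file_name)
    dataset_dir = f"{main_output.rstrip('/')}_{base}"
    # Assumption: each task stores its copy as one zero-padded part file.
    return posixpath.join(dataset_dir, f"part-{task_index:05d}")
```

Under these assumptions, task 3 of a job whose main output is /user/ab/out 
and whose command writes errors.log would store its copy at 
/user/ab/out_errors.log/part-00003.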

Related to https://issues.apache.org/jira/browse/HADOOP-2236: in the case of 
reduce, a single named output file may be seen as a special case of this.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
