vinoyang created FLINK-12003:
--------------------------------

             Summary: Revert the config option about mapreduce.output.basename 
in HadoopOutputFormatBase
                 Key: FLINK-12003
                 URL: https://issues.apache.org/jira/browse/FLINK-12003
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Hadoop Compatibility
            Reporter: vinoyang
            Assignee: vinoyang


In the open method of {{HadoopOutputFormatBase}}, the config option 
{{mapreduce.output.basename}} was changed to "tmp", and no documentation 
mentions this change.

By default, Hadoop names its output files on HDFS using the format 
"part-x-yyyyy", where: 
 * {{x}} is either 'm' or 'r', depending on whether the job was a map-only job 
or included a reduce phase
 * {{yyyyy}} is the mapper or reducer task number (zero-based)
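
To make the naming pattern concrete, here is a minimal sketch of how such a file name is put together and how it changes once the basename is "tmp". The class {{OutputNameSketch}} and the method {{outputName}} are hypothetical names used only for illustration; this is not Hadoop's or Flink's actual code, only an assumed reconstruction of the "basename-x-yyyyy" pattern described above.

```java
import java.util.Locale;

// Hypothetical illustration class, not part of Hadoop or Flink.
final class OutputNameSketch {

    // Builds "<basename>-<x>-<yyyyy>": x is 'm' for a map-only job and 'r'
    // otherwise; yyyyy is the zero-based task number padded to five digits.
    static String outputName(String basename, boolean mapOnly, int taskNumber) {
        char type = mapOnly ? 'm' : 'r';
        return String.format(Locale.ROOT, "%s-%c-%05d", basename, type, taskNumber);
    }

    public static void main(String[] args) {
        // Default basename "part" versus the changed basename "tmp".
        System.out.println(outputName("part", false, 0));
        System.out.println(outputName("tmp", false, 0));
    }
}
```

With the basename set to "tmp", only the prefix of the name differs; the task type and task number parts stay the same.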

 

The prefix "part" is used in many places in users' business logic to match 
HDFS file names, so I suggest either reverting this config option or 
documenting the change.
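
As an illustration of the kind of user logic that could be affected, the sketch below filters a directory listing by the "part-" prefix. The class {{PartFilterSketch}} and the method {{selectPartFiles}} are hypothetical, and the file names are made-up examples; real user code may match the names differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of downstream user code, not taken from Flink.
final class PartFilterSketch {

    // Keeps only the names that start with the conventional "part-" prefix.
    static List<String> selectPartFiles(List<String> fileNames) {
        return fileNames.stream()
                .filter(name -> name.startsWith("part-"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Listing written with the default basename.
        System.out.println(selectPartFiles(Arrays.asList("part-r-00000", "_SUCCESS")));
        // Listing written with the basename changed to "tmp".
        System.out.println(selectPartFiles(Arrays.asList("tmp-r-00000", "_SUCCESS")));
    }
}
```

A filter like this finds nothing once the files are named with the "tmp" prefix, which is why the change should at least be documented.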

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
