[ https://issues.apache.org/jira/browse/FLINK-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777496#comment-16777496 ]
vinoyang commented on FLINK-11737:
----------------------------------

[~StephanEwen] updated.

Constructing {{org.apache.hadoop.mapreduce.lib.output.MultipleOutputs}} in Hadoop requires an instance of the {{TaskInputOutputContext}} interface, and the most common implementation of this interface is {{ReduceContextImpl}}. Constructing {{ReduceContextImpl}} in turn requires a {{RawKeyValueIterator}}, which means it needs an iterator over the input.

The lowest-level {{OutputFormat}} in Flink, however, follows a single-record output model ({{OutputFormat#writeRecord}}). As a result, the only way I can currently use {{MultipleOutputs}} is through a {{MapPartitionFunction}}, which gives me an {{Iterator}}.

What do you think of this issue?

cc [~fhueske]

> Support org.apache.hadoop.mapreduce.lib.output.MultipleOutputs output
> ---------------------------------------------------------------------
>
>                 Key: FLINK-11737
>                 URL: https://issues.apache.org/jira/browse/FLINK-11737
>             Project: Flink
>          Issue Type: Improvement
>          Components: Batch Connectors and Input/Output Formats
>            Reporter: vinoyang
>            Assignee: vinoyang
>            Priority: Major
>
> This issue is to improve Flink's compatibility with Hadoop. For the old
> version of the Hadoop API, there is
> {{org.apache.hadoop.mapred.lib.MultipleOutputFormat}}, which can be used
> directly. However, Flink does not currently support
> {{org.apache.hadoop.mapreduce.lib.output.MultipleOutputs}} from the new
> version of the Hadoop API.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)