[ https://issues.apache.org/jira/browse/FLINK-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647727#comment-14647727 ]
James Cao commented on FLINK-1919:
----------------------------------

Thanks!

> Add HCatOutputFormat for Tuple data types
> -----------------------------------------
>
>                 Key: FLINK-1919
>                 URL: https://issues.apache.org/jira/browse/FLINK-1919
>             Project: Flink
>          Issue Type: New Feature
>          Components: Java API, Scala API
>            Reporter: Fabian Hueske
>            Assignee: James Cao
>            Priority: Minor
>              Labels: starter
>
> It would be good to have an OutputFormat that can write data to HCatalog
> tables.
> The Hadoop `HCatOutputFormat` expects `HCatRecord` objects and writes them
> to HCatalog tables. We can do the same thing by creating these `HCatRecord`
> objects with a Map function that precedes a `HadoopOutputFormat` wrapping
> the Hadoop `HCatOutputFormat`.
> Better support for Flink Tuples can be added by implementing a custom
> `HCatOutputFormat` that also depends on the Hadoop `HCatOutputFormat` but
> internally converts Flink Tuples to `HCatRecords`. This would also include
> checking whether the schema of the HCatalog table and the Flink tuples match.
> For data types other than tuples, the OutputFormat could either require a
> preceding Map function that converts to `HCatRecords` or let users specify a
> MapFunction and invoke it internally.
> We already have a Flink `HCatInputFormat` that does this in the reverse
> direction, i.e., it emits Flink Tuples from HCatalog tables.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
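
For illustration only, the tuple-to-record conversion and schema check described in the issue could be sketched roughly as follows. This is a plain-Java stand-in and not the Flink or HCatalog API: `TupleToRecordSketch`, `Column`, `checkSchema` and `toRecord` are hypothetical names, a Flink Tuple is modelled as an `Object[]`, and an `HCatRecord` is modelled as a `List<Object>`. It assumes that matching means equal arity plus a compatible Java type per field, which is only one possible reading of the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the conversion step; not the Flink or HCatalog API.
final class TupleToRecordSketch {

    /** One column of the target table: its name and the Java type it accepts (assumed model). */
    static final class Column {
        final String name;
        final Class<?> type;

        Column(String name, Class<?> type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Throws if the tuple's arity or any non-null field type differs from the table schema. */
    static void checkSchema(List<Column> schema, Object[] tuple) {
        if (tuple.length != schema.size()) {
            throw new IllegalArgumentException(
                "Arity mismatch: table has " + schema.size()
                    + " columns, tuple has " + tuple.length + " fields");
        }
        for (int i = 0; i < tuple.length; i++) {
            Column column = schema.get(i);
            Object field = tuple[i];
            // Null fields are accepted here; a real table may or may not allow them.
            if (field != null && !column.type.isInstance(field)) {
                throw new IllegalArgumentException(
                    "Type mismatch in column '" + column.name + "': expected "
                        + column.type.getSimpleName() + ", got "
                        + field.getClass().getSimpleName());
            }
        }
    }

    /** Validates the tuple and copies its fields, in order, into a record-like list. */
    static List<Object> toRecord(List<Column> schema, Object[] tuple) {
        checkSchema(schema, tuple);
        return new ArrayList<>(Arrays.asList(tuple));
    }
}
```

In an actual OutputFormat the schema comparison would more plausibly run once when the format is configured, against the tuple's type information, rather than on every record as in this sketch.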