[ https://issues.apache.org/jira/browse/HIVE-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain reassigned HIVE-3752:
--------------------------------
Assignee: Nitay Joffe
> Add a non-sql API in hive to access data.
> -----------------------------------------
>
> Key: HIVE-3752
> URL: https://issues.apache.org/jira/browse/HIVE-3752
> Project: Hive
> Issue Type: Improvement
> Reporter: Nitay Joffe
> Assignee: Nitay Joffe
>
> We would like to add an input/output format for accessing Hive data in Hadoop
> directly, without having to go through, e.g., a transform. Using a transform
> means running a whole map-reduce step with its own disk accesses and its
> imposed structure. It also means making Hive the base infrastructure for the
> entire system being developed, which is not the right fit because we only need
> a small part of it (access to the data).
> So we propose adding an API-level InputFormat and OutputFormat to Hive that
> make it easy to select a table with a partition spec and read from / write
> to it. We chose this design for compatibility with Hadoop, so that existing
> systems that work with Hadoop's IO API will work out of the box.
> We need this for the Giraph graph processing system
> (http://giraph.apache.org/), since running graph jobs that read from / write
> to Hive is a common use case.
> [~namitjain] [~aching] [~kevinwilfong] [~apresta]
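
To make "select a table with partition spec" concrete, here is a minimal sketch of what such a selection could carry. The class HiveTableSpec and its methods are hypothetical, made up for illustration; they are not the API proposed in this issue and not part of Hive. The only Hive behavior assumed is the usual key=value directory layout for partitions, and the sketch ignores the escaping Hive applies to special characters in partition values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical holder for "which table, which partition"; not a Hive class. */
final class HiveTableSpec {
    private final String database;
    private final String table;
    // LinkedHashMap preserves insertion order; partition directories are
    // nested in the order the partition keys are declared.
    private final Map<String, String> partitionSpec = new LinkedHashMap<>();

    HiveTableSpec(String database, String table) {
        this.database = database;
        this.table = table;
    }

    /** Adds one partition key/value pair and returns this for chaining. */
    HiveTableSpec partition(String key, String value) {
        partitionSpec.put(key, value);
        return this;
    }

    /** Database-qualified table name, e.g. "default.edges". */
    String qualifiedName() {
        return database + "." + table;
    }

    /** Relative partition directory in key=value form (no escaping applied). */
    String partitionPath() {
        StringJoiner path = new StringJoiner("/");
        for (Map.Entry<String, String> entry : partitionSpec.entrySet()) {
            path.add(entry.getKey() + "=" + entry.getValue());
        }
        return path.toString();
    }

    public static void main(String[] args) {
        // Table and partition names below are illustrative only.
        HiveTableSpec spec = new HiveTableSpec("default", "edges")
                .partition("ds", "2012-11-28")
                .partition("country", "US");
        System.out.println(spec.qualifiedName() + " -> " + spec.partitionPath());
    }
}
```

An InputFormat or OutputFormat along the lines described above would presumably take a description like this from the job configuration and resolve it against the metastore; that part is left out here because it depends on the design chosen in this issue.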
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira