[ https://issues.apache.org/jira/browse/HIVE-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nitay Joffe updated HIVE-3752:
------------------------------
    Status: Patch Available  (was: Open)

> Add a non-sql API in hive to access data.
> -----------------------------------------
>
>                 Key: HIVE-3752
>                 URL: https://issues.apache.org/jira/browse/HIVE-3752
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>
> We would like to add an input/output format for accessing Hive data in Hadoop
> directly without having to use e.g. a transform. Using a transform
> means having to do a whole map-reduce step with its own disk accesses and its
> imposed structure. It also means needing to have Hive be the base
> infrastructure for the entire system being developed which is not the right
> fit as we only need a small part of it (access to the data).
>
> So we propose adding an API level InputFormat and OutputFormat to Hive that
> will make it trivially easy to select a table with partition spec and read
> from / write to it. We chose this design to make it compatible with Hadoop so
> that existing systems that work with Hadoop's IO API will just work out of
> the box.
>
> We need this system for the Giraph graph processing system
> (http://giraph.apache.org/) as running graph jobs which read/write from Hive
> is a common use case.
>
> [~namitjain] [~aching] [~kevinwilfong] [~apresta]
>
> Input-side (HiveApiInputFormat) review: https://reviews.facebook.net/D7401