Hi,

Any pointers as to what needs to be done to implement a storage handler for this functionality? What all would need to be taken care of? Would it be a small change, or something big?
-regards
Amit

----- Original Message ----
From: Edward Capriolo <[email protected]>
To: [email protected]
Sent: Mon, 2 August, 2010 7:58:55 PM
Subject: Re: How to mount/proxy a db table in hive

On Mon, Aug 2, 2010 at 2:33 AM, Sonal Goyal <[email protected]> wrote:
> Hi Amit,
>
> Hive needs data to be stored in its own namespace. Can you please explain
> why you want to call the database through Hive?
>
> Thanks and Regards,
> Sonal
> www.meghsoft.com
> http://in.linkedin.com/in/sonalgoyal
>
>
> On Mon, Aug 2, 2010 at 11:56 AM, amit jaiswal <[email protected]> wrote:
>>
>> Hi,
>>
>> I have a database and am looking for a way to 'mount' the db table in
>> hive in such a way that a select query in hive gets translated into a
>> SQL query against the database. I looked at DBInputFormat and Sqoop,
>> but found nothing that can create a proxy table in hive which
>> internally makes db calls.
>>
>> I also tried to use a custom variant of DBInputFormat as the input
>> format for the database table:
>>
>> create table employee (id int, name string) stored as
>>   INPUTFORMAT 'mycustominputformat'
>>   OUTPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileOutputFormat';
>>
>> select id from employee;
>>
>> This fails while running the hadoop job because HiveInputFormat only
>> supports FileSplits.
>>
>> HiveInputFormat:
>> public long getStart() {
>>   if (inputSplit instanceof FileSplit) {
>>     return ((FileSplit) inputSplit).getStart();
>>   }
>>   return 0;
>> }
>>
>> Any suggestions as to whether there is an InputFormat implementation
>> that can be used?
>>
>> -amit

Maybe the new 'storage handlers' would help. Storage handlers tie
together input formats, SerDes, and create/drop table functions. A
JDBC-backed storage handler would be a pretty neat thing.
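For context on Edward's suggestion: the storage-handler mechanism (HIVE-705, available from Hive 0.6) centers on the HiveStorageHandler interface, which bundles exactly the pieces mentioned above. A JDBC-backed handler would, roughly, supply an InputFormat that opens a database connection instead of reading files, plus a SerDe that maps result-set rows to Hive columns. The following is a minimal sketch only, assuming the Hive 0.6-era interface; the JdbcInputFormat, JdbcOutputFormat, and JdbcSerDe classes named here are hypothetical classes you would have to write yourself, not anything shipped with Hive:

```java
// Sketch only, against Hive 0.6-era interfaces. JdbcInputFormat,
// JdbcOutputFormat, and JdbcSerDe are hypothetical user-written classes.
public class JdbcStorageHandler implements HiveStorageHandler {
  private Configuration conf;

  @Override
  public Class<? extends InputFormat> getInputFormatClass() {
    return JdbcInputFormat.class;   // produces JDBC-backed (non-File) splits
  }

  @Override
  public Class<? extends OutputFormat> getOutputFormatClass() {
    return JdbcOutputFormat.class;  // could reject writes for a read-only proxy
  }

  @Override
  public Class<? extends SerDe> getSerDeClass() {
    return JdbcSerDe.class;         // maps ResultSet rows to Hive columns
  }

  @Override
  public HiveMetaHook getMetaHook() {
    return null;                    // hook CREATE/DROP TABLE here if needed
  }

  @Override
  public void configureTableJobProperties(TableDesc tableDesc,
      Map<String, String> jobProperties) {
    // copy connection URL, driver class, table name, etc. from the table's
    // SERDEPROPERTIES into the job so the InputFormat can see them
  }

  @Override
  public void setConf(Configuration conf) { this.conf = conf; }

  @Override
  public Configuration getConf() { return conf; }
}
```

Such a table would then be declared with STORED BY instead of STORED AS, following the pattern of the HBase handler (the property names below are illustrative, not a fixed convention):

create table employee (id int, name string)
stored by 'com.example.JdbcStorageHandler'
with serdeproperties (
  'jdbc.url' = 'jdbc:mysql://dbhost/hr',
  'jdbc.table' = 'employee'
);

So it is not a small change: beyond the handler class itself, you would need the custom InputFormat/splits, a SerDe, and possibly predicate translation if you want WHERE clauses pushed down to the database rather than evaluated in Hive.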
