Hi Folks

I've had a look in the various build scripts and in the manifest of the 
jar itself, but I am unable to identify which version of Thrift is stored 
in thrift.jar. I can't see any references to a Maven repository where it 
is pulled down from, either. Can someone tell me which version of Thrift 
this is, and where the jar was built and downloaded from?

Regards
Steve Watt 



From: amit jaiswal <[email protected]>
To: [email protected]
Date: 08/02/2010 01:46 AM
Subject: Re: How to mount/proxy a db table in hive



The original data is stored in a database, and there is no need to create 
a separate copy of it in HDFS for every job. Extending the notion of a 
database, the data could be stored in any storage system. One way of 
abstracting this out would be to implement an InputFormat that knows how 
to read the data and provides the correct InputSplit and RecordReader 
implementations. The custom input format that I mentioned works fine in a 
pure Hadoop job.
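As a rough illustration of that pattern, the RecordReader side might look like the sketch below. The class name, constructor arguments, and single-column row handling are all hypothetical, and this targets the old org.apache.hadoop.mapred API of that era; it is a sketch of the idea, not a tested implementation.

```java
// Sketch only (hypothetical names, old org.apache.hadoop.mapred API):
// a RecordReader that pulls rows over JDBC instead of from HDFS. A
// matching InputFormat would hand one of these out from
// getRecordReader(), and emit splits from getSplits() (for example,
// one split per primary-key range of the table).
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.RecordReader;

public class JdbcRecordReader implements RecordReader<LongWritable, Text> {
  private final Connection conn;
  private final ResultSet rs;
  private long row = 0;

  public JdbcRecordReader(String jdbcUrl, String query) throws IOException {
    try {
      conn = DriverManager.getConnection(jdbcUrl);
      Statement stmt = conn.createStatement();
      // The actual SQL lives inside the input format, as suggested above.
      rs = stmt.executeQuery(query);
    } catch (SQLException e) {
      throw new IOException(e);
    }
  }

  public boolean next(LongWritable key, Text value) throws IOException {
    try {
      if (!rs.next()) {
        return false;
      }
      key.set(row++);
      // Single-column demo; real code would serialize all columns.
      value.set(rs.getString(1));
      return true;
    } catch (SQLException e) {
      throw new IOException(e);
    }
  }

  public LongWritable createKey() { return new LongWritable(); }
  public Text createValue() { return new Text(); }
  public long getPos() { return row; }
  public float getProgress() { return 0.0f; } // row count unknown up front

  public void close() throws IOException {
    try {
      rs.close();
      conn.close();
    } catch (SQLException e) {
      throw new IOException(e);
    }
  }
}
```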

Is it possible to leverage the input format support in Hive table creation 
to make such queries work? Even support for just 'select * from <table>' 
would be sufficient, since the actual SQL query can be part of the 
InputFormat implementation.

-amit


From: Sonal Goyal <[email protected]>
To: [email protected]
Sent: Mon, 2 August, 2010 12:03:32 PM
Subject: Re: How to mount/proxy a db table in hive

Hi Amit,

Hive needs data to be stored in its own namespace. Can you please explain 
why you want to call the database through Hive?
 
Thanks and Regards,
Sonal
www.meghsoft.com
http://in.linkedin.com/in/sonalgoyal


On Mon, Aug 2, 2010 at 11:56 AM, amit jaiswal <[email protected]> wrote:
Hi,

I have a database and am looking for a way to 'mount' the db table in hive 
in such a way that a select query in hive gets translated to a sql query 
for the database. I saw DBInputFormat and sqoop, but nothing that can 
create a proxy table in hive which internally makes db calls.

I also tried to use a custom variant of DBInputFormat as the input format 
for the database table.

create table employee (id int, name string)
stored as INPUTFORMAT 'mycustominputformat'
OUTPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileOutputFormat';

select id from employee;

This fails while running the hadoop job, because HiveInputFormat only 
supports FileSplits.

HiveInputFormat:
   public long getStart() {
     if (inputSplit instanceof FileSplit) {
       return ((FileSplit)inputSplit).getStart();
     }
     return 0;
   }
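Given that instanceof check, one conceivable workaround is to make the custom split extend FileSplit with a dummy path, so that Hive's wrapper code accepts it while the real database-specific state rides alongside. The class below is a hypothetical sketch against the old org.apache.hadoop.mapred API, not verified against any particular Hive version.

```java
// Sketch only (hypothetical class, old org.apache.hadoop.mapred API):
// a DB-backed split that extends FileSplit so it passes Hive's
// "instanceof FileSplit" check. getStart()/getLength() are meaningless
// for a database split, but the zero-length dummy file keeps
// HiveInputFormat happy while the SQL for this split is carried in an
// extra serialized field.
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileSplit;

public class DbTableSplit extends FileSplit {
  private String query; // SQL fragment this split should run

  public DbTableSplit() { // required for Writable deserialization
    super((Path) null, 0, 0, (String[]) null);
  }

  public DbTableSplit(Path dummyPath, String query) {
    super(dummyPath, 0, 0, (String[]) null);
    this.query = query;
  }

  public String getQuery() {
    return query;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    super.write(out);           // serialize the dummy FileSplit fields
    out.writeUTF(query);        // then the DB-specific state
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    super.readFields(in);
    query = in.readUTF();
  }
}
```

The custom InputFormat's getRecordReader() would then cast the incoming split back to DbTableSplit and run getQuery() against the database.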

Any suggestions on whether there is an InputFormat implementation that can 
be used?

-amit

