Can you describe the format of the input file in a bit more detail?

Is it a set of serialized Thrift records of the same class type? The current 
ThriftDeserializer expects serialized records to be embedded inside a 
BytesWritable (we make sure of this during the loading process), which is 
probably not the scenario for most people. We haven't gotten around to fixing 
this yet.
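
To illustrate what "embedded inside a BytesWritable" means at the byte level, 
here is a minimal Python sketch, not the actual Hive loading code. It relies 
only on the fact that Hadoop's BytesWritable serializes itself as a 4-byte 
big-endian length followed by the raw bytes. The function names are made up for 
this example, the payloads are placeholder bytes standing in for serialized 
Thrift records, and a real SequenceFile adds its own header, keys, and sync 
markers on top of this.

```python
import struct


def wrap_record(payload: bytes) -> bytes:
    """Frame one opaque record the way BytesWritable writes itself:
    a 4-byte big-endian length, then the raw bytes."""
    return struct.pack(">i", len(payload)) + payload


def unwrap_records(buf: bytes) -> list:
    """Split a buffer of back-to-back framed records into their payloads."""
    records = []
    offset = 0
    while offset < len(buf):
        # Read the length prefix, then slice out exactly that many bytes.
        (length,) = struct.unpack_from(">i", buf, offset)
        offset += 4
        records.append(buf[offset:offset + length])
        offset += length
    return records


if __name__ == "__main__":
    # Placeholder payloads; in practice these would be serialized Thrift bytes.
    framed = wrap_record(b"record-one") + wrap_record(b"record-two")
    print(unwrap_records(framed))
```

The point is that the deserializer is handed each record as an opaque byte 
array with a known length, rather than having to find record boundaries inside 
a raw stream of Thrift bytes.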

-----Original Message-----
From: Stephen Corona [mailto:scor...@adknowledge.com] 
Sent: Friday, March 06, 2009 8:05 PM
To: hive-user@hadoop.apache.org
Subject: RE: Querying JSON/Thrift data?

I took a look at this class and gave it a shot, but I'm not exactly sure what 
the create table syntax should look like. I tried this:

hive> create table testing ( uid int, name string ) 
    > row format serializer 'org.apache.hadoop.hive.serde2.ThriftDeserializer'
    > ;
FAILED: Parse Error: line 2:7 mismatched input 'table' expecting TEMPORARY in 
create function statement

Steve Corona
________________________________________
From: Prasad Chakka [pra...@facebook.com]
Sent: Friday, March 06, 2009 7:33 PM
To: hive-user@hadoop.apache.org
Subject: Re: Querying JSON/Thrift data?

Can you use ThriftDeserializer? Look at Complex class to see how it is used.

Prasad


________________________________
From: Stephen Corona <scor...@adknowledge.com>
Reply-To: <hive-user@hadoop.apache.org>
Date: Fri, 6 Mar 2009 16:02:02 -0800
To: <hive-user@hadoop.apache.org>
Subject: RE: Querying JSON/Thrift data?



________________________________________
From: Stephen Corona
Sent: Friday, March 06, 2009 6:16 PM
To: hive-user-subscr...@hadoop.apache.org
Subject: Querying JSON/Thrift data?

Hey guys,

I am currently loading data into Hive in a comma-delimited (CSV) format. This 
works but turns out to be a huge pain when adding and removing columns (since 
they can only be added to the end of the table). Is there any way to load and 
query data that's in some sort of JSON/Thrift format? That way each value is 
already associated with a named column instead of depending on its position in 
the row. I am pretty open on which format to use and how to load it into Hive. 
FWIW, our data is generated in PHP and pushed to Scribe. Scribe aggregates the 
CSV files and we load them into Hive every night.
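
A small Python sketch of the problem described above, under stated assumptions: 
the uid and name fields are borrowed from the test table elsewhere in this 
thread, the country field and both helper functions are hypothetical, and 
nothing here involves Hive itself. It only shows why positional CSV rows break 
when a column is inserted in the middle, while keyed JSON records do not.

```python
import csv
import io
import json


def parse_csv(text, columns):
    """Map each CSV row onto column names purely by position."""
    return [dict(zip(columns, row)) for row in csv.reader(io.StringIO(text))]


def parse_json_lines(text):
    """Parse one JSON object per line; fields are matched by name."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # The producer started writing a new middle column (country),
    # but the reader still uses the old two-column layout.
    new_csv = "1,us,steve\n"
    print(parse_csv(new_csv, ["uid", "name"]))  # name picks up the wrong value

    # The same record as keyed JSON: the extra field is simply ignored
    # by a reader that only looks up uid and name.
    new_json = '{"uid": 1, "country": "us", "name": "steve"}\n'
    record = parse_json_lines(new_json)[0]
    print(record["uid"], record["name"])
```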

Thanks,

Steve
