Re: Reading compressed files (external tables) from hive using DeprecatedLzoTextInputFormat

2012-01-27 Thread alo alt
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

CREATE EXTERNAL TABLE tmp_hive(domain string, url string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
  INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
  OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
LOCATION '/tmp/test2';

Check whether the LZO codec is available: the LZO codec classes must be
listed in io.compression.codecs.
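
One way to do that check from the Hive CLI, as a minimal sketch: SET with a
key and no value prints the value the current session sees. The second
statement is an assumption, not something confirmed in this thread: it
presumes the hadoop-lzo jar and native libraries are already installed and
on Hive's classpath, and it changes the setting for the current session only.

```sql
-- Print the codec list that the current Hive session sees.
SET io.compression.codecs;

-- Assumption: hadoop-lzo is installed. Append its codec classes for this session only.
SET io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec;
```

If neither com.hadoop.compression.lzo.LzoCodec nor
com.hadoop.compression.lzo.LzopCodec appears in the first output, that would
be consistent with the "No LZO codec found" error. A permanent fix is
normally made in the cluster's Hadoop configuration rather than per session.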

- Alex 

--
Alexander Lorenz
http://mapredit.blogspot.com

On Jan 26, 2012, at 12:18 AM, Sam William wrote:

> 
> I have some data generated from a Pig script which is LZO compressed.
> No indexer has been run on this data. I created an external table in Hive
> on top of this data. Here is the create table script.
> 
> 
> 
> CREATE EXTERNAL TABLE tmp_hive(domain string,url string)  ROW FORMAT 
> DELIMITED FIELDS TERMINATED BY '\t'  STORED AS INPUTFORMAT 
> "com.hadoop.mapred.DeprecatedLzoTextInputFormat"  OUTPUTFORMAT 
> "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" LOCATION  
> '/tmp/test2';
> 
> However, when I try to query this table, I get this error:
> 
> Failed with exception java.io.IOException:java.io.IOException: No LZO codec 
> found, cannot run.
> 
> 
> What am I missing?  Any help is appreciated.
> 
> 
> Thanks,
> Sam William
> sa...@stumbleupon.com
> 
> 
> 



Reading compressed files (external tables) from hive using DeprecatedLzoTextInputFormat

2012-01-25 Thread Sam William

I have some data generated from a Pig script which is LZO compressed. No
indexer has been run on this data. I created an external table in Hive on top
of this data. Here is the create table script.



CREATE EXTERNAL TABLE tmp_hive(domain string,url string)  ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '\t'  STORED AS INPUTFORMAT 
"com.hadoop.mapred.DeprecatedLzoTextInputFormat"  OUTPUTFORMAT 
"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" LOCATION  
'/tmp/test2';

However, when I try to query this table, I get this error:

Failed with exception java.io.IOException:java.io.IOException: No LZO codec 
found, cannot run.


What am I missing?  Any help is appreciated.


Thanks,
Sam William
sa...@stumbleupon.com