That is really interesting... let me try to think of a reason. Meanwhile, are 
there any other LZO Hive samurais out there? Please help with some guidance.

sanjay

From: w00t w00t <w00...@yahoo.de>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>, w00t w00t <w00...@yahoo.de>
Date: Wednesday, August 14, 2013 1:15 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Hive and Lzo Compression


Thanks for your reply.

The interesting thing I experience is that the SELECT query still works - even 
when I do not specify the STORED AS clause... that puzzles me a bit.

________________________________
From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com>
To: "user@hive.apache.org" <user@hive.apache.org>; w00t w00t <w00...@yahoo.de>
Sent: 3:44 Wednesday, August 14, 2013
Subject: Re: Hive and Lzo Compression

Hi

I think the CREATE TABLE without the STORED AS clause will not give any errors 
while creating the table.
However, when you query that table, since it contains .lzo files, you would 
get errors.
With external tables, you are separating the table creation (definition) from 
the data, so Hive might report errors only when the table is queried.

LZO compression rocks! I am so glad I used it in our projects here.

Regards

sanjay

From: w00t w00t <w00...@yahoo.de>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>, w00t w00t <w00...@yahoo.de>
Date: Tuesday, August 13, 2013 12:13 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Hive and Lzo Compression

Thanks for your replies and the link.

I could get it working, but wondered why the CREATE TABLE statement worked 
without the STORED AS Clause as well...that's what puzzles me a bit...

But I will use the STORED AS Clause to be on the safe side.


________________________________
From: Lefty Leverenz <leftylever...@gmail.com>
To: user@hive.apache.org
CC: w00t w00t <w00...@yahoo.de>
Sent: 19:06 Saturday, August 10, 2013
Subject: Re: Hive and Lzo Compression

I'm not seeing any documentation link in Sanjay's message, so here it is again 
(in the Hive wiki's language manual):  
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO.


On Thu, Aug 8, 2013 at 3:30 PM, Sanjay Subramanian 
<sanjay.subraman...@wizecommerce.com> wrote:
Please refer to the documentation here.
Let me know if you need more clarification so that we can make this document 
better and more complete.

Thanks

sanjay

From: w00t w00t <w00...@yahoo.de>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>, w00t w00t <w00...@yahoo.de>
Date: Thursday, August 8, 2013 2:02 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Hive and Lzo Compression


Hello,

I have started running Hive with LZO compression on Hortonworks 1.2.

I have managed to install and configure LZO, and 
hive -e "set io.compression.codecs" shows me the LZO codecs:
io.compression.codecs=
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec,
org.apache.hadoop.io.compress.BZip2Codec

However, I have some questions and would be happy if you could help me.

(1) CREATE TABLE statement

I read in various postings that the CREATE TABLE statement needs the 
following STORED AS clause:

CREATE EXTERNAL TABLE txt_table_lzo (
   txt_line STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '||||'
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' 
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/user/myuser/data/in/lzo_compressed';

SELECT statements on this table now work without any problems on the LZO 
data.

However, I also created a table on the same data without this STORED AS clause:

CREATE EXTERNAL TABLE txt_table_lzo_tst (
   txt_line STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '||||'
LOCATION '/user/myuser/data/in/lzo_compressed';

The interesting thing is that it works as well when I execute a SELECT 
statement on this table.

Can you help me understand why the second CREATE TABLE statement works as well?
What should I use in DDLs?
Is it best practice to use the STORED AS clause with 
DeprecatedLzoTextInputFormat? Or should I remove it?

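A note on the first question: one way to see what differs between the two 
tables is to compare their storage metadata. The sketch below assumes both 
tables from the statements above exist. A table created without STORED AS 
falls back to Hive's default text format, whose 
org.apache.hadoop.mapred.TextInputFormat picks a decompression codec from the 
file extension, using the codecs registered in io.compression.codecs. That 
would explain why the plain table can still read .lzo files. The likely cost, 
assuming hadoop-lzo behaves as its documentation describes, is that such files 
are read unsplit (one mapper per file) and that any .lzo.index files in the 
directory may be read as data; DeprecatedLzoTextInputFormat is the format that 
uses the index files for splitting.

```sql
-- Compare the storage metadata of the two tables defined above.
-- In the output, look at the "InputFormat:" line of each table.
DESCRIBE FORMATTED txt_table_lzo;      -- created with STORED AS INPUTFORMAT ...
DESCRIBE FORMATTED txt_table_lzo_tst;  -- created without STORED AS (Hive default)

-- Sanity check: if .lzo.index files sit next to the data, the two counts
-- may differ, because only the LZO input format is expected to skip them.
SELECT COUNT(*) FROM txt_table_lzo;
SELECT COUNT(*) FROM txt_table_lzo_tst;
```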

(2) Output and Intermediate Compression Settings

I want to use output compression.

In "Programming Hive" by Capriolo, Wampler, and Rutherglen, the following 
commands are recommended:
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

However, in some forum posts, I found the following recommended settings:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;

Am I right that the first settings are for Hadoop versions prior to 0.23?
Or is there any other reason why the settings are different?

I am using Hadoop 1.1.2 with Hive 0.10.0.
Which settings would you recommend to use?
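
For reference, the two sets of names in this question appear to be the old and 
new spellings of the same Hadoop job properties: the mapred.* names belong to 
Hadoop 1.x, and the mapreduce.* names came in with the 0.23/2.x line, where 
the old names are kept as deprecated aliases. A minimal sketch for the Hadoop 
1.1.2 setup described here, assuming the LZO codecs are registered as shown 
earlier in this message:

```sql
-- Hive-level switch: compress the final output of queries.
SET hive.exec.compress.output=true;

-- Hadoop 1.x property names (the spelling Hadoop 1.1.2 knows).
SET mapred.output.compress=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

-- Hadoop 0.23 / 2.x spellings of the same two properties, for comparison only:
-- SET mapreduce.output.fileoutputformat.compress=true;
-- SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;
```

Running SET with just a property name and no value (for example 
SET mapred.output.compression.codec;) prints the current value, which is a 
quick way to confirm what a session is actually using.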

--------------
I also want to compress intermediate results.

Again, "Programming Hive" recommends the following settings:
SET hive.exec.compress.intermediate=true;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

Is this the right setting?

Or should I again use the following settings (which look more valid for 
Hadoop 0.23 and greater)?
SET hive.exec.compress.intermediate=true;
SET mapreduce.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
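
The same old/new naming split applies to map-output (intermediate) 
compression. In the sketch below the Hadoop 1.x names are active, matching the 
Hadoop 1.1.2 setup described here. The 0.23/2.x names in the comments follow 
Hadoop's deprecated-properties mapping, where the codec property is spelled 
mapreduce.map.output.compress.codec rather than the 
mapreduce.map.output.compression.codec quoted above. Treat this as an 
illustration of the naming, not as a tuned configuration.

```sql
-- Hive-level switch: compress data passed between stages of a query.
SET hive.exec.compress.intermediate=true;

-- Hadoop 1.x property names for map-output compression.
SET mapred.compress.map.output=true;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

-- Hadoop 0.23 / 2.x spellings of the same two properties, for comparison only:
-- SET mapreduce.map.output.compress=true;
-- SET mapreduce.map.output.compress.codec=com.hadoop.compression.lzo.LzopCodec;
```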

Thanks




CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the 
intended recipient(s) and may contain confidential and privileged information. 
Any unauthorized review, use, disclosure or distribution is prohibited. If you 
are not the intended recipient, please contact the sender by reply email and 
destroy all copies of the original message along with any attachments, from 
your computer system. If you are the intended recipient, please be advised that 
the content of this message is subject to access, review and disclosure by the 
sender's Email System Administrator.



-- Lefty






