Thanks Bejoy. I tracked down the issue: there was an earlier table (with the 
LZO definition) that I had not dropped and recreated, so feeding Snappy input 
to it was causing the problems.
Regards
sanjay
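
For anyone hitting the same thing, here is a rough sketch of the drop-and-recreate step. The table name, the two columns, the paths and the partition values are all made up for illustration; only the partition keys and the delimiter follow the table definition quoted further down.

```sql
-- Hypothetical table name, columns and paths, for illustration only.
-- Dropping an EXTERNAL table removes the metastore entry; the data files stay in place.
DROP TABLE IF EXISTS old_header_table;

-- Recreate as a plain delimited text table, without the LZO-specific INPUTFORMAT clause.
CREATE EXTERNAL TABLE IF NOT EXISTS old_header_table (
  hbase_pk  STRING,
  header_id STRING
)
PARTITIONED BY (header_date STRING, header_servername STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/path/to/snappy/output';

-- Partitions have to be registered again after the table is recreated.
ALTER TABLE old_header_table ADD IF NOT EXISTS
  PARTITION (header_date = '2013-05-21', header_servername = 'server01')
  LOCATION '/path/to/snappy/output/2013-05-21/server01';
```

Because the table is external, the drop should not touch the Snappy files themselves, but it is worth confirming the table really is EXTERNAL before running it.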

From: bejoy...@yahoo.com
Reply-To: user@hive.apache.org, bejoy...@yahoo.com
Date: Thursday, May 23, 2013 7:31 AM
To: user@hive.apache.org
Subject: Re: Snappy with HIve

Hi

Please find responses below.

QUESTION 1
Do I have to give some INPUTFORMAT directive to make the Hive table read Snappy 
codec files?
For example, for LZO it is
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"

Bejoy: No custom input format is required. Add the Snappy codec to 
io.compression.codecs.
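
As a sketch of what that can look like from a Hive session (assuming the stock Hadoop codec classes are on the classpath and the native Snappy library is installed on the nodes; the property is more commonly set cluster-wide in core-site.xml, and the exact codec list below is only an example):

```sql
-- Register the Snappy codec alongside the usual ones for this session.
-- Assumption: these codec classes are available in your Hadoop build.
SET io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec;

-- Print the effective value to confirm it took.
SET io.compression.codecs;
```

With the codec registered, the text input format picks the decompressor from the file extension, so the MR output files would need to keep their .snappy suffix.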

QUESTION 2
For Hive scripts that read Snappy files and write Snappy files to Hive 
tables, are the following settings enough?
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

Bejoy: It should be fine. If you run into any issues, add 
mapred.output.compress=true as well.
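
Put together, a script along these lines might look like the sketch below. The target table header_summary_snappy and the date value are hypothetical; the source table is the one from the question.

```sql
-- Compress the final output of Hive queries with Snappy.
SET hive.exec.compress.output=true;
SET mapred.output.compress=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
-- As far as I know, the BLOCK/RECORD type only matters for SequenceFile output.
SET mapred.output.compression.type=BLOCK;

-- Hypothetical target table and partition value, for illustration only.
INSERT OVERWRITE TABLE header_summary_snappy
SELECT header_id, COUNT(*)
FROM outpdir_header_hive_only
WHERE header_date = '2013-05-21'
GROUP BY header_id;
```

The SET statements only affect the current session, so they need to be at the top of every script that writes to the Snappy tables.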
Regards
Bejoy KS

Sent from remote device, Please excuse typos
________________________________
From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com>
Date: Tue, 21 May 2013 23:30:09 +0000
To: user@hive.apache.org
Reply-To: user@hive.apache.org
Subject: Snappy with HIve

Hi guys

QUESTION 1
I have an MR job that creates Snappy Codec Output files.
My table definition is as follows
CREATE EXTERNAL TABLE IF NOT EXISTS outpdir_header_hive_only (
  hbase_pk STRING, header_servername_donotquery STRING,
  header_date_donotquery STRING, header_id STRING, header_hbpk STRING,
  header_channelId INT, header_searchAnnotation STRING,
  header_continuedSearchFlag INT, header_prodLow INT, header_prodTotal INT,
  header_sort INT, header_view INT, header_adNodes INT,
  header_spellingSuggestion STRING, header_queryType INT, header_nodeId INT,
  header_pinpointPtitleId INT, header_firedSearchRules STRING,
  header_rbAbsentSellers INT, header_shuffled INT,
  header_searchSessionId STRING, header_normalizationFlag STRING,
  header_relatedItemResultCount INT, header_unrankedSelectedPtitleIds INT,
  header_normKeyword STRING, header_kplEntry INT, header_isSaved STRING,
  header_rawProfileScore DOUBLE, header_normalizedProfileScore INT,
  header_scorerInfo STRING, header_contextNode INT, header_fbId STRING,
  norm_stem_keyword STRING, attrs_origNodeId INT, attrs_mfrId INT,
  attrs_sellerId INT, attrs_otherAttrs STRING, attrs_ptitleId INT,
  cached_date STRING, cached_recordId STRING, cached_visitorId STRING,
  cached_visit_id STRING, cached_appStyle STRING, cached_publisherId INT,
  cached_IP STRING, cached_source STRING, cached_refkw STRING,
  cached_pixeled INT, cached_searchRefineAttrImps STRING,
  cached_pageType STRING, cached_zipCode STRING, cached_zipType STRING,
  cached_perpage INT
)
PARTITIONED BY (header_date STRING, header_servername STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

Do I have to give some INPUTFORMAT directive to make the Hive table read Snappy 
codec files?
For example, for LZO it is
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"


QUESTION 2
For Hive scripts that read Snappy files and write Snappy files to Hive 
tables, are the following settings enough?
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

Thanks

sanjay

CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the 
intended recipient(s) and may contain confidential and privileged information. 
Any unauthorized review, use, disclosure or distribution is prohibited. If you 
are not the intended recipient, please contact the sender by reply email and 
destroy all copies of the original message along with any attachments, from 
your computer system. If you are the intended recipient, please be advised that 
the content of this message is subject to access, review and disclosure by the 
sender's Email System Administrator.