What about a .gz or tar file? Does it need to be unzipped in HDFS before it is 
loaded into Hive? How did you resolve it?


Sent from my BlackBerry, pls excuse typo

-----Original Message-----
From: "Connell, Chuck" <chuck.conn...@nuance.com>
Date: Sun, 30 Sep 2012 12:24:37 
To: user@hive.apache.org; Savant, Keshav <keshav.c.sav...@fisglobal.com>
Reply-To: user@hive.apache.org
Subject: RE: zip file or tar file cosumption

I have seen that error when I try to overwrite an existing file.

But, more importantly, Hive cannot understand ZIP files. There was a long 
thread about this just a few days ago. Your table def says "stored as textfile" 
but you are not giving it a text file.
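
A minimal sketch of how one might check what kind of file is actually being 
handed to a TEXTFILE table, by looking at its leading "magic" bytes. The helper 
names (sniff_format, make_samples) are hypothetical and only for illustration; 
the samples are built in memory rather than read from HDFS.

```python
import gzip
import io
import zipfile

# Leading "magic" bytes of the two compressed formats discussed in this thread.
ZIP_MAGIC = b"PK\x03\x04"   # local file header of a non-empty ZIP archive
GZIP_MAGIC = b"\x1f\x8b"    # gzip member header


def sniff_format(data: bytes) -> str:
    """Classify a byte string as 'zip', 'gzip', or 'other' by its first bytes."""
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    return "other"


def make_samples() -> dict:
    """Build the same small text payload as plain text, gzip, and zip."""
    text = b"a b c\nd e f\n"
    zbuf = io.BytesIO()
    with zipfile.ZipFile(zbuf, "w") as zf:
        zf.writestr("11sep12.txt", text)
    return {"plain": text, "gzip": gzip.compress(text), "zip": zbuf.getvalue()}


if __name__ == "__main__":
    for name, payload in make_samples().items():
        print(name, "->", sniff_format(payload))
```

For a real file, the first few bytes could be read and passed to sniff_format 
in the same way.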

Chuck


________________________________
From: Manish [manishbh...@rocketmail.com]
Sent: Sunday, September 30, 2012 7:38 AM
To: Savant, Keshav
Cc: user@hive.apache.org
Subject: RE: zip file or tar file cosumption


I am getting the error below when loading a zip file:

Driver returned: 9.  Errors: Hive history 
file=/tmp/hue/hive_job_log_hue_201209300434_1768401171.txt
Loading data to table default.pageview_zip
Failed with exception Error moving: 
hdfs://localhost:54310/user/manish/input/zip/11sep12.zip into: 
/user/manish/input/zip
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask

My load statement is: LOAD DATA INPATH '/user/manish/input/11sep12.zip' 
OVERWRITE INTO TABLE `pageview_zip`

Table definition:
CREATE external TABLE pageview_zip
(
C_0 STRING,
C_1 STRING,
C_7 MAP<STRING,STRING>,
C_8 STRING,
C_13 MAP<STRING,STRING>,
C_21 STRING
)
COMMENT 'Page View'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' COLLECTION ITEMS TERMINATED BY 
';' MAP KEYS TERMINATED BY '='
STORED AS TEXTFILE LOCATION '/user/manish/input/zip'

Thank You,
Manish



On Thu, 2012-09-27 at 11:11 +0000, Savant, Keshav wrote:
True Manish.



Keshav C Savant




From: Manish.Bhoge [mailto:manish.bh...@target.com]
Sent: Thursday, September 27, 2012 4:26 PM
To: user@hive.apache.org; manishbh...@rocketmail.com
Subject: RE: zip file or tar file cosumption




Thanks Savant. I believe this will hold good for .zip files as well.



Thank You,

Manish.



From: Savant, Keshav [mailto:keshav.c.sav...@fisglobal.com]
Sent: Thursday, September 27, 2012 10:19 AM
To: user@hive.apache.org; manishbh...@rocketmail.com
Subject: RE: zip file or tar file cosumption




Manish, the table created for zipped text files should be defined as a 
sequence file, for example:



CREATE TABLE my_table_zip(col1 STRING,col2 STRING) ROW FORMAT DELIMITED FIELDS 
TERMINATED BY ',' stored as sequencefile;



After this you can use the regular load command to load these files, for example:



load data local inpath 'path-to-csv-file.gz' into table my_table_zip;



hope this helps



Keshav C Savant




From: Manish Bhoge [mailto:manishbh...@rocketmail.com]
Sent: Wednesday, September 26, 2012 9:43 PM
To: user@hive.apache.org
Subject: Re: zip file or tar file cosumption




Hi Richin,

Thanks! Yes, this is what I wanted to understand: how to load a zip file into a 
Hive table. I'll try this option now.

Thank You,
Manish.



________________________________
From: <richin.j...@nokia.com>
Date: Wed, 26 Sep 2012 14:51:39 +0000
To: <user@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: RE: zip file or tar file cosumption





You are right Chuck. I thought his question was how to use zip files or any 
compressed files in Hive tables.



Yeah, it seems you can’t do that; see: 
http://mail-archives.apache.org/mod_mbox/hive-user/201203.mbox/%3CCAENxBwxkF--3PzCkpz1HX21=gb9yvasr2jl0u3yul2tfgu0...@mail.gmail.com%3E

But you can always compress your files in gzip format and they should be good 
to go.



Richin



From: ext Connell, Chuck [mailto:chuck.conn...@nuance.com]
Sent: Wednesday, September 26, 2012 10:44 AM
To: user@hive.apache.org
Subject: RE: zip file or tar file cosumption




But TEXTFILE in Hive always has newline as the record delimiter. How could this 
possibly work with a zip/tar file that can contain ASCII 10 characters at 
random locations, and certainly does not have ASCII 10 at the end of each data 
record?
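
To illustrate the point, here is a small sketch (not taken from Hive itself) 
that builds a ZIP archive in memory and then splits its raw bytes on ASCII 10, 
the way a newline-delimited reader would. The function names are hypothetical. 
The pieces produced by the naive split do not correspond to the original 
records, while decompressing the archive first recovers them.

```python
import io
import zipfile


def zip_bytes(lines):
    """Build an in-memory ZIP archive holding one text member, one record per line."""
    payload = "".join(line + "\n" for line in lines).encode("utf-8")
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("records.txt", payload)
    return buf.getvalue()


def split_on_newline(raw):
    """Split raw bytes on ASCII 10, as a newline-delimited text reader would."""
    return raw.split(b"\n")


def read_records(raw):
    """Decompress the archive first, then split the text into records."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.read("records.txt").decode("utf-8").splitlines()


if __name__ == "__main__":
    records = [f"user{i} page{i % 7} k{i}=v{i}" for i in range(1000)]
    raw = zip_bytes(records)
    naive = split_on_newline(raw)
    print("records written:      ", len(records))
    print("pieces after split:   ", len(naive))
    print("naive split correct?  ", naive == [r.encode("utf-8") for r in records])
    print("decompressed correct? ", read_records(raw) == records)
```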



Chuck Connell

Nuance R&D Data Team

Burlington, MA






From: richin.j...@nokia.com [mailto:richin.j...@nokia.com]
Sent: Wednesday, September 26, 2012 10:14 AM
To: user@hive.apache.org; manishbh...@rocketmail.com
Subject: RE: zip file or tar file cosumption




Hi Manish,



If you have your zip file at the location /home/manish/zipfile, you can just 
point your external table to that location, like:

CREATE EXTERNAL TABLE manish_test (field1 string, field2 string) ROW FORMAT 
DELIMITED FIELDS TERMINATED BY <your_column_delimiter> STORED AS TEXTFILE 
LOCATION '/home/manish/zipfile';



OR



If you already have an external table pointing to a certain location, you can 
load this zip file into your table with:

LOAD DATA INPATH '/home/manish/zipfile' INTO TABLE manish_test;



Hope this helps.



Richin



From: ext Manish Bhoge [mailto:manishbh...@rocketmail.com]
Sent: Wednesday, September 26, 2012 9:13 AM
To: user@hive.apache.org
Subject: Re: zip file or tar file cosumption




Hi Savant,

Got it. But I still need to understand how to load a zip file. Can I use a zip 
file directly in an external table? Could you please help with the load 
statement?



________________________________
From: "Savant, Keshav" <keshav.c.sav...@fisglobal.com>
Date: Wed, 26 Sep 2012 12:25:38 +0000
To: user@hive.apache.org
Reply-To: user@hive.apache.org
Cc: manish.bh...@target.com; chuck.conn...@nuance.com
Subject: RE: zip file or tar file cosumption





Another solution would be



Using a shell script, do the following:

1.      unzip the txt files,

2.      merge those 50 (or N) text files one by one into a single text file,

3.      then zip/tar that bigger text file,

4.      then that big zip/tar file can be uploaded into Hive.
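
The steps above can be sketched roughly as follows. This is only an 
illustration under stated assumptions: it is written in Python rather than as a 
shell script, the function and file names are hypothetical, and it writes gzip 
rather than zip/tar, since gzip is the format that others in this thread report 
as working with Hive.

```python
import gzip
import os
import tempfile
import zipfile


def zip_to_gzip(zip_path, gz_path):
    """Concatenate every .txt member of zip_path into one gzip-compressed text file.

    Returns the number of members merged.
    """
    merged = 0
    with zipfile.ZipFile(zip_path) as zf, gzip.open(gz_path, "wb") as out:
        for name in sorted(zf.namelist()):
            if not name.lower().endswith(".txt"):
                continue  # skip anything that is not one of the text files
            data = zf.read(name)
            out.write(data)
            if data and not data.endswith(b"\n"):
                # Keep the last record of one file from fusing with the next.
                out.write(b"\n")
            merged += 1
    return merged


def demo():
    """Build a tiny two-member archive in a temp dir and convert it."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "11sep12.zip")
        gz_path = os.path.join(tmp, "11sep12.txt.gz")
        with zipfile.ZipFile(zip_path, "w") as zf:
            zf.writestr("part1.txt", "a b c\n")
            zf.writestr("part2.txt", "d e f")  # no trailing newline on purpose
        count = zip_to_gzip(zip_path, gz_path)
        with gzip.open(gz_path, "rt") as fh:
            return count, fh.read()


if __name__ == "__main__":
    print(demo())
```

The resulting .gz file would then be the one passed to a load statement like 
those quoted elsewhere in this thread.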



Keshav C Savant




From: Connell, Chuck [mailto:chuck.conn...@nuance.com]
Sent: Wednesday, September 26, 2012 4:04 PM
To: user@hive.apache.org
Subject: RE: zip file or tar file cosumption




This could be a problem. Hive uses newline as the record separator. A ZIP file 
will certainly contain newline characters. So I doubt this is possible.

BUT, I would like to hear from anyone who has solved the "newline is always a 
record separator" problem, because we ran into it for another type of 
compressed file.

Chuck

________________________________
From: Manish.Bhoge [manish.bh...@target.com]
Sent: Wednesday, September 26, 2012 3:17 AM
To: user@hive.apache.org
Subject: zip file or tar file cosumption


Hivers,



I want to understand whether it is possible to use zip/tar files directly in 
Hive. All the files have a similar schema (structure). Say 50 *.txt files are 
zipped into a single zip file: can we load data directly from this zip file, or 
do we need to unzip first?



Thanks & Regards

Manish Bhoge | Technical Architect | TargetDW/BI | +919379850010 (M) Ext: 5691 
VOIP: 22165 | “Excellence is not a skill, It is an attitude.” 
MySite<http://mysites.target.com/personal/z063783>




_____________
The information contained in this message is proprietary and/or confidential. 
If you are not the intended recipient, please: (i) delete the message and all 
copies; (ii) do not disclose, distribute or use the message in any manner; and 
(iii) notify the sender immediately. In addition, please be aware that any 
message addressed to our domain is subject to archiving and review by persons 
other than the intended recipient. Thank you.





