hadoop fs -text cannot get .deflate file decompressed

2013-01-30 Thread Richard
I got some Hive-generated files with a .deflate extension. I know these are compressed files. It is not my data, so I cannot change the option to uncompressed. I just want to view the file content. But when I use hadoop fs -text, I cannot get plaintext output. The output is still binary. How can I fi
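One possible workaround, sketched here under assumptions rather than taken from the thread: if the .deflate file is a plain zlib-wrapped DEFLATE stream (which is what Hadoop's default codec is generally understood to write for text output) and not a SequenceFile, the raw bytes can be inflated outside Hadoop. The helper name `inflate` is hypothetical.

```python
import zlib


def inflate(data: bytes) -> bytes:
    """Inflate a zlib-wrapped DEFLATE stream.

    Assumption: the .deflate file is a single zlib stream (header + DEFLATE
    data + checksum), not a SequenceFile or another container format.
    """
    decompressor = zlib.decompressobj()
    # decompress() returns the inflated bytes; flush() returns any remainder.
    return decompressor.decompress(data) + decompressor.flush()
```

The bytes could come from a local copy of the file (for example one fetched with hadoop fs -get) and be passed to `inflate` after reading the file in binary mode. If zlib raises an error, the file is probably not a bare zlib stream and this sketch does not apply.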

Re: delay before query starts processing

2013-01-30 Thread Ariel Marcus
From the archives: http://mail-archives.apache.org/mod_mbox/hive-user/201110.mbox/%3CCAC9SPjuQtxOK1KtEmReD6OanNTgNM_uLkGQD+=n7krcjcal...@mail.gmail.com%3E TL;DR set hive.optimize.s3.query=true; - Ariel Marcus, Consultant www.openbi.com | ariel.mar...@openbi.com 15

Re: delay before query starts processing

2013-01-30 Thread Abdelrahman Shettia
Hi Marc, You can try running the hive client with debug mode on and see what it is trying to do at the JT level. hive -hiveconf hive.root.logger=ALL,console -e " DDL;" hive -hiveconf hive.root.logger=ALL,console -f ddl.sql ; Hope this helps. Thanks -Abdelrahman On Wed, Jan 30, 2013 at 3:16 PM, M

Re: ALTER TABLE CHANGE COLUMN issue

2013-01-30 Thread Nitin Pawar
It will not work on the old partition, because the old data did not have this new column in the metadata for the old partition. Your new metadata applies only to new partitions. Always remember there is nothing called update or alter row in Hive. Alter is only on the table metadata from that time onwards. If you rea

Re: ALTER TABLE CHANGE COLUMN issue

2013-01-30 Thread Mark Grover
Hardik, The schema is associated per partition. It sounds to me that the structure of your data remains the same, you are just expressing it differently in your Hive table. If your table is external this is no biggie, just drop the external table, re-create it and re-add the partitions. If not,

Re: ALTER TABLE CHANGE COLUMN issue

2013-01-30 Thread hardik doshi
Thanks, Nitin & Dean. My hive table is backed by data files in hdfs and they do contain the additional field that I am adding in my hive table schema. I noticed that if I remove partitions and recreate them after changing the column type, it works. But it does not work on old partition for some

Re: The dreaded Heap Space Issue on a Transform

2013-01-30 Thread John Omernik
I am realizing one of my challenges is that I have quite a few cores and map tasks per node, but (I didn't set it up) I am only running 4 GB per physical core (12) with 18 map slots. I am guessing right now that any given time, with 18 map slots, the 1.8 total GB of ram I am assigning to the so

Re: The dreaded Heap Space Issue on a Transform

2013-01-30 Thread Dean Wampler
We didn't ask yet, but to be sure, are all the slave nodes configured the same, both in terms of hardware and other apps, if any, running on them? On Wed, Jan 30, 2013 at 10:14 AM, Richard Nadeau wrote: > What do you have set in core-site.XML for io.sort.mb, io.sort.factor, and > io.file.

Re: The dreaded Heap Space Issue on a Transform

2013-01-30 Thread Richard Nadeau
What do you have set in core-site.XML for io.sort.mb, io.sort.factor, and io.file.buffer.size? You should be able to adjust these and get past the heap issue. Be careful about how much RAM you have though, and don't set them too high. Rick On Jan 30, 2013 8:55 AM, "John Omernik" wrote: > So it's f

Re: The dreaded Heap Space Issue on a Transform

2013-01-30 Thread John Omernik
So it's filling up on the emitting stage, so I am guessing I need to look at the task logs and/or my script that's printing to stdout as the likely culprits. On Wed, Jan 30, 2013 at 9:11 AM, Philip Tromans wrote: > That particular OutOfMemoryError is happening on one of your hadoop nodes. > It'

Re: The dreaded Heap Space Issue on a Transform

2013-01-30 Thread Philip Tromans
That particular OutOfMemoryError is happening on one of your hadoop nodes. It's the heap within the process forked by the hadoop tasktracker, I think. Phil. On 30 January 2013 14:28, John Omernik wrote: > So just a follow-up. I am less looking for specific troubleshooting on how > to fix my pr

Re: The dreaded Heap Space Issue on a Transform

2013-01-30 Thread John Omernik
So just a follow-up. I am less looking for specific troubleshooting on how to fix my problem, and more looking for a general understanding of heap space usage with Hive. When I get an error like this, is it heap space on a node, or heap space on my hive server? Is it the heap space of the tasktra

Re: Run hive queries, and collect job information

2013-01-30 Thread Mathieu Despriee
Fantastic. Thanks ! 2013/1/30 Qiang Wang > Every hive query has a history file, and you can get these info from hive > history file > > Following java code can be an example: > > https://github.com/anjuke/hwi/blob/master/src/main/java/org/apache/hadoop/hive/hwi/util/QueryUtil.java > > Regard, >

Re: ALTER TABLE CHANGE COLUMN issue

2013-01-30 Thread Dean Wampler
Right, the very important thing to remember about ALTER TABLE is that it only changes metadata about your table. It doesn't modify the data in any way. You have to do that yourself. On Wed, Jan 30, 2013 at 2:17 AM, Nitin Pawar wrote: > after u did alter table, did you add any new data to table wi

Re: Run hive queries, and collect job information

2013-01-30 Thread Nitin Pawar
For all the queries you run as user1, Hive stores the Hive CLI history in the .hive_history file (please check the limits on how many queries it stores). For all the jobs the Hive CLI runs, it keeps the details in /tmp/user.name/. All these values are configurable in hive-site.xml. On Wed, Jan 30, 2

Re: Run hive queries, and collect job information

2013-01-30 Thread Qiang Wang
Every Hive query has a history file, and you can get this info from the Hive history file. The following Java code can be an example: https://github.com/anjuke/hwi/blob/master/src/main/java/org/apache/hadoop/hive/hwi/util/QueryUtil.java Regard, Qiang 2013/1/30 Mathieu Despriee > Hi folks, > > I would

Run hive queries, and collect job information

2013-01-30 Thread Mathieu Despriee
Hi folks, I would like to run a list of generated HIVE queries. For each, I would like to retrieve the MR job_id (or ids, in case of multiple stages). And then, with this job_id, collect statistics from job tracker (cumulative CPU, read bytes...) How can I send HIVE queries from a bash or python
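As a rough illustration of the question above, here is a minimal Python sketch that runs one query through the Hive CLI and collects the MapReduce job IDs from its console output. It rests on assumptions not confirmed in the thread: that the `hive` executable is on the PATH, and that the CLI logs lines containing IDs of the form job_<timestamp>_<sequence> (for example "Starting Job = job_..."). The function names `extract_job_ids` and `run_hive_query` are hypothetical.

```python
import re
import subprocess
from typing import List

# Assumed shape of a MapReduce job ID in the Hive CLI console output.
JOB_ID_PATTERN = re.compile(r"\bjob_\d+_\d+\b")


def extract_job_ids(console_output: str) -> List[str]:
    """Return the distinct job IDs found in the text, in order of first appearance."""
    seen = set()
    job_ids = []
    for job_id in JOB_ID_PATTERN.findall(console_output):
        if job_id not in seen:
            seen.add(job_id)
            job_ids.append(job_id)
    return job_ids


def run_hive_query(query: str) -> List[str]:
    """Run a query with `hive -e` and return the job IDs seen in its output.

    Both stdout and stderr are scanned, since progress messages may go to
    either stream depending on the logging setup.
    """
    proc = subprocess.run(
        ["hive", "-e", query],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits with an error
    )
    return extract_job_ids(proc.stderr + "\n" + proc.stdout)
```

A multi-stage query would yield one ID per stage, in launch order. Collecting the counters for each ID from the job tracker is a separate step that this sketch does not cover.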

Re: ALTER TABLE CHANGE COLUMN issue

2013-01-30 Thread Nitin Pawar
After you did the alter table, did you add any new data to the table with the new schema? For the old data already present in the table, if you add anything new in columns it will be a null value. On Wed, Jan 30, 2013 at 1:44 PM, hardik doshi wrote: > Hi, > > I am running into an issue where ALTER TABLE CHANGE COLU

ALTER TABLE CHANGE COLUMN issue

2013-01-30 Thread hardik doshi
Hi, I am running into an issue where ALTER TABLE CHANGE COLUMN does not seem to be working. I have a table with a column data type looking like array> and I am trying to change it to array> based on the underlying data schema change. The alter command succeeds and subsequent describe call sho