Errors while creating a new table using existing table schema

2014-07-18 Thread Vidya Sujeet
Hello, I am trying to create a new table using an existing table's schema (existing table name in hive: jobs). However, when I do that it doesn't put the new table (new table name in hive: jobs_ex2) in the same location as the existing table. When I specify the location explicitly, it errors out.

Re: how to control hive log location on 0.13?

2014-07-18 Thread Lefty Leverenz
Thanks, Satish Mittal, I've added that information to the Error Logs section https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ErrorLogs of the Getting Started wiki. -- Lefty On Fri, Jul 18, 2014 at 12:19 AM, Satish Mittal satish.mit...@inmobi.com wrote: You can

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
Make sure the directory you specify has the sticky bit set, otherwise users will have permission problems: chmod 1777 dir On 18 July 2014 14:19, Satish Mittal satish.mit...@inmobi.com wrote: You can configure the following property in $HIVE_HOME/conf/hive-log4j.properties:
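The sticky-bit advice above (chmod 1777 dir) can also be applied from a script. The following is a minimal sketch, not part of the original thread: the helper name make_shared_log_dir and the demo path are hypothetical, and it assumes a POSIX filesystem where the calling user owns the directory.

```python
import os
import stat
import tempfile


def make_shared_log_dir(path):
    """Create `path` if needed and set its mode to 1777.

    Mode 1777 is world-writable (0777) plus the sticky bit (01000), so
    every user can create log files but only a file's owner can delete
    or rename it. Returns the resulting permission bits.
    """
    os.makedirs(path, exist_ok=True)
    # os.chmod is not affected by the umask, unlike the mode given to makedirs.
    os.chmod(path, 0o1777)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # Hypothetical demo location; substitute the directory Hive logs to.
    demo_dir = os.path.join(tempfile.mkdtemp(), "hive-logs")
    mode = make_shared_log_dir(demo_dir)
    print("sticky bit set:", bool(mode & stat.S_ISVTX))
```

Checking the returned mode against stat.S_ISVTX is a quick way to confirm the bit took effect before pointing Hive at the directory.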

ERROR in JDBC

2014-07-18 Thread CHEBARO Abdallah
Hello Hive Community, I am trying to run the JDBC example (from cwiki.apache.org) using HiveServer2. Everything in the Java code (attached above) runs well except for the last query: sql = "select * from " + tableName; Attached is the complete log file of several runs. I have noticed the following

Hive huge 'startup time'

2014-07-18 Thread diogo
This is probably a simple question, but I'm noticing that for queries that run on 1+ TB of data, it can take Hive up to 30 minutes to actually start the first map-reduce stage. What is it doing? I imagine it's gathering information about the data somehow; this 'startup' time is clearly a function

Re: Hive huge 'startup time'

2014-07-18 Thread Prem Yadav
Maybe you can post your partition structure and the query. Over-partitioning the data is one of the reasons this happens. On Fri, Jul 18, 2014 at 2:36 PM, diogo di...@uken.com wrote: This is probably a simple question, but I'm noticing that for queries that run on 1+TB of data, it can take Hive up

Re: Hive huge 'startup time'

2014-07-18 Thread Edward Capriolo
The planning phase needs to do work for every Hive partition and every Hadoop file. If you have a lot of 'small' files or many partitions, this can take a long time. Also, the planning phase that happens on the job tracker is single-threaded. Also the new YARN stuff requires back and forth to

Hive Join Running Out of Memory

2014-07-18 Thread Clay McDonald
Hello everyone. I need some assistance. I have a join that fails with return code 3. The query is: SELECT B.CARD_NBR AS CNT FROM TENDER_TABLE A JOIN LOYALTY_CARDS B ON A.CARD_NBR = B.CARD_NBR LIMIT 10; -- Row Counts -- LOYALTY_CARDS = 43,876,938 -- TENDER_TABLE = 1,412,228,333 The query

Re: Hive Join Running Out of Memory

2014-07-18 Thread Edward Capriolo
This is a failed optimization: Hive is trying to build the lookup table locally, put it in the distributed cache, and then do a map join. Look through your hive-site.xml for the configuration to turn these automatic map joins off. Depending on your version, the variables changed names, were deprecated, etc., so

Re: Hive huge 'startup time'

2014-07-18 Thread diogo
Sweet, great answers, thanks. Indeed, I have a small number of partitions, but lots of small files, ~20MB each. I'll make sure to combine them. Also, increasing the heap size of the cli process already helped speed it up. Thanks, again. On Fri, Jul 18, 2014 at 10:26 AM, Edward Capriolo
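As a rough illustration of the "combine the small files" plan above, here is a minimal sketch of grouping files into merge batches. It is not from the thread: the function name plan_merges and the 256 MB target are assumptions chosen for illustration, and the actual merging (for example with the filecrush tool mentioned elsewhere in this thread) is left out.

```python
def plan_merges(file_sizes_mb, target_mb=256):
    """Greedily group consecutive files into batches of at most target_mb.

    Each batch would become one merged file, so the planner has fewer
    files to inspect. A single file larger than target_mb gets its own
    batch. The 256 MB default is an assumed target, not a Hive setting.
    """
    batches = []
    current = []
    current_size = 0
    for size in file_sizes_mb:
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_mb:
            batches.append(current)
            current = []
            current_size = 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Thirty files of roughly 20 MB each, as described in the message above.
    batches = plan_merges([20] * 30)
    print(len(batches), [sum(b) for b in batches])
```

With thirty 20 MB inputs and the assumed 256 MB target, the planner would see three files instead of thirty.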

Re: Hive huge 'startup time'

2014-07-18 Thread Edward Capriolo
Unleash ze file crusha! https://github.com/edwardcapriolo/filecrush On Fri, Jul 18, 2014 at 10:51 AM, diogo di...@uken.com wrote: Sweet, great answers, thanks. Indeed, I have a small number of partitions, but lots of small files, ~20MB each. I'll make sure to combine them. Also, increasing

RE: Hive Join Running Out of Memory

2014-07-18 Thread Clay McDonald
Thank you. Would it be acceptable to use the following? SET hive.exec.mode.local.auto=false; From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Friday, July 18, 2014 10:45 AM To: user@hive.apache.org Subject: Re: Hive Join Running Out of Memory This is a failed optimization hive is

Re: Hive Join Running Out of Memory

2014-07-18 Thread Edward Capriolo
I believe that would be the one. On Fri, Jul 18, 2014 at 10:54 AM, Clay McDonald stuart.mcdon...@bateswhite.com wrote: Thank you. Would it be acceptable to use the following? SET hive.exec.mode.local.auto=false; From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Friday, July 18,

RE: Hive Join Running Out of Memory

2014-07-18 Thread Clay McDonald
I set hive.auto.convert.join.noconditionaltask = false in hive-site.xml, and that seemed to do the trick. Thanks! From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Friday, July 18, 2014 10:57 AM To: user@hive.apache.org Subject: Re: Hive Join Running Out of Memory I believe

Hive support for filtering Unicode data

2014-07-18 Thread Duc le anh
Hello Hive, I posted the question below on Stack Overflow: http://stackoverflow.com/questions/24817308/hive-support-for-filtering-unicode-data?noredirect=1#comment38534961_24817308

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
Thanks, guys. Does anybody know what generates a log like myuser_20140716143232_d76043ed-1c4b-42a0-bf0a-2816377a6a2a.log? I checked our application code and it doesn't generate this; it looks like it comes from Hive. On Fri, Jul 18, 2014 at 12:28 AM, Andre Araujo ara...@pythian.com wrote: Make sure the directory

Re: how to control hive log location on 0.13?

2014-07-18 Thread Lefty Leverenz
Thanks André, I've added the sticky bit advice to Error Logs https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ErrorLogs . -- Lefty On Fri, Jul 18, 2014 at 2:38 PM, Yang tedd...@gmail.com wrote: thanks guys. anybody knows what generates the log like

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
Can you give us an excerpt of the contents of this log? On 19 July 2014 04:38, Yang tedd...@gmail.com wrote: thanks guys. anybody knows what generates the log like myuser_20140716143232_d76043ed-1c4b-42a0-bf0a-2816377a6a2a.log ? I checked our application code, it doesn't generate

Re: how to control hive log location on 0.13?

2014-07-18 Thread Andre Araujo
and where is it located? On 19 July 2014 10:58, Andre Araujo ara...@pythian.com wrote: Can you give us an excerpt of the contents of this log? On 19 July 2014 04:38, Yang tedd...@gmail.com wrote: thanks guys. anybody knows what generates the log like

Re: Hive huge 'startup time'

2014-07-18 Thread Db-Blog
Hello everyone, Thanks for sharing valuable input. I am working on a similar kind of task; it would be really helpful if you could share the command for increasing the heap size of the hive-cli/launching process. Thanks, Saurabh Sent from my iPhone, please avoid typos. On 18-Jul-2014, at 8:23

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
2014-07-18 15:03:37,774 INFO mr.ExecDriver (SessionState.java:printInfo(537)) - Execution log at: /tmp/myuser/myuser_20140718150303_56bf6bb0-db30-4dbc-807c-9023ce4103f4.log 2014-07-18 15:03:37,864 WARN conf.Configuration (Configuration.java:loadProperty(2358)) -

Re: how to control hive log location on 0.13?

2014-07-18 Thread Yang
It's in /tmp/my_user/. The funny thing is that I already have a hive.log there. On Fri, Jul 18, 2014 at 6:01 PM, Andre Araujo ara...@pythian.com wrote: and where is it located? On 19 July 2014 10:58, Andre Araujo ara...@pythian.com wrote: Can you give us an excerpt of the contents of