Re: Hive parallel execution deadlocks, need restart of yarn-nodemanager

2012-12-07 Thread Alexandre Fouche
Ah, I see. I had missed the fact that each MR job has an ApplicationMaster that takes a container, so there were none free to run mappers (my jobs usually have only one mapper due to small input data). I understood that thanks to your explanations and using more nodes with a greater

Set the number of mapper in hive

2012-12-07 Thread Philips Kokoh Prasetyo
Hi everyone, My cluster also runs HBase for real-time processing. A Hive query (on a big table) occupies all the map tasks, so the other service cannot run properly. Does anyone know how to limit the number of running map tasks in Hive? I see mapred.reduce.tasks in the configuration properties, but I

Re: How to set an empty value to hive.querylog.location to disable the creation of hive history file

2012-12-07 Thread Bing Li
Do you mean Hive does NOT support disabling the creation of Hive history files, OR that it does NOT support using an empty string to achieve this? If Hive doesn't support disabling the creation of query logs, do you know the reason? Thanks, - Bing 2012/12/6 Hezhiqiang (Ransom) ransom.hezhiqi...@huawei.com It’s not

Re: Set the number of mapper in hive

2012-12-07 Thread Nitin Pawar
Ways to handle this: 1) Create separate job queues for Hive and HBase users on the JobTracker and allocate resources according to your needs. 2) You cannot actually limit how many maps are launched, as that is decided at run time by looking at the split size. If you want fewer maps to be launched then
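A rough illustration of the point about split size: if the number of maps follows from the split size, a larger split size means fewer maps for the same input. The sketch below assumes a single splittable input file and one map task per split; the class and method names are hypothetical, and real input formats apply additional rules, so treat the numbers as estimates only.

```java
public class SplitCountSketch {
    /**
     * Rough number of map tasks for one splittable file, assuming one task per split.
     * Simplified on purpose: real input formats apply extra rules on top of this.
     */
    static long estimatedMapTasks(long inputBytes, long splitBytes) {
        if (inputBytes < 0 || splitBytes <= 0) {
            throw new IllegalArgumentException("inputBytes must be >= 0 and splitBytes > 0");
        }
        return (inputBytes + splitBytes - 1) / splitBytes; // ceiling division
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long oneGiB = 1024L * mib;
        System.out.println(estimatedMapTasks(oneGiB, 64 * mib));  // 16
        System.out.println(estimatedMapTasks(oneGiB, 256 * mib)); // 4
    }
}
```

With a 1 GiB input, going from 64 MiB to 256 MiB splits drops the estimate from 16 maps to 4.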

Re: Set the number of mapper in hive

2012-12-07 Thread Philips Kokoh Prasetyo
Hi Nitin, Thanks for the reply. Do you mean using the fair scheduler to separate the job queues? http://hadoop.apache.org/docs/r0.20.2/fair_scheduler.html Regards, Philips On Fri, Dec 7, 2012 at 5:15 PM, Nitin Pawar nitinpawar...@gmail.com wrote: ways to handle this 1) create separate job queues

Re: Set the number of mapper in hive

2012-12-07 Thread Nitin Pawar
Yes. On Dec 7, 2012 3:09 PM, Philips Kokoh Prasetyo philipsko...@gmail.com wrote: Hi Nitin, Thanks for the reply. Do you mean using the fair scheduler to separate the job queues? http://hadoop.apache.org/docs/r0.20.2/fair_scheduler.html Regards, Philips On Fri, Dec 7, 2012 at 5:15 PM, Nitin

Map side join

2012-12-07 Thread Souvik Banerjee
Hello everybody, I have a question. I didn't come across any post that says something about this. I have two tables, let's say A and B. I want to join A and B in Hive. I am currently using Hive 0.9. The join would be on a few columns, like on (A.id1 = B.id1) AND (A.id2 = B.id2) AND

Re: Map side join

2012-12-07 Thread bejoy_ks
Hi Souvik, In earlier versions of Hive you had to give the map join hint. In later versions, just set hive.auto.convert.join = true; and Hive automatically selects the smaller table. It is better to give the smaller table as the first one in the join. You can use a map join if you are joining a
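For readers unfamiliar with the idea behind a map-side join, here is a conceptual sketch in plain Java, not Hive's actual implementation. It assumes the smaller table fits in memory: that table is loaded into a hash map keyed on the join columns (id1 and id2, as in the question), and the larger table is streamed past it, so no shuffle or reduce step is needed. The Row class and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapSideJoinSketch {
    /** Hypothetical row: two join keys plus one payload column. */
    static final class Row {
        final int id1;
        final int id2;
        final String value;

        Row(int id1, int id2, String value) {
            this.id1 = id1;
            this.id2 = id2;
            this.value = value;
        }
    }

    /** Inner join on (id1, id2); output lines are "id1,id2,smallValue,largeValue". */
    static List<String> join(List<Row> small, List<Row> large) {
        // Build phase: index the small table in memory by its composite key.
        Map<List<Integer>, List<Row>> index = new HashMap<>();
        for (Row s : small) {
            index.computeIfAbsent(Arrays.asList(s.id1, s.id2), k -> new ArrayList<>()).add(s);
        }
        // Probe phase: stream the large table and look up each row's key.
        List<String> out = new ArrayList<>();
        for (Row l : large) {
            List<Row> matches = index.getOrDefault(
                    Arrays.asList(l.id1, l.id2), Collections.<Row>emptyList());
            for (Row s : matches) {
                out.add(l.id1 + "," + l.id2 + "," + s.value + "," + l.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> small = Arrays.asList(new Row(1, 10, "a"), new Row(2, 20, "b"));
        List<Row> large = Arrays.asList(
                new Row(1, 10, "x"), new Row(3, 30, "z"), new Row(2, 20, "w"));
        for (String line : join(small, large)) {
            System.out.println(line);
        }
    }
}
```

The sketch only works because the build side is small; that is also why it matters which table is treated as the smaller one.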

Re: Hive double-precision question

2012-12-07 Thread Johnny Zhang
Hi, Periya: This is a problem for me also. I filed https://issues.apache.org/jira/browse/HIVE-3715 I have a patch working locally. I am running more tests right now and will post it soon. Thanks, Johnny On Fri, Dec 7, 2012 at 1:27 PM, Periya.Data periya.d...@gmail.com wrote: Hi Hive Users, I

Re: Hive double-precision question

2012-12-07 Thread Mark Grover
Periya: If you want to see what the built-in Hive UDFs are doing, the code is here: https://github.com/apache/hive/tree/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic and https://github.com/apache/hive/tree/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf You can find out which UDF name

RE: Hive double-precision question

2012-12-07 Thread Lauren Yang
This sounds like https://issues.apache.org/jira/browse/HIVE-2586 , where comparing floats and doubles will not work because of the way floating point numbers are represented. Perhaps there is a comparison between a float and double type because of some internal representation in the Java library,
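The general float-versus-double pitfall mentioned here can be reproduced in plain Java, independent of Hive. This is a minimal sketch with hypothetical names, and it is not taken from either JIRA issue: 0.1 has no exact binary representation, so the float and double approximations of it differ, and widening the float to double keeps the float's rounding error.

```java
public class FloatDoubleComparison {
    /** Naive equality: the float is widened to double, rounding error included. */
    static boolean naiveEquals(float f, double d) {
        return f == d;
    }

    /** Equality within an absolute tolerance chosen by the caller. */
    static boolean nearlyEquals(float f, double d, double epsilon) {
        return Math.abs((double) f - d) <= epsilon;
    }

    public static void main(String[] args) {
        float f = 0.1f;
        double d = 0.1;
        System.out.println(naiveEquals(f, d));        // false
        System.out.println(nearlyEquals(f, d, 1e-6)); // true
        System.out.println(naiveEquals(0.5f, 0.5));   // true: 0.5 is exactly representable
    }
}
```

The tolerance of 1e-6 is only an example value; an appropriate epsilon depends on the magnitude of the data being compared.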

Re: Hive double-precision question

2012-12-07 Thread Periya.Data
Thanks Lauren, Mark Grover and Zhang. Will have to see the source code in Hive to see what is happening and if I can make the results consistent... Interested to see Zhang's patch. I shall watch that Jira. -PD On Fri, Dec 7, 2012 at 2:12 PM, Lauren Yang lauren.y...@microsoft.com wrote: This

Re: Hive double-precision question

2012-12-07 Thread Periya.Data
Hi Mark, Thanks for the pointers. I looked at the code and it looks like my Java code and the Hive code are similar... (I am a basic-level Java guy). The UDF below uses Math.sin, which is what I used to test the Linux + Java result. I have to see what this DoubleWritable and Serde2 is all about...

Load data in (external table) from symbolic link

2012-12-07 Thread Hadoop Inquirer
Hi, I am trying to create an external table in Hive by pointing it to a file that has symbolic links in its path reference. Hive seems to complain with the following error indicating that it thinks the symbolic link is a file: java.io.IOException: Open failed for file: /dir1/dir2/dir3_symlink,

PK violation during Hive add partition

2012-12-07 Thread Karlen Lie
Hello, We are running into intermittent errors while running the query below. Some background: the table we're altering (tbl_someTable) is an external table, and the query below is run concurrently by multiple Oozie workflows. ALTER TABLE tbl_someTable ADD IF NOT EXISTS