Re: 回复: BUG IN HIVE-4650 seems not fixed

2013-07-31 Thread Yin Huai
I just uploaded a patch to https://issues.apache.org/jira/browse/HIVE-4968.
You can try it and see if the problem has been resolved for your query.


On Wed, Jul 31, 2013 at 11:21 AM, Yin Huai  wrote:

> Seems it is another problem.
> Can you try
>
>
> SELECT *
> FROM (SELECT VAL001 x1,
>  VAL002 x2,
>  VAL003 x3,
>  VAL004 x4,
>  VAL005 y
>   FROM (SELECT /*+ mapjoin(v2) */ (VAL001- mu1) * 1/(sd1) VAL001,
>(VAL002- mu2) * 1/(sd2) VAL002,
>(VAL003- mu3) * 1/(sd3) VAL003,
>(VAL004- mu4) * 1/(sd4) VAL004,
>(VAL005- mu5) * 1/(sd5) VAL005
> FROM (SELECT x1 VAL001,
>
>  x2 VAL002,
>  x3 VAL003,
>  x4 VAL004,
>  y VAL005
>   FROM cmnt) v3
>
> JOIN (SELECT count(*) c,
>  avg(VAL001) mu1,
>  avg(VAL002) mu2,
>  avg(VAL003) mu3,
>  avg(VAL004) mu4,
>  avg(VAL005) mu5,
>  stddev_pop(VAL001) sd1,
>  stddev_pop(VAL002) sd2,
>  stddev_pop(VAL003) sd3,
>  stddev_pop(VAL004) sd4,
>  stddev_pop(VAL005) sd5
>   FROM (SELECT *
> FROM (SELECT x1 VAL001,
>  x2 VAL002,
>  x3 VAL003,
>  x4 VAL004,
>  y VAL005
>   FROM cmnt) obj1_3) v1) v2) obj1_7) obj1_6;
>
> Also, cmnt in v3 will be used to create the hash table. Seems the part of
> code in converting Join to MapJoin does not play well with this part of
> your original query
>
>
> SELECT *
>  FROM
>(SELECT x1 VAL001,
>x2 VAL002,
>x3 VAL003,
>x4 VAL004,
>y VAL005
> FROM cmnt) obj1_3) v3
>
>
> I have created https://issues.apache.org/jira/browse/HIVE-4968 to address
> this issue.
>
>
>
>
> On Sun, Jul 28, 2013 at 11:46 PM,  wrote:
>
>> Hi:
>> I attach the output of EXPLAIN, and the hive I use is compiled from trunk
>> and my hadoop version is 1.0.1. I use default hive configuration.
>>
>>
>> --
>> wzc1...@gmail.com
>> 已使用 Sparrow 
>>
>> 已使用 Sparrow 
>>
>> 在 2013年7月29日星期一,下午1:08,Yin Huai 写道:
>>
>> Hi,
>>
>> Can you also post the output of EXPLAIN? The execution plan may be
>> helpful to locate the problem.
>>
>> Thanks,
>>
>> Yin
>>
>>
>> On Sun, Jul 28, 2013 at 8:06 PM,  wrote:
>>
>> What I mean by "not pass the testcase in HIVE-4650" is that I compile the
>> trunk code and run the query in HIVE-4650:
>> SELECT *
>> FROM
>>   (SELECT VAL001 x1,
>>   VAL002 x2,
>>   VAL003 x3,
>>   VAL004 x4,
>>   VAL005 y
>>FROM
>>  (SELECT /*+ mapjoin(v2) */ (VAL001- mu1) * 1/(sd1) VAL001,(VAL002-
>> mu2) * 1/(sd2) VAL002,(VAL003- mu3) * 1/(sd3) VAL003,(VAL004- mu4) *
>> 1/(sd4) VAL004,(VAL005- mu5) * 1/(sd5) VAL005
>>   FROM
>> (SELECT *
>>  FROM
>>(SELECT x1 VAL001,
>>x2 VAL002,
>>x3 VAL003,
>>x4 VAL004,
>>y VAL005
>> FROM cmnt) obj1_3) v3
>>   JOIN
>> (SELECT count(*) c,
>> avg(VAL001) mu1,
>> avg(VAL002) mu2,
>> avg(VAL003) mu3,
>> avg(VAL004) mu4,
>> avg(VAL005) mu5,
>> stddev_pop(VAL001) sd1,
>> stddev_pop(VAL002) sd2,
>> stddev_pop(VAL003) sd3,
>> stddev_pop(VAL004) sd4,
>> stddev_pop(VAL005) sd5
>>  FROM
>>(SELECT *
>> FROM
>>   (SELECT x1 VAL001,
>>   x2 VAL002,
>>   x3 VAL003,
>>   x4 VAL004,
>>   y VAL005
>>FROM cmnt) obj1_3) v1) v2) obj1_7) obj1_6 ;
>>
>> and it still fail at the same place:
>> …
>> Diagnostic Messages for this Task:
>> java.lang.RuntimeException:
>> org.apache.hadoop.hive.ql.metadata.HiveException:
>> java.lang.NullPointerException
>> at
>> org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
>> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs

Re: 回复: BUG IN HIVE-4650 seems not fixed

2013-07-31 Thread Yin Huai
Seems it is another problem.
Can you try

SELECT *
FROM (SELECT VAL001 x1,
 VAL002 x2,
 VAL003 x3,
 VAL004 x4,
 VAL005 y
  FROM (SELECT /*+ mapjoin(v2) */ (VAL001- mu1) * 1/(sd1) VAL001,
   (VAL002- mu2) * 1/(sd2) VAL002,
   (VAL003- mu3) * 1/(sd3) VAL003,
   (VAL004- mu4) * 1/(sd4) VAL004,
   (VAL005- mu5) * 1/(sd5) VAL005
FROM (SELECT x1 VAL001,
 x2 VAL002,
 x3 VAL003,
 x4 VAL004,
 y VAL005
  FROM cmnt) v3
JOIN (SELECT count(*) c,
 avg(VAL001) mu1,
 avg(VAL002) mu2,
 avg(VAL003) mu3,
 avg(VAL004) mu4,
 avg(VAL005) mu5,
 stddev_pop(VAL001) sd1,
 stddev_pop(VAL002) sd2,
 stddev_pop(VAL003) sd3,
 stddev_pop(VAL004) sd4,
 stddev_pop(VAL005) sd5
  FROM (SELECT *
FROM (SELECT x1 VAL001,
 x2 VAL002,
 x3 VAL003,
 x4 VAL004,
 y VAL005
  FROM cmnt) obj1_3) v1) v2) obj1_7) obj1_6;

Also, cmnt in v3 will be used to create the hash table. Seems the part of
code in converting Join to MapJoin does not play well with this part of
your original query

SELECT *
 FROM
   (SELECT x1 VAL001,
   x2 VAL002,
   x3 VAL003,
   x4 VAL004,
   y VAL005
FROM cmnt) obj1_3) v3


I have created https://issues.apache.org/jira/browse/HIVE-4968 to address
this issue.




On Sun, Jul 28, 2013 at 11:46 PM,  wrote:

> Hi:
> I attach the output of EXPLAIN, and the hive I use is compiled from trunk
> and my hadoop version is 1.0.1. I use default hive configuration.
>
>
> --
> wzc1...@gmail.com
> 已使用 Sparrow 
>
> 已使用 Sparrow 
>
> 在 2013年7月29日星期一,下午1:08,Yin Huai 写道:
>
> Hi,
>
> Can you also post the output of EXPLAIN? The execution plan may be helpful
> to locate the problem.
>
> Thanks,
>
> Yin
>
>
> On Sun, Jul 28, 2013 at 8:06 PM,  wrote:
>
> What I mean by "not pass the testcase in HIVE-4650" is that I compile the
> trunk code and run the query in HIVE-4650:
> SELECT *
> FROM
>   (SELECT VAL001 x1,
>   VAL002 x2,
>   VAL003 x3,
>   VAL004 x4,
>   VAL005 y
>FROM
>  (SELECT /*+ mapjoin(v2) */ (VAL001- mu1) * 1/(sd1) VAL001,(VAL002-
> mu2) * 1/(sd2) VAL002,(VAL003- mu3) * 1/(sd3) VAL003,(VAL004- mu4) *
> 1/(sd4) VAL004,(VAL005- mu5) * 1/(sd5) VAL005
>   FROM
> (SELECT *
>  FROM
>(SELECT x1 VAL001,
>x2 VAL002,
>x3 VAL003,
>x4 VAL004,
>y VAL005
> FROM cmnt) obj1_3) v3
>   JOIN
> (SELECT count(*) c,
> avg(VAL001) mu1,
> avg(VAL002) mu2,
> avg(VAL003) mu3,
> avg(VAL004) mu4,
> avg(VAL005) mu5,
> stddev_pop(VAL001) sd1,
> stddev_pop(VAL002) sd2,
> stddev_pop(VAL003) sd3,
> stddev_pop(VAL004) sd4,
> stddev_pop(VAL005) sd5
>  FROM
>(SELECT *
> FROM
>   (SELECT x1 VAL001,
>   x2 VAL002,
>   x3 VAL003,
>   x4 VAL004,
>   y VAL005
>FROM cmnt) obj1_3) v1) v2) obj1_7) obj1_6 ;
>
> and it still fail at the same place:
> …
> Diagnostic Messages for this Task:
> java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:416)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
> at
> org.apac

Re: 回复: BUG IN HIVE-4650 seems not fixed

2013-07-28 Thread Yin Huai
Hi,

Can you also post the output of EXPLAIN? The execution plan may be helpful
to locate the problem.

Thanks,

Yin


On Sun, Jul 28, 2013 at 8:06 PM,  wrote:

> What I mean by "not pass the testcase in HIVE-4650" is that I compile the
> trunk code and run the query in HIVE-4650:
> SELECT *
> FROM
>   (SELECT VAL001 x1,
>   VAL002 x2,
>   VAL003 x3,
>   VAL004 x4,
>   VAL005 y
>FROM
>  (SELECT /*+ mapjoin(v2) */ (VAL001- mu1) * 1/(sd1) VAL001,(VAL002-
> mu2) * 1/(sd2) VAL002,(VAL003- mu3) * 1/(sd3) VAL003,(VAL004- mu4) *
> 1/(sd4) VAL004,(VAL005- mu5) * 1/(sd5) VAL005
>   FROM
> (SELECT *
>  FROM
>(SELECT x1 VAL001,
>x2 VAL002,
>x3 VAL003,
>x4 VAL004,
>y VAL005
> FROM cmnt) obj1_3) v3
>   JOIN
> (SELECT count(*) c,
> avg(VAL001) mu1,
> avg(VAL002) mu2,
> avg(VAL003) mu3,
> avg(VAL004) mu4,
> avg(VAL005) mu5,
> stddev_pop(VAL001) sd1,
> stddev_pop(VAL002) sd2,
> stddev_pop(VAL003) sd3,
> stddev_pop(VAL004) sd4,
> stddev_pop(VAL005) sd5
>  FROM
>(SELECT *
> FROM
>   (SELECT x1 VAL001,
>   x2 VAL002,
>   x3 VAL003,
>   x4 VAL004,
>   y VAL005
>FROM cmnt) obj1_3) v1) v2) obj1_7) obj1_6 ;
>
> and it still fail at the same place:
> …
> Diagnostic Messages for this Task:
> java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:416)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
> at
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
> at
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1377)
> at
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)
> at
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)
> at
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:611)
> at
> org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
> ... 8 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:186)
> ... 14 more
>
> --
> wzc1...@gmail.com
> 已使用 Sparrow 
>
> 已使用 Sparrow 
>
> 在 2013年7月28日星期日,下午8:08,wzc1...@gmail.com 写道:
>
>  hi all:
>
> We are currently testing hive 0.11 against our production environment and
> run into some problems. Some of them are related to the param
> "hive.auto.convert.join".
> We disable this param and some failed testcases passed. By searching in
> hive jira issues I find that the patch in HIVE-4650(
> https://issues.apache.org/jira/browse/HIVE-4650) may be helpful.
> I compile the newest code in trunk and try the failed testcase in
> HIVE-4650, but it doesn't pass. It seems that this issue is not fixed
> while it's closed.
>
> Am I missed something?
>
> --
> wzc1...@gmail.com
> 已使用 Sparrow 
>
> 已使用 Sparrow 
>
>
>