[jira] [Commented] (HIVE-15278) PTF+MergeJoin = NPE

2016-11-30 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710499#comment-15710499
 ] 

Sergey Shelukhin commented on HIVE-15278:
-

[~hagleitn] ping, does +1 still stand?

> PTF+MergeJoin = NPE
> ---
>
> Key: HIVE-15278
> URL: https://issues.apache.org/jira/browse/HIVE-15278
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-15278.patch
>
>
> Manifests as
> {noformat}
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
>   at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
>   at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:340)
>   at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
>   ... 29 more
> {noformat}
> It's actually a somewhat subtle ordering problem in sortmerge - as it stands, 
> it calls different branches of the tree in closeOp after they themselves have 
> already been closed. Other operators that clean stuff up in close may result 
> in different errors. The common pattern is
> {noformat}
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
>   at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404)
>   ...
>   at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:428)
>   at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:388)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
>   ...
>   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:294)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15278) PTF+MergeJoin = NPE

2016-11-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702903#comment-15702903
 ] 

Sergey Shelukhin commented on HIVE-15278:
-

Yes, we make 2 assumptions:
1) That it won't try to pump more records through the big table side, which 
would not work in any case. Logically it makes no sense either, because the big 
table side is the one that causes the operators to get closed in the first 
place, so it should be done with all of its records.
2) That the main table side is closed first. That is true now; see reduceWork 
vs mergeWorkList in ReduceRecordProducer.

I am not sure we can add a test. The repro that we have is too specific (and 
potentially too large) for q files, and this code is too much of a mess to 
repro with a unit test.






[jira] [Commented] (HIVE-15278) PTF+MergeJoin = NPE

2016-11-28 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702566#comment-15702566
 ] 

Gunther Hagleitner commented on HIVE-15278:
---

LGTM +1. This does look like it'd be painful to debug. Is it possible to add a 
small test to spare the next person this debugging pain?

One thing I'm not completely sure of: the bug is that the join operator is 
trying to pump records through its parents after they have been closed. It's 
doing that to finish the last pending group when the first of its parents is 
closed. Your fix finishes the group after the first parent is closed, not the 
last. Do you know for a fact that the join operator won't try to push records 
through that (closed) parent? (I think that's the case, because it's the big 
table side and all remaining records should come from other branches.)
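
For anyone reading along later, here is a minimal sketch of the ordering problem as I understand it from this thread. It is not Hive code: CloseOrderSketch, Branch, Join and the drainOnFirstClose flag are hypothetical names made up for illustration, and the model assumes just two parents with the big table side closing first. A branch frees its state in close(), so a join that finishes its last group only after every parent has closed hits an NPE, while finishing the group on the first close still finds the other branch open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the close-ordering problem; none of these classes are Hive's.
class CloseOrderSketch {

    /** Stand-in for a small-table branch whose operators free their state in close(). */
    static class Branch {
        private final List<String> pending;              // rows not yet pushed to the join
        private List<String> buffer = new ArrayList<>(); // state released on close

        Branch(List<String> pending) {
            this.pending = pending;
        }

        /** Pushes the remaining rows through this branch; fails once the branch is closed. */
        int pushRemaining() {
            int pushed = 0;
            for (String row : pending) {
                buffer.add(row); // throws NullPointerException after close()
                pushed++;
            }
            return pushed;
        }

        void close() {
            buffer = null;
        }
    }

    /** Stand-in for the merge join, notified each time one of its two parents closes. */
    static class Join {
        private final Branch smallSide;
        private final boolean drainOnFirstClose;
        private int closedParents = 0;
        private boolean drained = false;
        int joinedRows = 0;

        Join(Branch smallSide, boolean drainOnFirstClose) {
            this.smallSide = smallSide;
            this.drainOnFirstClose = drainOnFirstClose;
        }

        void parentClosed() {
            closedParents++;
            boolean allClosed = closedParents == 2;
            // Finish the last pending group exactly once, either early or late.
            if (!drained && (drainOnFirstClose || allClosed)) {
                drained = true;
                joinedRows += smallSide.pushRemaining();
            }
        }
    }

    /** Runs one close sequence and reports either the joined row count or the failure. */
    static String run(boolean drainOnFirstClose) {
        Branch smallSide = new Branch(Arrays.asList("r1", "r2"));
        Join join = new Join(smallSide, drainOnFirstClose);
        try {
            join.parentClosed(); // big table side closes first (assumed order)
            smallSide.close();   // small-table branch releases its state
            join.parentClosed(); // last parent closed
            return "joined " + join.joinedRows;
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println("drain after last close:  " + run(false));
        System.out.println("drain after first close: " + run(true));
    }
}
```

In this toy model run(false) reports the NPE and run(true) joins both pending rows. Whether the real operator tree behaves the same way depends on the assumption that the big table side really is the first parent to close.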






[jira] [Commented] (HIVE-15278) PTF+MergeJoin = NPE

2016-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15692201#comment-15692201
 ] 

Hive QA commented on HIVE-15278:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840352/HIVE-15278.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10733 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=91)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=90)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2277/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2277/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2277/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840352 - PreCommit-HIVE-Build



