[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127043#comment-15127043 ] Sergey Shelukhin commented on HIVE-12947: - Not committing to branch-1? > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Fix For: 2.0.0, 2.1.0 > > Attachments: HIVE-12947.1.patch, HIVE-12947.2.patch, > HIVE-12947.3.patch, HIVE-12947.4.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126878#comment-15126878 ] Vikram Dixit K commented on HIVE-12947: --- The Hbase test passes locally: Running org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.936 sec - in org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration I just updated the results for the smb_cache tests. > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch, HIVE-12947.2.patch, > HIVE-12947.3.patch, HIVE-12947.4.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126764#comment-15126764 ] Sergey Shelukhin commented on HIVE-12947: - smb_cache test probably failed due to one of the recent CBO-related changes (user-level explain, strict mode or removal of the strict check), it just needs an update before commit. TestHBaseAggrStatsCacheIntegration needs to be checked locally... > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch, HIVE-12947.2.patch, > HIVE-12947.3.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124976#comment-15124976 ] Hive QA commented on HIVE-12947: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12785251/HIVE-12947.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10042 tests executed *Failed tests:* {noformat} TestHBaseAggrStatsCacheIntegration - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_smb_cache org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6799/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6799/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6799/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12785251 - PreCommit-HIVE-TRUNK-Build > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch, HIVE-12947.2.patch, > HIVE-12947.3.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecor
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124033#comment-15124033 ] Sergey Shelukhin commented on HIVE-12947: - +1 > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch, HIVE-12947.2.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122118#comment-15122118 ] Sergey Shelukhin commented on HIVE-12947: - Would it be possible to not cache the plan under these conditions, or avoid modifying it, rather than nuking the entire cache? The assumption behind plan cache is that plan should not be modified. > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122099#comment-15122099 ] Hive QA commented on HIVE-12947: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12784748/HIVE-12947.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10037 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6780/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6780/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6780/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12784748 - PreCommit-HIVE-TRUNK-Build > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122079#comment-15122079 ] Vikram Dixit K commented on HIVE-12947: --- It is the latter, the plan is modified in the container. It looks like we need to detect that the modification of the plan has already occurred in case of retrieving from cache and not try to re-modify the plan. > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120392#comment-15120392 ] Sergey Shelukhin commented on HIVE-12947: - Actually, that's a good point. I didn't realize that was THE cache. It will prevent plan reuse from working completely, as far as I can tell, maybe other things. "we try to reconnect the work items and fail." - why does it fail specifically, does it get a wrong item from cache? Is plan modified in the container? > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120385#comment-15120385 ] Vikram Dixit K commented on HIVE-12947: --- Added more description on the cause of the issue. I guess my question to you is that is it okay to clear the cache of all the keys in case of SMB? Is there any assumption on the reuse of cache by a different task (that is not doing the join)? > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch > > > SMB join in tez has multiple work items that are connected based on input tag > followed by input initialization etc. In case of container re-use, what ends > up happening is that we try to reconnect the work items and fail. If we try > to work around that issue by recognizing somehow that the cache was in play, > we will run into other initialization issues with respect to record readers. > So the plan is to disable caching of the SMB work items by clearing out > during the close phase. > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120368#comment-15120368 ] Sergey Shelukhin commented on HIVE-12947: - Can you elaborate on the root cause? As such, the patch looks good, +1 pending tests > SMB join in tez has ClassCastException when container reuse is on > - > > Key: HIVE-12947 > URL: https://issues.apache.org/jira/browse/HIVE-12947 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Critical > Attachments: HIVE-12947.1.patch > > > {code} > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to > org.apache.hadoop.hive.ql.exec.DummyStoreOperator > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) > ... 15 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)