[ https://issues.apache.org/jira/browse/HIVE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292194#comment-16292194 ]
Hive QA commented on HIVE-17396:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12902154/HIVE-17396.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 11527 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[quotedid_smb] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_constprog_dpp] (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning] (batchId=177)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_2] (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_4] (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=177)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[materialized_view_authorization_create_no_grant] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_publisher_error_1] (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_10] (batchId=138)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketsortoptimize_insert_7] (batchId=128)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=113)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=226)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8255/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8255/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8255/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12902154 - PreCommit-HIVE-Build

> Support DPP with map joins where the source and target belong in the same stage
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-17396
>                 URL: https://issues.apache.org/jira/browse/HIVE-17396
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Janaki Lahorani
>            Assignee: Janaki Lahorani
>         Attachments: HIVE-17396.1.patch
>
>
> When the target of a partition pruning sink operator is not the same as
> the target of the hash table sink operator, both source and target get
> scheduled within the same Spark job, which can result in a File Not Found
> Exception. HIVE-17225 has a fix to disable DPP in that scenario. This JIRA
> is to support DPP for such cases.
>
> Test Case:
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.auto.convert.join=true;
> SET hive.strict.checks.cartesian.product=false;
> CREATE TABLE part_table1 (col int) PARTITIONED BY (part1_col int);
> CREATE TABLE part_table2 (col int) PARTITIONED BY (part2_col int);
> CREATE TABLE reg_table (col int);
> ALTER TABLE part_table1 ADD PARTITION (part1_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 2);
> INSERT INTO TABLE part_table1 PARTITION (part1_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 PARTITION (part2_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 PARTITION (part2_col = 2) VALUES (2);
> INSERT INTO table reg_table VALUES (1), (2), (3), (4), (5), (6);
> EXPLAIN SELECT *
> FROM   part_table1 pt1,
>        part_table2 pt2,
>        reg_table rt
> WHERE  rt.col = pt1.part1_col
> AND    pt2.part2_col = pt1.part1_col;
>
> Plan:
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-1 depends on stages: Stage-2
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
>     Spark
> #### A masked pattern was here ####
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: pt1
>                   Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: col (type: int), part1_col (type: int)
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                     Spark HashTable Sink Operator
>                       keys:
>                         0 _col1 (type: int)
>                         1 _col1 (type: int)
>                         2 _col0 (type: int)
>                     Select Operator
>                       expressions: _col1 (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                       Group By Operator
>                         keys: _col0 (type: int)
>                         mode: hash
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                         Spark Partition Pruning Sink Operator
>                           Target column: part2_col (int)
>                           partition key expr: part2_col
>                           Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                           target work: Map 2
>             Local Work:
>               Map Reduce Local Work
>         Map 2
>             Map Operator Tree:
>                 TableScan
>                   alias: pt2
>                   Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: col (type: int), part2_col (type: int)
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
>                     Spark HashTable Sink Operator
>                       keys:
>                         0 _col1 (type: int)
>                         1 _col1 (type: int)
>                         2 _col0 (type: int)
>             Local Work:
>               Map Reduce Local Work
>   Stage: Stage-1
>     Spark
> #### A masked pattern was here ####
>       Vertices:
>         Map 3
>             Map Operator Tree:
>                 TableScan
>                   alias: rt
>                   Statistics: Num rows: 6 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: col is not null (type: boolean)
>                     Statistics: Num rows: 6 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: col (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 6 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                              Inner Join 0 to 2
>                         keys:
>                           0 _col1 (type: int)
>                           1 _col1 (type: int)
>                           2 _col0 (type: int)
>                         outputColumnNames: _col0, _col1, _col2, _col3, _col4
>                         input vertices:
>                           0 Map 1
>                           1 Map 2
>                         Statistics: Num rows: 13 Data size: 13 Basic stats: COMPLETE Column stats: NONE
>                         File Output Operator
>                           compressed: false
>                           Statistics: Num rows: 13 Data size: 13 Basic stats: COMPLETE Column stats: NONE
>                           table:
>                               input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                               output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                               serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>             Local Work:
>               Map Reduce Local Work
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)