[ https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16546510#comment-16546510 ]
Hive QA commented on HIVE-17896:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931901/HIVE-17896.12.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 48 failed/errored test(s), 14668 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_struct_type_vectorization] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_struct_type_vectorization] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_groupby] (batchId=179)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown3] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_decimal64_reader] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_cast_constant] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_2] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets_limit] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mr_diff_schema_alias] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_reduce_groupby_decimal] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_string_concat] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_limit] (batchId=165)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query10] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query15] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query17] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query25] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query26] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query27] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query29] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query35] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query37] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query40] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query43] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query49] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query50] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query5] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query60] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query66] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query69] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query76] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query77] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query7] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query80] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query82] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query8] (batchId=262)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query99] (batchId=262)
org.apache.hadoop.hive.metastore.TestMetaStoreEventListenerWithOldConf.testMetaConfNotifyListenersClosingClient (batchId=227)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12653/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12653/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12653/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 48 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931901 - PreCommit-HIVE-Build

> TopNKey: Create a standalone vectorizable TopNKey operator
> ----------------------------------------------------------
>
> Key: HIVE-17896
> URL: https://issues.apache.org/jira/browse/HIVE-17896
> Project: Hive
> Issue Type: New Feature
> Components: Operators
> Affects Versions: 3.0.0
> Reporter: Gopal V
> Assignee: Teddy Choi
> Priority: Major
> Attachments: HIVE-17896.1.patch, HIVE-17896.10.patch,
> HIVE-17896.11.patch, HIVE-17896.12.patch, HIVE-17896.3.patch,
> HIVE-17896.4.patch, HIVE-17896.5.patch, HIVE-17896.6.patch,
> HIVE-17896.7.patch, HIVE-17896.8.patch, HIVE-17896.9.patch
>
>
> For TPC-DS Query27, the TopN operation is delayed by the group-by - the
> group-by operator buffers up all the rows before discarding the 99% of the
> rows in the TopN Hash within the ReduceSink Operator.
>
> The RS TopN operator is very restrictive as it only supports doing the
> filtering on the shuffle keys, but it is better to do this before breaking
> the vectors into rows and losing the isRepeating properties.
>
> Adding a TopN Key operator in the physical operator tree allows the following
> to happen.
>
> GBY->RS(Top=1)
>
> can become
>
> TNK(1)->GBY->RS(Top=1)
>
> So that, the TopNKey can remove rows before they are buffered into the GBY
> and consume memory.
>
> Here's the equivalent implementation in Presto
> https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35
>
> Adding this as a sub-feature of GroupBy prevents further optimizations if the
> GBY is on keys "a,b,c" and the TopNKey is on just "a".

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
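
As a rough illustration of the TNK(1)->GBY->RS(Top=1) idea in the quoted issue description, here is a minimal Java sketch of a key filter that sits in front of a buffering operator. It is not the code in HIVE-17896.12.patch and not Hive's operator API: the names `TopNKeySketch`, `TopNKeyFilter`, `canForward`, and `filter` are hypothetical, rows are reduced to plain integer keys, and ascending order is assumed. The filter remembers the N smallest distinct keys seen so far and drops any row whose key already sorts after all of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Hive code base.
class TopNKeySketch {

    /** Tracks the N smallest distinct keys seen so far, per the given order. */
    static final class TopNKeyFilter<K> {
        private final int topN;
        private final Comparator<? super K> order;
        private final TreeSet<K> keys;

        TopNKeyFilter(int topN, Comparator<? super K> order) {
            if (topN < 1) {
                throw new IllegalArgumentException("topN must be >= 1");
            }
            this.topN = topN;
            this.order = order;
            this.keys = new TreeSet<>(order);
        }

        /** Returns true if a row with this key may still belong to the top N. */
        boolean canForward(K key) {
            if (keys.contains(key)) {
                return true;               // key is already among the tracked top N
            }
            if (keys.size() < topN) {
                keys.add(key);             // still filling up the first N keys
                return true;
            }
            if (order.compare(key, keys.last()) > 0) {
                return false;              // sorts after every tracked key: drop early
            }
            keys.add(key);                 // better than the current worst key
            keys.pollLast();               // evict the worst key to stay at N
            return true;
        }
    }

    /** Applies the filter to a stream of keys and returns the rows that pass. */
    static List<Integer> filter(List<Integer> rows, int topN) {
        TopNKeyFilter<Integer> tnk =
                new TopNKeyFilter<>(topN, Comparator.<Integer>naturalOrder());
        List<Integer> forwarded = new ArrayList<>();
        for (Integer row : rows) {
            if (tnk.canForward(row)) {
                forwarded.add(row);
            }
        }
        return forwarded;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(5, 3, 8, 3, 1, 9, 1, 4);
        System.out.println(filter(rows, 2)); // prints [5, 3, 3, 1, 1]
    }
}
```

In this sketch the first row (key 5) passes even though 5 is not in the final top 2, because the filter only knows the keys seen so far. That is why the description keeps RS(Top=1) downstream: the early filter only reduces what reaches the group-by buffer (here rows 8, 9 and 4 never do), and the final cut still happens later.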