Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16242 )

Change subject: IMPALA-9979: part 2: partitioned top-n
......................................................................


Patch Set 28:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16242/28//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16242/28//COMMIT_MSG@59
PS28, Line 59: and the tie-handling
             : semantics required by rank() predicates
nit: I think this was really implemented in your previous patch?


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.h
File be/src/exec/topn-node.h:

http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.h@64
PS28, Line 64:     int64_t limit = is_partitioned() ? per_partition_limit() :
What's the relationship between 'include_ties' and 'is_partitioned', i.e. why 
does 'include_ties' here matter for the unpartitioned case but not the 
partitioned case?


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@244
PS28, Line 244: U
typo


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@399
PS28, Line 399:     RETURN_IF_ERROR(QueryMaintenance(state));
This results in two calls to QueryMaintenance() in quick succession, here and 
in GetNext(). It might be better to avoid that.
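
One possible arrangement, as a rough sketch only: the caller does the maintenance check once and the helper does not repeat it. This assumes the code at line 399 is only reached through GetNext(), which I have not verified. Status, RuntimeState, PrepareOutputSketch() and GetNextSketch() below are hypothetical stand-ins, not the real Impala types or functions.

```cpp
#include <cassert>

// Hypothetical stand-ins, not Impala's real Status/RuntimeState.
struct Status {
  bool ok = true;
};

struct RuntimeState {
  int maintenance_calls = 0;  // counts calls so the sketch can be checked
};

// Stand-in for QueryMaintenance(); it only records that it was called.
Status QueryMaintenance(RuntimeState* state) {
  ++state->maintenance_calls;
  return Status();
}

// Hypothetical helper standing in for the code around line 399.
// It relies on the caller having already run QueryMaintenance().
Status PrepareOutputSketch(RuntimeState* state) {
  (void)state;  // no second maintenance call here
  return Status();
}

// Hypothetical GetNext(): the single place where maintenance runs.
Status GetNextSketch(RuntimeState* state) {
  Status s = QueryMaintenance(state);
  if (!s.ok) return s;
  return PrepareOutputSketch(state);
}
```

With this shape, one GetNextSketch() call triggers exactly one maintenance call. If the helper has other callers, the call would need to stay where it is and be dropped from GetNext() instead.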


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@566
PS28, Line 566: be
typo


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@666
PS28, Line 666: vector<unique_ptr<Heap>> rematerialized_heaps;
              :   for (auto& entry : partition_heaps_) {
              :     RETURN_IF_ERROR(entry.second->RematerializeTuples(this, state, temp_pool.get()));
              :     DCHECK(entry.second->DCheckConsistency());
              :     // The key references memory in 'tuple_pool_'. Replace it with a rematerialized tuple.
              :     rematerialized_heaps.push_back(move(entry.second));
              :   }
              :   partition_heaps_.clear();
              :   for (auto& heap_ptr : rematerialized_heaps) {
              :     const Tuple* key_tuple = heap_ptr->top();
              :     partition_heaps_.emplace(key_tuple, move(heap_ptr));
              :   }
I think this can go in an 'else' branch of the 'if (heap_ != nullptr)' above, 
to make the partitioned vs. unpartitioned handling clearer.
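
Roughly what I have in mind, as a simplified sketch rather than the actual patch. It assumes 'heap_' is non-null only in the unpartitioned case. Heap, TopNNodeSketch and RematerializeAll() are hypothetical stand-ins: an int replaces the key tuple, and error handling and the DCHECK are left out.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real heap; an int replaces the key tuple.
struct Heap {
  explicit Heap(int k) : key(k) {}
  int key;
  bool rematerialized = false;
  void RematerializeTuples() { rematerialized = true; }
  int top() const { return key; }
};

// Hypothetical stand-in for the node state touched by the quoted code.
struct TopNNodeSketch {
  std::unique_ptr<Heap> heap_;  // assumed set only when unpartitioned
  std::map<int, std::unique_ptr<Heap>> partition_heaps_;

  void RematerializeAll() {
    if (heap_ != nullptr) {
      // Unpartitioned: a single heap to rematerialize.
      heap_->RematerializeTuples();
    } else {
      // Partitioned: rematerialize every heap, then rebuild the map,
      // since the old keys reference memory that is being replaced.
      std::vector<std::unique_ptr<Heap>> rematerialized_heaps;
      for (auto& entry : partition_heaps_) {
        entry.second->RematerializeTuples();
        rematerialized_heaps.push_back(std::move(entry.second));
      }
      partition_heaps_.clear();
      for (auto& heap_ptr : rematerialized_heaps) {
        // Read the key before moving the pointer into the map.
        const int key = heap_ptr->top();
        partition_heaps_.emplace(key, std::move(heap_ptr));
      }
    }
  }
};
```

The behaviour should be the same as the current code, since 'partition_heaps_' is presumably empty whenever 'heap_' is set. The 'else' just makes it explicit that the two paths are mutually exclusive.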


http://gerrit.cloudera.org:8080/#/c/16242/28/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/16242/28/common/thrift/ImpalaService.thrift@625
PS28, Line 625:   // If > 0, the rank()/row_number() pushdown into pre-analytic sorts is enabled
Maybe note the default value, and briefly describe the issues with setting it higher.



--
To view, visit http://gerrit.cloudera.org:8080/16242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Gerrit-Change-Number: 16242
Gerrit-PatchSet: 28
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Shant Hovsepian <sh...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 02 Feb 2021 00:25:09 +0000
Gerrit-HasComments: Yes
