Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15758 )

Change subject: [partitioning] KUDU-2671 [part 1] Support for different hash 
partitioning per range partition.
......................................................................


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/15758/7/src/kudu/common/partition-test.cc
File src/kudu/common/partition-test.cc:

http://gerrit.cloudera.org:8080/#/c/15758/7/src/kudu/common/partition-test.cc@864
PS7, Line 864:   // CREATE TABLE t (k INT, f INT),
nit: there's no "f" column (any more?)


http://gerrit.cloudera.org:8080/#/c/15758/7/src/kudu/common/partition-test.cc@877
PS7, Line 877:   vector<PartitionSchema::HashBucketSchema> hash_buckets1;
             :   
hash_buckets1.emplace_back(PartitionSchema::HashBucketSchema{{ColumnId(0)}, 2, 
0});
             :   vector<PartitionSchema::HashBucketSchema> hash_buckets2;
             :   
hash_buckets2.emplace_back(PartitionSchema::HashBucketSchema{{ColumnId(0)}, 3, 
0});
             :   boost::optional<PartitionSchema::RangeHashBuckets> 
range_hash_buckets = {
> Done
Thanks! Did you mean to push review feedback from this patch into the part 2 
patch? I see most of the comments addressed there.


http://gerrit.cloudera.org:8080/#/c/15758/3/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/15758/3/src/kudu/common/partition.cc@368
PS3, Line 368:
> Here we compute hash partitions for "outer" hash
Could you also add a test for the case where `hash_bucket_schemas_` is 
specified?

One thing that I'd like to hammer down before committing, this: why do we want 
this example to generate 8 buckets instead of 2? If we completely ignored 
`hash_bucket_schemas_` when `range_hash_buckets` is specified, users would have 
more flexibility to change schemas. And `hash_bucket_schemas_` could be used as 
a default in case `range_hash_buckets` isn't specified in a later `ADD RANGE 
PARTITION` statement.


http://gerrit.cloudera.org:8080/#/c/15758/7/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/15758/7/src/kudu/common/partition.cc@415
PS7, Line 415:       // Copy all partitions created in Step 1 above
nit: I think we can be a bit more descriptive with what we're doing here, maybe 
something like:

"We'll combine the hash bucketing schemas for this range into 
'all_hash_partitions', starting with the base set of partitions generated 
above."

That way it's more explicitly expressed to readers why this logic is coalescing 
into 'all_hash_partitions'.


http://gerrit.cloudera.org:8080/#/c/15758/7/src/kudu/common/partition.cc@419
PS7, Line 419:         // Create a partition per range bound and hash bucket 
combination.
nit: maybe, "Combine each bucket from this bucket schema with every partition 
we've generated in this range so far."


http://gerrit.cloudera.org:8080/#/c/15758/7/src/kudu/common/partition.cc@432
PS7, Line 432:         all_hash_partitions.swap(new_partitions);
> That's correct. Multiple hash schemas produce combination.
Ah, I misinterpreted what these loops were doing. Thanks for clarifying! I'll 
leave some nits for some comments that I think might help emphasize this 
combinatorial logic.



--
To view, visit http://gerrit.cloudera.org:8080/15758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie843f37392240c3a47a589658d9702c2049ee011
Gerrit-Change-Number: 15758
Gerrit-PatchSet: 7
Gerrit-Owner: Volodymyr Verovkin <verjov...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Volodymyr Verovkin <verjov...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 May 2020 05:10:47 +0000
Gerrit-HasComments: Yes

Reply via email to