Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17779 )

Change subject: KUDU-2671 update range partitioning with custom hash schema
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17779/4/src/kudu/client/client.cc
File src/kudu/client/client.cc:

http://gerrit.cloudera.org:8080/#/c/17779/4/src/kudu/client/client.cc@1020
PS4, Line 1020: range->has_table_wide_hash_schema_
Just curious, and maybe I'm missing something, but is there a difference
between using this and checking whether hash_schema_ is empty?


http://gerrit.cloudera.org:8080/#/c/17779/4/src/kudu/common/common.proto
File src/kudu/common/common.proto:

http://gerrit.cloudera.org:8080/#/c/17779/4/src/kudu/common/common.proto@353
PS4, Line 353:   // This data structure represents a range partition with a custom hash schema.
             :   message RangeWithHashSchemaPB {
             :     // Row operations containing the lower and upper range bound for the range.
             :     optional RowOperationsPB range_bounds = 1;
             :     // Hash schema for the range.
             :     repeated HashBucketSchemaPB hash_schema = 2;
             :   }
I wonder if it makes sense to decouple the idea of ranges and tablets a bit.
For example, say we want 12 ranges, one per month, where 10 months have 1 hash
bucket and 2 months have 3 hash buckets. How feasible would it be to store that
as two ranges in the partition schema: one range spanning the 10 months, the
other spanning the 2 months? We may end up saving some storage for such
sparsely modified partition definitions.

I suppose such optimizations/compressions could be performed by the master, if 
there's any concern about the size of the schema we send to clients.
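
To make the idea concrete, here is a rough sketch of the coalescing step, not
code from the patch. The HashDimension and RangeWithHashSchema structs and the
CoalesceRanges function are hypothetical stand-ins for HashBucketSchemaPB and
RangeWithHashSchemaPB, with range bounds simplified to integers covering
[lower, upper). It assumes the input ranges are sorted by lower bound and do
not overlap.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one HashBucketSchemaPB entry: the hashed columns
// and the number of buckets for that hash dimension.
struct HashDimension {
  std::vector<std::string> columns;
  int num_buckets;

  bool operator==(const HashDimension& other) const {
    return columns == other.columns && num_buckets == other.num_buckets;
  }
};

// Hypothetical stand-in for RangeWithHashSchemaPB. The real message stores the
// bounds as RowOperationsPB; plain integers keep the sketch short.
struct RangeWithHashSchema {
  int lower;  // inclusive
  int upper;  // exclusive
  std::vector<HashDimension> hash_schema;
};

// Merges neighboring ranges that are contiguous and share the same hash
// schema. Assumes 'ranges' is sorted by lower bound and non-overlapping.
std::vector<RangeWithHashSchema> CoalesceRanges(
    const std::vector<RangeWithHashSchema>& ranges) {
  std::vector<RangeWithHashSchema> merged;
  for (const auto& range : ranges) {
    if (!merged.empty() &&
        merged.back().upper == range.lower &&
        merged.back().hash_schema == range.hash_schema) {
      // Same schema and no gap: extend the previous entry.
      merged.back().upper = range.upper;
    } else {
      merged.push_back(range);
    }
  }
  return merged;
}
```

Note that the 12 monthly ranges collapse to exactly two entries only when the
2 months with 3 hash buckets are adjacent and sit at one end of the year. If
they fall in the middle, or are not adjacent to each other, this kind of
merging yields more entries.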

That said, I don't love the idea of having range bounds refer to both the upper 
and lower bound of a given tablet, and the upper and lower bound of multiple 
tablets that share the same hash schema.



--
To view, visit http://gerrit.cloudera.org:8080/17779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37aae56a33170894f30d6cd73a5698d6cbb7a697
Gerrit-Change-Number: 17779
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Comment-Date: Sat, 28 Aug 2021 07:00:49 +0000
Gerrit-HasComments: Yes
