[Impala-ASF-CR] IMPALA-11430: Support custom hash schema for Kudu range tables

Alexey Serbin (Code Review) Thu, 14 Jul 2022 15:06:30 -0700

Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18676 )


Change subject: IMPALA-11430: Support custom hash schema for Kudu range tables
......................................................................


Patch Set 11:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/18676/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18676/11//COMMIT_MSG@11
PS11, Line 11: to
nit: drop duplicate 'to'


http://gerrit.cloudera.org:8080/#/c/18676/11//COMMIT_MSG@23
PS11, Line 23: ,
I haven't yet looked into the rest of the code, but I'm curious: is the syntax 
without these commas separating hash dimensions now supported for CREATE TABLE 
as well?  If yes, maybe mention that in the description.


http://gerrit.cloudera.org:8080/#/c/18676/11/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test:

http://gerrit.cloudera.org:8080/#/c/18676/11/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test@324
PS11, Line 324: hash partitions 10
Is this a valid syntax at all?  What hash schema was assumed for the new range 
if the varying number of hash dimensions was supported?  I see that the 
table-wide hash schema has two dimension, and the assumed one-dimensional hash 
schema for the new range doesn't have hash columns specified.


http://gerrit.cloudera.org:8080/#/c/18676/11/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test@347
PS11, Line 347: '30 <= VALUES < 40 HASH(id) PARTITIONS 2 HASH(c2) PARTITIONS 3'
Does it makes sense to add a range with hash schema with reverse order of hash 
dimensions to make sure it's output as expected by 'SHOW HASH SCHEMA'?

Something like
  40 <= VALUES < 50 HASH(c2) PARTITIONS 5 HASH(id) PARTITIONS 3


http://gerrit.cloudera.org:8080/#/c/18676/11/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test@377
PS11, Line 377: '30 <= VALUES < 40 HASH(id,c2) PARTITIONS 2'
Isn't it supposed to be the table-wide hash schema in this case, i.e. 'HASH(id) 
PARTITIONS 3' instead?  I see the corresponding ALTER statement mentions just 
the number of hash partitions, so I'd expect that syntax would be either 
rejected or attributed to the table-wide hash schema:

  alter table custom_hash_range_single add range partition 30 <= values < 40 
hash partitions 2


http://gerrit.cloudera.org:8080/#/c/18676/11/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test@380
PS11, Line 380: ====
Does it make sense to also add the following cases to check how the error is 
reported back to the user:

  * ALTER TABLE custom_hash_range_single ADD RANGE PARTITION -400 <= VALUES < 
-300 HASH(id, c2) PARTITIONS 1
  * ALTER TABLE custom_hash_range_single ADD RANGE PARTITION -300 <= VALUES < 
-200 HASH(id, c2) PARTITIONS 0
  * ALTER TABLE custom_hash_range_single ADD RANGE PARTITION -200 <= VALUES < 
-100 HASH(id, c2) PARTITIONS -1



--
To view, visit http://gerrit.cloudera.org:8080/18676
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I981056e0827f4957580706d6e73742e4e6743c1c
Gerrit-Change-Number: 18676
Gerrit-PatchSet: 11
Gerrit-Owner: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Thu, 14 Jul 2022 22:02:08 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11430: Support custom hash schema for Kudu range tables

Reply via email to