Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18604 )

Change subject: [Tools] Support to config hash bucket numbers when copy a table
......................................................................


Patch Set 6:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/18604/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18604/6//COMMIT_MSG@10
PS6, Line 10: config
configure


http://gerrit.cloudera.org:8080/#/c/18604/6//COMMIT_MSG@12
PS6, Line 12: may
might


http://gerrit.cloudera.org:8080/#/c/18604/6//COMMIT_MSG@13
PS6, Line 13: the old table
the table


http://gerrit.cloudera.org:8080/#/c/18604/6//COMMIT_MSG@12
PS6, Line 12: stored with large
            : of data in it
contained a lot of data


http://gerrit.cloudera.org:8080/#/c/18604/6//COMMIT_MSG@14
PS6, Line 14: to store large of data
drop this part


http://gerrit.cloudera.org:8080/#/c/18604/6//COMMIT_MSG@14
PS6, Line 14: And there no method to change the number
            : of hash bucket when the table has already been created
And there isn't a way to change the number of hash buckets in the partition schema of an already existing table.


http://gerrit.cloudera.org:8080/#/c/18604/6/src/kudu/tools/table_scanner.cc
File src/kudu/tools/table_scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18604/6/src/kudu/tools/table_scanner.cc@433
PS6, Line 433:   int bucket_num = 0;
             :   bool is_number = true;
nit: consider moving these variables where they belong -- inside the 'for()' cycle below.


http://gerrit.cloudera.org:8080/#/c/18604/6/src/kudu/tools/table_scanner.cc@437
PS6, Line 437: 
nit: misaligned indent


http://gerrit.cloudera.org:8080/#/c/18604/6/src/kudu/tools/table_scanner.cc@450
PS6, Line 450:   int i = 0;
             :   for (const auto& hash_dimension : partition_schema.hash_schema()) {
             :     int num_buckets = hash_bucket_nums[i] != -1 ? hash_bucket_nums[i] :
             :         hash_dimension.num_buckets;
             :     auto hash_columns = convert_column_ids_to_names(hash_dimension.column_ids);
             :     table_creator->add_hash_partitions(hash_columns,
             :                                        num_buckets,
             :                                        hash_dimension.seed);
             :     i++;
             :   }
What if the number of hash buckets specified in the command line doesn't match the number of hash dimensions in the table?  Say, the number of the specified dimensions in the command line is greater than the actual number of hash buckets?  Should there be an error reported?

Also, it would be great to add a check for the number of hash buckets specified: it must be greater than or equal to 2.



-- 
To view, visit http://gerrit.cloudera.org:8080/18604
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cec38e5ea09c66bfed20622b85033602da60d41
Gerrit-Change-Number: 18604
Gerrit-PatchSet: 6
Gerrit-Owner: Wang Xixu <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Mon, 27 Jun 2022 14:24:26 +0000
Gerrit-HasComments: Yes
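
For reference, the two checks asked for in the table_scanner.cc@450 comment could look roughly like the sketch below. It is only an illustration under stated assumptions: ValidateHashBucketNums() is a hypothetical helper that exists neither in the patch nor in Kudu, it returns a plain std::string instead of a kudu::Status so that it compiles on its own, and it assumes that -1 keeps meaning "reuse the source table's bucket count for this dimension", as in the quoted snippet.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the patch or of Kudu's API).
// Returns an empty string when 'hash_bucket_nums' is acceptable for a table
// with 'num_hash_dimensions' hash dimensions; otherwise returns a message
// describing the first problem found.
// Assumption: -1 is the sentinel for "keep the source table's bucket count".
std::string ValidateHashBucketNums(const std::vector<int>& hash_bucket_nums,
                                   size_t num_hash_dimensions) {
  // One value is expected per hash dimension of the source table.
  if (hash_bucket_nums.size() != num_hash_dimensions) {
    return "expected " + std::to_string(num_hash_dimensions) +
           " hash bucket numbers, got " +
           std::to_string(hash_bucket_nums.size());
  }
  // Every explicitly specified value must be at least 2.
  for (size_t i = 0; i < hash_bucket_nums.size(); ++i) {
    const int n = hash_bucket_nums[i];
    if (n != -1 && n < 2) {
      return "hash bucket number at position " + std::to_string(i) +
             " must be at least 2, got " + std::to_string(n);
    }
  }
  return "";
}
```

Running such a check before the loop would also make indexing hash_bucket_nums[i] inside it safe, since the vector length is then known to equal the number of hash dimensions.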
