Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14306 )
Change subject: add a tool to create table ...................................................................... Patch Set 5: (20 comments) I haven't reviewed the new tests yet, but I glanced at them briefly and it looks like there's a lot of potential to reuse code rather than repeating the same code over and over for each create table. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/client/schema.cc File src/kudu/client/schema.cc: http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/client/schema.cc@583 PS5, Line 583: const std::string& type) { Indentation of the continuation still isn't right. Please review the Google Style Guide link I dropped into a comment in an earlier revision. http://gerrit.cloudera.org:8080/#/c/14306/3/src/kudu/tools/tool.proto File src/kudu/tools/tool.proto: http://gerrit.cloudera.org:8080/#/c/14306/3/src/kudu/tools/tool.proto@346 PS3, Line 346: message BoundPB { > message RangeBoundPB { I see what you mean: if we use an enum, users will have to specify a numeric value instead of the friendlier "inclusive" or "exclusive" strings. There's a trade-off though: if we go the string route, you'll need to write more validation code to ensure the string values are one of two correct ones. Furthermore, the schema itself won't really tell users what the legal values are (without a comment anyway). So I think enum is still the way to go. We can change it later if it proves to be too confusing. http://gerrit.cloudera.org:8080/#/c/14306/3/src/kudu/tools/tool.proto@378 PS3, Line 378: > After removing "require" and "optional", there are errors reported at compi Right, although we're using PB3, we're still using proto2 syntax (see L17), which mandates required, optional, or repeated for each field. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool.proto File src/kudu/tools/tool.proto: http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool.proto@330 PS5, Line 330: message ColumnPB { Need to include precision and scale for DECIMAL. Should probably include block size too. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool.proto@341 PS5, Line 341: enum Type { : EXCLUSIVE = 0; : INCLUSIVE = 1; : } This definition can be scoped to BoundPB's definition. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool.proto@366 PS5, Line 366: message HashPartitionPB { We should also support specifying a hash seed. See add_hash_partitions(). http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool.proto@381 PS5, Line 381: message CreateTablePB { We should support replication factor, extra configs, and dimension label. See the KuduTableCreator class for details. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc File src/kudu/tools/tool_action_table.cc: http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@162 PS5, Line 162: Global non-POD types are a bad idea in C++ as their initialization order is undefined (leading to problems in library code). Instead of this, how about declaring them statically as before, in new functions that take a string as an argument and return either the EncodingType or CompressionType? http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@517 PS5, Line 517: KuduColumnSchema::DataType>>& range_colums, This is a continuation of L516 so should be indented some more (aligned with 'string' ideally). http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@531 PS5, Line 531: if (values.size() != range_colums.size()) { Should be called range_columns. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@754 PS5, Line 754: case KuduColumnSchema::DataType::DECIMAL: We should probably address this limitation as it's perfectly legal for DECIMAL columns to have default values. See KuduValue::FromDecimal for details. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@887 PS5, Line 887: "please use json_object or json_file flag to create a table")); Needs to be updated. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@931 PS5, Line 931: if (range_keys.find(column.column_name()) != range_keys.end()) { Should use ContainsKey() from gutil/map-util.h. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@937 PS5, Line 937: ParseValueOfType(column.default_value(), Wrap in RETURN_NOT_OK. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@938 PS5, Line 938: KuduColumnSchema::StringToDataType(column.column_type()), &value); Indentation; see the GSG link. Below too. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@939 PS5, Line 939: if (value == nullptr) { This isn't possible; if value was null, ParseValueOfType would have returned an error. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@943 PS5, Line 943: google_status.error_message().ToString()); This isn't relevant to this error. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@964 PS5, Line 964: KuduPartialRow* lower_bound = kudu_schema.NewRow(); : KuduPartialRow* upper_bound = kudu_schema.NewRow(); These should be stored in unique_ptrs and then released into add_range_partition, to avoid leaking memory in the RETURN_NOT_OK code paths. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@970 PS5, Line 970: for (const auto& value : bound.lower_bound().bound_values()) { : values += value; : values += ","; : } You should be able to use a function from gutil/strings/join.h here and below. http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@1219 PS5, Line 1219: .AddRequiredParameter({ kCreateTableJSONArg, "JOSN object or file for creating table" }) It's just a JSON object though; if the user passed a file, they did it via shell redirection and by the time the CLI gets its hands on it it's just raw JSON data and no file. -- To view, visit http://gerrit.cloudera.org:8080/14306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0bce4733a504f8ef5f024089a16fe3c6f1e493f1 Gerrit-Change-Number: 14306 Gerrit-PatchSet: 5 Gerrit-Owner: YangSong <sy1...@yeah.net> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: YangSong <sy1...@yeah.net> Gerrit-Comment-Date: Fri, 27 Sep 2019 23:54:22 +0000 Gerrit-HasComments: Yes