Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14306 )
Change subject: add a tool to create table ...................................................................... Patch Set 20: (5 comments) I experimented with JSON serialization and enums, and I learned that you can specify an enum as a integer value _or_ as a string. So we should definitely use enums throughout, as this allows both the ease of use of specifying a string value or the terseness of specifying an integer value, and the enum definition itself provides a good source of documentation. Here's my code: test.proto: syntax = "proto2"; message Foo { enum Color { BLUE = 0; RED = 1; GREEN = 2; } optional Color c = 1; } test.cc: #include <iostream> #include <google/protobuf/stubs/status.h> #include <google/protobuf/stubs/stringpiece.h> #include <google/protobuf/util/json_util.h> #include "test.pb.h" using namespace std; int main(int argc, char* argv[]) { if (argc != 2) { cout << "Usage: " << argv[0] << " <json>" << endl; return 1; } cout << "Attempting to parse JSON: \'" << argv[1] << "\'" << endl; Foo message; const auto& google_status = google::protobuf::util::JsonStringToMessage(argv[1], &message); if (!google_status.ok()) { cout << "unable to parse JSON: " << google_status.error_message().ToString() << endl; return 1; } cout << message.DebugString(); return 0; } I compiled it with: - protoc --cpp_out . test.proto - g++ -o test test.cc test.pb.cc -I. -lprotobuf Then I ran it a few times: $ ./test "{ c : 1 }" Attempting to parse JSON: '{ c : 1 }' c: RED $ ./test "{ c : red }" Attempting to parse JSON: '{ c : red }' unable to parse JSON: Unexpected token. { c : red } ^ $ ./test "{ c : 'red' }" Attempting to parse JSON: '{ c : 'red' }' c: RED $ ./test "{ c : 'green' }" Attempting to parse JSON: '{ c : 'green' }' c: GREEN $ ./test "{ c : 'pink' }" Attempting to parse JSON: '{ c : 'pink' }' unable to parse JSON: c: invalid value "pink" for type TYPE_ENUM $ ./test "{ c : 2 }" Attempting to parse JSON: '{ c : 2 }' c: GREEN http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto File src/kudu/tools/tool.proto: http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@335 PS18, Line 335: // GROUP_VARINT encoding is deprecated and no longer implemented. : GROUP_VARINT = 3; Should be removed in new code. http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@359 PS18, Line 359: Default type is AUTO_ENCODING, the types : // PLAIN_ENCODING, PREFIX_ENCODING, GROUP_VARINT, RLE, : // DICT_ENCODING, BIT_SHUFFLE are also supported. Not necessary; it's obvious from the enum definition. http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@363 PS18, Line 363: Default type is DEFAULT_COMPRESSION, the : // types NO_COMPRESSION, SNAPPY, LZ4, ZLIB are also supported. Not necessary; it's obvious from the enum definition. http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool_action_table.cc File src/kudu/tools/tool_action_table.cc: http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool_action_table.cc@910 PS18, Line 910: column.encoding())); > Here if the user uses an incorrect value, such as '"encoding" : 100', the v Right, the way in which we usually handle that is by defining an UNKNOWN_ENCODING (or UNKNOWN_COMPRESSION or whatever) enum entry that's first in the PB enum. You could set it to a high value (like 999) to ensure that users don't accidentally specify it, or just set it to 0. Either way, its presence means that invalid values will be set to UNKNOWN_ENCODING, which you can then filter for in the code and return an error if you see them. You could do the same for INCLUSIVE/EXCLUSIVE if you want to catch mistakes there too (vs. converting a mistake into INCLUSIVE). Oh, I understand the problem now: you want these new enum values to match up with the values defined in schema.h so you can pass them through 1-1. I'm not sure if we should do that; it'll make it difficult to evolve one side without also changing the other side. http://gerrit.cloudera.org:8080/#/c/14306/20/src/kudu/tools/tool_action_table.cc File src/kudu/tools/tool_action_table.cc: http://gerrit.cloudera.org:8080/#/c/14306/20/src/kudu/tools/tool_action_table.cc@1017 PS20, Line 1017: unique_ptr<KuduTableCreator> table_creator(client->NewTableCreator()); > 'table_creator' must be defined behind 'kudu_schema', otherwise in ASAN and Makes sense; client.h says " /// Schema to use. Must remain valid for the lifetime of the builder.". -- To view, visit http://gerrit.cloudera.org:8080/14306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0bce4733a504f8ef5f024089a16fe3c6f1e493f1 Gerrit-Change-Number: 14306 Gerrit-PatchSet: 20 Gerrit-Owner: YangSong <sy1...@yeah.net> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: YangSong <sy1...@yeah.net> Gerrit-Comment-Date: Wed, 16 Oct 2019 17:05:12 +0000 Gerrit-HasComments: Yes