Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14306 )

Change subject: add a tool to create table
......................................................................


Patch Set 20:

(5 comments)

I experimented with JSON serialization and enums, and I learned that you can 
specify an enum as a integer value _or_ as a string. So we should definitely 
use enums throughout, as this allows both the ease of use of specifying a 
string value or the terseness of specifying an integer value, and the enum 
definition itself provides a good source of documentation.

Here's my code:

test.proto:
  syntax = "proto2";

  message Foo {
    enum Color {
      BLUE = 0;
      RED = 1;
      GREEN = 2;
    }
    optional Color c = 1;
  }


test.cc:
  #include <iostream>

  #include <google/protobuf/stubs/status.h>
  #include <google/protobuf/stubs/stringpiece.h>
  #include <google/protobuf/util/json_util.h>

  #include "test.pb.h"

  using namespace std;

  int main(int argc, char* argv[]) {
  if (argc != 2) {
    cout << "Usage: " << argv[0] << " <json>" << endl;
    return 1;
  }
  cout << "Attempting to parse JSON: \'" << argv[1] << "\'" << endl;
  Foo message;
  const auto& google_status = 
google::protobuf::util::JsonStringToMessage(argv[1], &message);
  if (!google_status.ok()) {
    cout << "unable to parse JSON: " << 
google_status.error_message().ToString() << endl;
    return 1;
  }

  cout << message.DebugString();
  return 0;
}

I compiled it with:
- protoc --cpp_out . test.proto
- g++ -o test test.cc test.pb.cc -I. -lprotobuf

Then I ran it a few times:
$ ./test "{ c : 1 }"
Attempting to parse JSON: '{ c : 1 }'
c: RED
$ ./test "{ c : red }"
Attempting to parse JSON: '{ c : red }'
unable to parse JSON: Unexpected token.
{ c : red }
      ^
$ ./test "{ c : 'red' }"
Attempting to parse JSON: '{ c : 'red' }'
c: RED
$ ./test "{ c : 'green' }"
Attempting to parse JSON: '{ c : 'green' }'
c: GREEN
$ ./test "{ c : 'pink' }"
Attempting to parse JSON: '{ c : 'pink' }'
unable to parse JSON: c: invalid value "pink" for type TYPE_ENUM
$ ./test "{ c : 2 }"
Attempting to parse JSON: '{ c : 2 }'
c: GREEN

http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto
File src/kudu/tools/tool.proto:

http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@335
PS18, Line 335:     // GROUP_VARINT encoding is deprecated and no longer 
implemented.
              :     GROUP_VARINT = 3;
Should be removed in new code.


http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@359
PS18, Line 359: Default type is AUTO_ENCODING, the types
              :   // PLAIN_ENCODING, PREFIX_ENCODING, GROUP_VARINT, RLE,
              :   // DICT_ENCODING, BIT_SHUFFLE are also supported.
Not necessary; it's obvious from the enum definition.


http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@363
PS18, Line 363: Default type is DEFAULT_COMPRESSION, the
              :   // types NO_COMPRESSION, SNAPPY, LZ4, ZLIB are also supported.
Not necessary; it's obvious from the enum definition.


http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:

http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool_action_table.cc@910
PS18, Line 910:          column.encoding()));
> Here if the user uses an incorrect value, such as '"encoding" : 100', the v
Right, the way in which we usually handle that is by defining an 
UNKNOWN_ENCODING (or UNKNOWN_COMPRESSION or whatever) enum entry that's first 
in the PB enum. You could set it to a high value (like 999) to ensure that 
users don't accidentally specify it, or just set it to 0. Either way, its 
presence means that invalid values will be set to UNKNOWN_ENCODING, which you 
can then filter for in the code and return an error if you see them.

You could do the same for INCLUSIVE/EXCLUSIVE if you want to catch mistakes 
there too (vs. converting a mistake into INCLUSIVE).

Oh, I understand the problem now: you want these new enum values to match up 
with the values defined in schema.h so you can pass them through 1-1. I'm not 
sure if we should do that; it'll make it difficult to evolve one side without 
also changing the other side.


http://gerrit.cloudera.org:8080/#/c/14306/20/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:

http://gerrit.cloudera.org:8080/#/c/14306/20/src/kudu/tools/tool_action_table.cc@1017
PS20, Line 1017:   unique_ptr<KuduTableCreator> 
table_creator(client->NewTableCreator());
> 'table_creator' must be defined behind 'kudu_schema', otherwise in ASAN and
Makes sense; client.h says "  ///   Schema to use. Must remain valid for the 
lifetime of the builder.".



--
To view, visit http://gerrit.cloudera.org:8080/14306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0bce4733a504f8ef5f024089a16fe3c6f1e493f1
Gerrit-Change-Number: 14306
Gerrit-PatchSet: 20
Gerrit-Owner: YangSong <sy1...@yeah.net>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: YangSong <sy1...@yeah.net>
Gerrit-Comment-Date: Wed, 16 Oct 2019 17:05:12 +0000
Gerrit-HasComments: Yes

Reply via email to