[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-09-26 Thread Adar Dembo (Code Review)
Hello Tidy Bot, Alexey Serbin, Dan Burkert, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7853

to look at the new patch set (#4).

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

WIP: use C++ ExternalMiniCluster for Java and Python tests

Maintaining Kudu clients across various languages has been an ongoing
maintenance burden. Even when the client is just a thin wrapper around
another client (e.g. Kudu Python bindings), a great deal of work goes into
client testability. In practice, this has meant a bespoke mini cluster
implementation for each language. On the surface this doesn't seem that bad;
we just need to spawn some masters and tservers, right? Well, the work
quickly adds up:

o While the C++ mini cluster is heavily used and has seen many improvements,
  the Java mini cluster has not received the same kind of love, and is less
  robust as a result. KUDU-1976 is a great example of this deficiency.
o With the inclusion of authn came the addition of a "mini KDC", a special
  daemon for Kerberized mini clusters. It was originally implemented in C++
  and ported to Java, but has yet to be ported to the Python client; this is
  one of the obstacles towards porting full authn support to Python.
o Dan has been prototyping Hive Metastore and Sentry integration for Kudu,
  the testing of which will require "mini HMS" and possibly "mini Sentry"
  testing implementations in C++, Java, and eventually, Python.

In sum, good support for non-C++ mini clusters is an ongoing commitment and
requires a great deal of work. This work hasn't always been forthcoming, and
the non-C++ clusters are deficient as a result. But it doesn't have to be
this way! Here's a thought: what if we reused the C++ mini cluster for tests
written in these other languages? We could write a "proxy" application whose
job it is to manage the C++ mini cluster and expose a rudimentary API that's
easily programmable from Java and Python.

This patch attempts to do just that. It adds a "shell" mode to the Kudu CLI
which provides a rudimentary control shell that can be used to spin up an
ExternalMiniCluster. The shell is controlled via a wire protocol over
stdin/stdout. The first cut of the protocol was sh-like, with simple
word-based commands. It was then revised into a PB-based protocol (with
optional JSON encoding) based on feedback. As a proof of concept, the patch
also replaces the bespoke Java mini cluster with callouts to the new shell.

I should add that I like the idea of shipping "shell" into production as
part of the CLI, as it helps realize the vision of a single Kudu artifact
that can provide Kudu testability for any integrating product.

WIP because, well, it should be pretty obvious. I was able to get through a
full run of "mvn verify" locally, so I have confidence that this can work.
But I'd like to solicit feedback on the general approach before spending
more time applying spit and polish.

Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
---
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
D java/kudu-client/src/test/java/org/apache/kudu/client/MiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
D java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMultipleLeaderFailover.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
D java/kudu-client/src/test/resources/flags
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/tool.proto
M src/kudu/tools/tool_action.h
A src/kudu/tools/tool_action_test.cc
M src/kudu/tools/tool_main.cc
17 files changed, 919 insertions(+), 1,256 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/7853/4
--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-Change-Number: 7853
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-09-26 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/7853 )

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG@47
PS2, Line 47:
: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on
> I'm on board with protobuf for schema / json as encoded format as well.
OK, PS3 changes the interface to use a protobuf-based schema with an optional 
JSON serialization. The former provides more type safety for clients who are 
willing to eat the protobuf dependency; the latter serves those who aren't.
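
To make the schema-plus-JSON trade-off concrete, here is a minimal sketch of
the idea. It is not the patch's code: a Python dataclass stands in for a
protobuf message, and the message name and its fields (CreateClusterRequest,
num_masters, num_tservers, enable_kerberos) are hypothetical, since the real
message definitions are not shown in this thread. The point is only that a
declared schema lets the receiver reject malformed input, while the JSON
encoding needs nothing beyond a JSON library on the client side.

```python
import json
from dataclasses import dataclass, asdict, fields


@dataclass
class CreateClusterRequest:
    # Hypothetical fields, invented for illustration only.
    num_masters: int = 1
    num_tservers: int = 3
    enable_kerberos: bool = False

    def to_json(self):
        # Serialize the typed request as a single JSON object.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text):
        # Check field names and types against the declared schema.
        raw = json.loads(text)
        known = {f.name: f.type for f in fields(cls)}
        for name, value in raw.items():
            if name not in known:
                raise ValueError(f"unknown field: {name}")
            if type(value) is not known[name]:
                raise TypeError(f"{name}: expected {known[name].__name__}")
        return cls(**raw)
```

A client that skips the schema entirely can still hand-write the JSON object;
it simply gives up the checks that from_json performs here.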




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-Change-Number: 7853
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 27 Sep 2017 02:13:20 +
Gerrit-HasComments: Yes


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-09-26 Thread Adar Dembo (Code Review)
Hello Tidy Bot, Alexey Serbin, Dan Burkert, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7853

to look at the new patch set (#3).

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

WIP: use C++ ExternalMiniCluster for Java and Python tests

Maintaining Kudu clients across various languages has been an ongoing
maintenance burden. Even when the client is just a thin wrapper around
another client (e.g. Kudu Python bindings), a great deal of work goes into
client testability. In practice, this has meant a bespoke mini cluster
implementation for each language. On the surface this doesn't seem that bad;
we just need to spawn some masters and tservers, right? Well, the work
quickly adds up:

o While the C++ mini cluster is heavily used and has seen many improvements,
  the Java mini cluster has not received the same kind of love, and is less
  robust as a result. KUDU-1976 is a great example of this deficiency.
o With the inclusion of authz came the addition of a "mini KDC", a special
  daemon for Kerberized mini clusters. It was originally implemented in C++
  and ported to Java, but has yet to be ported to the Python client; this is
  one of the obstacles towards porting full authz support to Python.
o Dan has been prototyping Hive Metastore and Sentry integration for Kudu,
  the testing of which will require "mini HMS" and possibly "mini Sentry"
  testing implementations in C++, Java, and eventually, Python.

In sum, good support for non-C++ mini clusters is an ongoing commitment and
requires a great deal of work. This work hasn't always been forthcoming, and
the non-C++ clusters are deficient as a result. But it doesn't have to be
this way! Here's a thought: what if we reused the C++ mini cluster for tests
written in these other languages? We could write a "proxy" application whose
job it is to manage the C++ mini cluster and expose a rudimentary API that's
easily programmable from Java and Python.

This patch attempts to do just that. It adds a "shell" mode to the Kudu CLI
which provides a rudimentary control shell that can be used to spin up an
ExternalMiniCluster. The shell is controlled via a wire protocol over
stdin/stdout. The first cut of the protocol was sh-like, with simple
word-based commands. It was then revised into a PB-based protocol (with
optional JSON encoding) based on feedback. As a proof of concept, the patch
also replaces the bespoke Java mini cluster with callouts to the new shell.

I should add that I like the idea of shipping "shell" into production as
part of the CLI, as it helps realize the vision of a single Kudu artifact
that can provide Kudu testability for any integrating product.

WIP because, well, it should be pretty obvious. I was able to get through a
full run of "mvn verify" locally, so I have confidence that this can work.
But I'd like to solicit feedback on the general approach before spending
more time applying spit and polish.

Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
---
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
D java/kudu-client/src/test/java/org/apache/kudu/client/MiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
D java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMultipleLeaderFailover.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
D java/kudu-client/src/test/resources/flags
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/tool.proto
M src/kudu/tools/tool_action.h
A src/kudu/tools/tool_action_test.cc
M src/kudu/tools/tool_main.cc
17 files changed, 919 insertions(+), 1,256 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/7853/3

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-Change-Number: 7853
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-09-01 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 36: run_cluster
bikesheddy, but I'd prefer 'kudu mini_cluster'.  Mini cluster is a pretty well 
established term of art.


PS2, Line 47: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
> Would it be possible to use JSON format as an internal format of the tool,
I'm on board with protobuf for schema / json as encoded format as well.


http://gerrit.cloudera.org:8080/#/c/7853/2/java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
File java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java:

Line 83: if (responses.size() == 0) {
Is this possible?  I would expect not due to the do part of do/while.



Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-08-28 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 47: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
> I went back and forth on this.
Would it be possible to use JSON as the internal format of the tool, with a 
shim layer that translates the command-line arguments into a JSON document 
which is then consumed internally?

The same could apply in reverse to the output.  As for the output, there are 
libraries like libxo to handle that in a uniform way:
  https://github.com/Juniper/libxo



Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-08-28 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..


Patch Set 2:

(2 comments)

> I think it's a great idea.  What about a more machine-oriented
 > interface for the CLI tool?  Do you expect it to be transformed
 > into something JSON-like in the near future?
 
See the comment Todd left; we're discussing just that.

 > Maybe it's worth running a proxy alongside the
 > minicluster and providing something like a REST interface instead of
 > a CLI for the tests?

I didn't implement a TCP-based connection between the proxy and the tests 
exactly so that the entire "port already in use" class of issues can be 
avoided. When communication is over a TCP socket, we either need to use a 
well-known port, which is prone to conflicts, or an ephemeral port, whose 
number needs to be communicated back to the tests. If we've already got a 
channel for communicating the port number (probably stdout), we can use that 
channel for control too.

I think a socket-based approach would make more sense if there were multiple 
consumers of a single mini cluster, but that's just not the case with our tests.
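
The pipe-based control channel described above can be sketched roughly as
follows. This is an illustration under assumptions, not the patch's
implementation: the child process is a hypothetical stand-in script rather
than the Kudu CLI, the request fields ("command", "num_tservers") are invented,
and the one-JSON-object-per-line framing is assumed. What it shows is that the
test harness never allocates or discovers a port, and that closing stdin
doubles as the "client disconnected" signal.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the control shell: it reads one JSON request per
# line from stdin, answers on stdout, and exits when stdin reaches EOF.
FAKE_SHELL = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    print(json.dumps({"ok": True, "echo": request}), flush=True)
"""


class ControlShell:
    """Drives a child process over pipes; no TCP port is ever allocated."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

    def call(self, request):
        # One request per line, one response per line (assumed framing).
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def close(self):
        # Closing stdin is the "disconnect" signal; the child sees EOF.
        self.proc.stdin.close()
        code = self.proc.wait(timeout=10)
        self.proc.stdout.close()
        return code


def demo():
    shell = ControlShell([sys.executable, "-c", FAKE_SHELL])
    reply = shell.call({"command": "start_cluster", "num_tservers": 3})
    exit_code = shell.close()
    return reply, exit_code
```

Calling demo() spawns the stand-in, performs one request/response round trip,
and then disconnects by closing the pipe.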

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 20: authz
> nit here and below: I think it should be 'authn' -- the kerberos-related ac
You are right, my bad.


PS2, Line 47: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
> general approach seems reasonable to me.
I went back and forth on this.

JSON (or protobuf, or thrift, or or or...) would certainly make the RPC system 
far more robust and maintainable. But we'd lose the ability to actually use 
run_cluster interactively from the command line, because no one wants to write 
JSON or whatever by hand.

Right now the noun/verb word-based RPC allows for interactivity and that's how 
I did much of the early testing. It's not robust, but it won't be broken by 
simple things like a minicluster dir with a space (it would be broken by 
unexpected newlines though).

Do you think the pros of using a real serialization format outweigh the loss of 
interactivity?



Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-08-28 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 47: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
> I went back and forth on this.
I'm just worried that, as we add features, the parsing is going to get more 
complicated than just noun/verb. E.g., what if I want to run a kerberized 
minicluster, but with a custom realm, or with two KDCs set up for cross-realm 
trust in the future, etc.?

Maybe the simplest case, "start a non-kerberized cluster with 3 servers", 
could be done interactively, since a developer might want to use this for local 
playing around, but anything more complicated (e.g., custom flags, more exotic 
configs) would require JSON?
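
One way to picture this hybrid is a parser that accepts a tiny interactive
shorthand and falls back to JSON for everything else. The sketch below is
purely illustrative: the "start [N]" shorthand, the request keys, and the
default of 3 tservers are all hypothetical and do not come from the patch.

```python
import json


def parse_command(line):
    """Return a request dict from either a JSON object or a simple word command."""
    line = line.strip()
    if line.startswith("{"):
        # Anything structured goes through a real serialization format.
        return json.loads(line)
    words = line.split()
    # Hypothetical shorthand: "start [N]" means a plain cluster with N tservers.
    if words and words[0] == "start" and len(words) <= 2:
        count = int(words[1]) if len(words) == 2 else 3
        return {"action": "start", "num_tservers": count, "kerberos": False}
    raise ValueError(f"unsupported shorthand, use JSON instead: {line!r}")
```

A developer could type "start" by hand, while a test needing a custom realm
would send a JSON object; the shorthand deliberately refuses anything it cannot
express, rather than growing ad hoc flags.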



Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-08-28 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..


Patch Set 2:

(1 comment)

I think it's a great idea.  What about a more machine-oriented interface for 
the CLI tool?  Do you expect it to be transformed into something JSON-like in 
the near future?

Maybe it's worth running a proxy alongside the minicluster and providing 
something like a REST interface instead of a CLI for the tests?

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 20: authz
nit here and below: I think it should be 'authn' -- the Kerberos-related 
activity is about authentication, not authorization (and Sentry integration 
is supposed to take care of the fine-grained authorization in the future).



Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-08-28 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 47: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
general approach seems reasonable to me.

The only concern/question I have is about the formatting and compatibility 
requirements of the CLI requests/responses. Using something like protobuf or 
JSON would make it easier to ensure that the commands are extensible, 
potentially self-documenting, and machine parseable. On the other hand, 
protobuf at least would require more dependencies to be used in the embedding 
languages.

Given we have protobuf to/from-JSON support, maybe we could use protobuf to 
"define" the API, but use JSON to serialize it? Then we don't need to worry 
about parsing and generation and random things like "what if the minicluster 
base dir path has a space in it", etc.
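
The "space in the base dir path" concern can be illustrated with a short
sketch. It compares a deliberately naive whitespace-splitting parser (an
assumption made for contrast, not the parser in the patch) against a JSON
round trip; the command words, the key names, and the path are hypothetical.

```python
import json

base_dir = "/tmp/mini cluster/run 1"  # hypothetical path containing spaces

# Word-based framing with naive splitting: the path falls apart into pieces.
word_line = "cluster create " + base_dir
word_args = word_line.split()[2:]

# JSON framing: the path travels as one string, whatever it contains.
json_line = json.dumps({"cluster": {"create": {"base_dir": base_dir}}})
decoded_dir = json.loads(json_line)["cluster"]["create"]["base_dir"]

# JSON also escapes newlines, so a line-oriented channel stays intact.
multiline = json.dumps({"note": "first\nsecond"})
```

Here word_args ends up as three separate tokens, decoded_dir is identical to
base_dir, and multiline contains no literal newline character.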



Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-08-28 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7853

to look at the new patch set (#2).

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

WIP: use C++ ExternalMiniCluster for Java and Python tests

Maintaining Kudu clients across various languages has been an ongoing
maintenance burden. Even when the client is just a thin wrapper around
another client (e.g. Kudu Python bindings), a great deal of work goes into
client testability. In practice, this has meant a bespoke mini cluster
implementation for each language. On the surface this doesn't seem that bad;
we just need to spawn some masters and tservers, right? Well, the work
quickly adds up:

o While the C++ mini cluster is heavily used and has seen many improvements,
  the Java mini cluster has not received the same kind of love, and is less
  robust as a result. KUDU-1976 is a great example of this deficiency.
o With the inclusion of authz came the addition of a "mini KDC", a special
  daemon for Kerberized mini clusters. It was originally implemented in C++
  and ported to Java, but has yet to be ported to the Python client; this is
  one of the obstacles towards porting full authz support to Python.
o Dan has been prototyping Hive Metastore and Sentry integration for Kudu,
  the testing of which will require "mini HMS" and possibly "mini Sentry"
  testing implementations in C++, Java, and eventually, Python.

In sum, good support for non-C++ mini clusters is an ongoing commitment and
requires a great deal of work. This work hasn't always been forthcoming, and
the non-C++ clusters are deficient as a result. But it doesn't have to be
this way! Here's a thought: what if we reused the C++ mini cluster for tests
written in these other languages? We could write a "proxy" application whose
job it is to manage the C++ mini cluster and expose a rudimentary API that's
easily programmable from Java and Python.

This patch attempts to do just that. It adds a "run_cluster" mode to the
Kudu CLI. When invoked, it spawns an ExternalMiniCluster and provides a
simple, machine-readable shell over stdin/stdout. The shell responds to
commands by manipulating the cluster and its daemons, and kills them when
the shell client disconnects. As a proof of concept, the patch also replaces
the bespoke Java mini cluster with callouts to the new shell.

I should add that I like the idea of shipping "run_cluster" into production
as part of the CLI, as it helps realize the vision of a single Kudu artifact
that can provide Kudu testability for any integrating product.

WIP because, well, it should be pretty obvious. I was able to get through a
full run of "mvn verify" locally, so I have confidence that this can work.
But I'd like to solicit feedback on the general approach before spending
more time applying spit and polish.

Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
---
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
D java/kudu-client/src/test/java/org/apache/kudu/client/MiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
D java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMultipleLeaderFailover.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
D java/kudu-client/src/test/resources/flags
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/tool_action.h
A src/kudu/tools/tool_action_test.cc
M src/kudu/tools/tool_main.cc
15 files changed, 532 insertions(+), 1,211 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/7853/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests

2017-08-28 Thread Adar Dembo (Code Review)
Hello Dan Burkert, Todd Lipcon,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7853

to review the following change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

WIP: use C++ ExternalMiniCluster for Java and Python tests

Maintaining Kudu clients across various languages has been an ongoing
maintenance burden. Even when the client is just a thin wrapper around
another client (e.g. Kudu Python bindings), a great deal of work goes into
client testability. In practice, this has meant a bespoke mini cluster
implementation for each language. On the surface this doesn't seem that bad;
we just need to spawn some masters and tservers, right? Well, the work
quickly adds up:

o While the C++ mini cluster is heavily used and has seen many improvements,
  the Java mini cluster has not received the same kind of love, and is less
  robust as a result. KUDU-1976 is a great example of this deficiency.
o With the inclusion of authz came the addition of a "mini KDC", a special
  daemon for Kerberized mini clusters. It was originally implemented in C++
  and ported to Java, but has yet to be ported to the Python client; this is
  one of the obstacles towards porting full authz support to Python.
o Dan has been prototyping Hive Metastore and Sentry integration for Kudu,
  the testing of which will require "mini HMS" and possibly "mini Sentry"
  testing implementations in C++, Java, and eventually, Python.

In sum, good support for non-C++ mini clusters is an ongoing commitment and
requires a great deal of work. This work hasn't always been forthcoming, and
the non-C++ clusters are deficient as a result. But it doesn't have to be
this way! Here's a thought: what if we reused the C++ mini cluster for tests
written in these other languages? We could write a "proxy" application whose
job it is to manage the C++ mini cluster and expose a rudimentary API that's
easily programmable from Java and Python.

This patch attempts to do just that. It adds a "run_cluster" mode to the
Kudu CLI. When invoked, it spawns an ExternalMiniCluster and provides a
simple, machine-readable shell over stdin/stdout. The shell responds to
commands by manipulating the cluster and its daemons, and kills them when
the shell client disconnects. As a proof of concept, the patch also replaces
the bespoke Java mini cluster with callouts to the new shell.

I should add that I like the idea of shipping "run_cluster" into production
as part of the CLI, as it helps realize the vision of a single Kudu artifact
that can provide Kudu testability for any integrating product.

WIP because, well, it should be pretty obvious. I was able to get through a
full run of "mvn verify" locally, so I have confidence that this can work.
But I'd like to solicit feedback on the general approach before spending
more time applying spit and polish.

Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
---
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
D java/kudu-client/src/test/java/org/apache/kudu/client/MiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
D java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMultipleLeaderFailover.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
D java/kudu-client/src/test/resources/flags
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/tool_action.h
A src/kudu/tools/tool_action_test.cc
M src/kudu/tools/tool_main.cc
14 files changed, 532 insertions(+), 1,210 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/7853/1

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon