[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Hello Tidy Bot, Alexey Serbin, Dan Burkert, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7853

to look at the new patch set (#4).

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

WIP: use C++ ExternalMiniCluster for Java and Python tests

Maintaining Kudu clients across various languages has been an ongoing maintenance burden. Even when the client is just a thin wrapper around another client (e.g. Kudu Python bindings), a great deal of work goes into client testability. In practice, this has meant a bespoke mini cluster implementation for each language.

On the surface this doesn't seem that bad; we just need to spawn some masters and tservers, right? Well, the work quickly adds up:

o While the C++ mini cluster is heavily used and has seen many improvements, the Java mini cluster has not received the same kind of love, and is less robust as a result. KUDU-1976 is a great example of this deficiency.

o With the inclusion of authn came the addition of a "mini KDC", a special daemon for Kerberized mini clusters. It was originally implemented in C++ and ported to Java, but has yet to be ported to the Python client; this is one of the obstacles towards porting full authn support to Python.

o Dan has been prototyping Hive Metastore and Sentry integration for Kudu, the testing of which will require "mini HMS" and possibly "mini Sentry" testing implementations in C++, Java, and eventually, Python.

In sum, good support for non-C++ mini clusters is an ongoing commitment and requires a great deal of work. This work hasn't always been forthcoming, and the non-C++ clusters are deficient as a result.

But it doesn't have to be this way! Here's a thought: what if we reused the C++ mini cluster for tests written in these other languages? We could write a "proxy" application whose job it is to manage the C++ mini cluster and expose a rudimentary API that's easily programmable from Java and Python.

This patch attempts to do just that. It adds a "shell" mode to the Kudu CLI which provides a rudimentary control shell that can be used to spin up an ExternalMiniCluster. The shell is controlled via a wire protocol over stdin/stdout. The first cut of the protocol was sh-like, with simple word-based commands. It was then revised into a PB-based protocol (with optional JSON encoding) based on feedback.

As a proof of concept, the patch also replaces the bespoke Java mini cluster with callouts to the new shell.

I should add that I like the idea of shipping "shell" into production as part of the CLI, as it helps realize the vision of a single Kudu artifact that can provide Kudu testability for any integrating product.

WIP because, well, it should be pretty obvious. I was able to get through a full run of "mvn verify" locally, so I have confidence that this can work. But I'd like to solicit feedback on the general approach before spending more time applying spit and polish.

Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
---
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
D java/kudu-client/src/test/java/org/apache/kudu/client/MiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
D java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMultipleLeaderFailover.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
D java/kudu-client/src/test/resources/flags
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/tool.proto
M src/kudu/tools/tool_action.h
A src/kudu/tools/tool_action_test.cc
M src/kudu/tools/tool_main.cc
17 files changed, 919 insertions(+), 1,256 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/7853/4
--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-Change-Number: 7853
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
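
To make the stdin/stdout control channel in the commit message above more concrete, here is a minimal sketch of how a test harness might drive such a shell from Python. Everything specific in it is an assumption made for illustration: the one-JSON-object-per-line framing, the `command` and `base_dir` field names, and the `ShellClient` class are hypothetical, and a tiny stand-in child process (`FAKE_SHELL`) takes the place of the real Kudu CLI, whose actual message schema lives in the patch's tool.proto.

```python
import json
import subprocess
import sys

# Stand-in for the real control shell (hypothetical): reads one JSON request
# per line on stdin and writes one JSON response per line on stdout.
FAKE_SHELL = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"ok": True, "echo": req.get("command")}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
"""


class ShellClient:
    """Drives a child process over its stdin/stdout pipes."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on our side of the pipes
        )

    def call(self, request):
        # json.dumps escapes embedded newlines, so one request is one line.
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        line = self.proc.stdout.readline()
        if not line:
            raise RuntimeError("control shell exited unexpectedly")
        return json.loads(line)

    def close(self):
        # Closing stdin is the "disconnect" signal: the child sees EOF and exits.
        self.proc.stdin.close()
        code = self.proc.wait()
        self.proc.stdout.close()
        return code


if __name__ == "__main__":
    client = ShellClient([sys.executable, "-c", FAKE_SHELL])
    print(client.call({"command": "create_cluster", "base_dir": "/tmp/mini cluster"}))
    print("exit code:", client.close())
```

Because the only channel is a pair of pipes owned by the test process, no port has to be chosen or advertised, and the child notices the end of the test simply by reading EOF on stdin.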
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/7853 )

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG@47
PS2, Line 47:
: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on
> I'm on board with protobuf for schema / json as encoded format as well.

OK, PS3 changes the interface to use a protobuf-based schema with an optional JSON serialization. The former provides more type safety for clients who are willing to eat the protobuf dependency and the latter for those who aren't.

--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-Change-Number: 7853
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Wed, 27 Sep 2017 02:13:20 +
Gerrit-HasComments: Yes
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Hello Tidy Bot, Alexey Serbin, Dan Burkert, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7853

to look at the new patch set (#3).

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

WIP: use C++ ExternalMiniCluster for Java and Python tests

Maintaining Kudu clients across various languages has been an ongoing maintenance burden. Even when the client is just a thin wrapper around another client (e.g. Kudu Python bindings), a great deal of work goes into client testability. In practice, this has meant a bespoke mini cluster implementation for each language.

On the surface this doesn't seem that bad; we just need to spawn some masters and tservers, right? Well, the work quickly adds up:

o While the C++ mini cluster is heavily used and has seen many improvements, the Java mini cluster has not received the same kind of love, and is less robust as a result. KUDU-1976 is a great example of this deficiency.

o With the inclusion of authz came the addition of a "mini KDC", a special daemon for Kerberized mini clusters. It was originally implemented in C++ and ported to Java, but has yet to be ported to the Python client; this is one of the obstacles towards porting full authz support to Python.

o Dan has been prototyping Hive Metastore and Sentry integration for Kudu, the testing of which will require "mini HMS" and possibly "mini Sentry" testing implementations in C++, Java, and eventually, Python.

In sum, good support for non-C++ mini clusters is an ongoing commitment and requires a great deal of work. This work hasn't always been forthcoming, and the non-C++ clusters are deficient as a result.

But it doesn't have to be this way! Here's a thought: what if we reused the C++ mini cluster for tests written in these other languages? We could write a "proxy" application whose job it is to manage the C++ mini cluster and expose a rudimentary API that's easily programmable from Java and Python.

This patch attempts to do just that. It adds a "shell" mode to the Kudu CLI which provides a rudimentary control shell that can be used to spin up an ExternalMiniCluster. The shell is controlled via a wire protocol over stdin/stdout. The first cut of the protocol was sh-like, with simple word-based commands. It was then revised into a PB-based protocol (with optional JSON encoding) based on feedback.

As a proof of concept, the patch also replaces the bespoke Java mini cluster with callouts to the new shell.

I should add that I like the idea of shipping "shell" into production as part of the CLI, as it helps realize the vision of a single Kudu artifact that can provide Kudu testability for any integrating product.

WIP because, well, it should be pretty obvious. I was able to get through a full run of "mvn verify" locally, so I have confidence that this can work. But I'd like to solicit feedback on the general approach before spending more time applying spit and polish.

Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
---
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
D java/kudu-client/src/test/java/org/apache/kudu/client/MiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
D java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMultipleLeaderFailover.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
D java/kudu-client/src/test/resources/flags
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/tool.proto
M src/kudu/tools/tool_action.h
A src/kudu/tools/tool_action_test.cc
M src/kudu/tools/tool_main.cc
17 files changed, 919 insertions(+), 1,256 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/7853/3
--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-Change-Number: 7853
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Dan Burkert has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 36: run_cluster
bikesheddy, but I'd prefer 'kudu mini_cluster'. Mini cluster is a pretty well established term of art.

PS2, Line 47:
: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
> Would it be possible to use JSON format as an internal format of the tool,

I'm on board with protobuf for schema / json as encoded format as well.

http://gerrit.cloudera.org:8080/#/c/7853/2/java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
File java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java:

Line 83: if (responses.size() == 0) {
Is this possible? I would expect not due to the do part of do/while.

--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-HasComments: Yes
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Alexey Serbin has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 47:
: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
> I went back and forth on this.

Would it be possible to use JSON format as an internal format of the tool, while translating the command-line arguments into a JSON document which would be consumed internally? Basically, it's about a shim layer which translates the command-line arguments into a JSON representation. Vice versa: it could be the same for the output.

As for the output, there are libraries like libxo to handle that in a uniform way: https://github.com/Juniper/libxo

--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-HasComments: Yes
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Adar Dembo has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

Patch Set 2:

(2 comments)

> I think it's a great idea. What about a more machine-oriented
> interface for the CLI tool? Do you expect it to be transformed
> into something JSON-like in the near future?

See the comment Todd left; we're discussing just that.

> Maybe it's worth running a proxy along with the
> minicluster and providing something like a REST interface instead of
> the CLI for the tests?

I didn't implement a TCP-based connection between the proxy and the tests precisely so that the entire "port already in use" class of issues can be avoided. When communication is over a TCP socket, we either need to use a well-known port, which is prone to conflicts, or an ephemeral port, whose number needs to be communicated back to the tests. If we've already got a channel for communicating the port number (probably stdout), we can use that channel for control too.

I think a socket-based approach would make more sense if there were multiple consumers of a single mini cluster, but that's just not the case with our tests.

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 20: authz
> nit here and below: I think it should be 'authn' -- the kerberos-related ac

You are right, my bad.

PS2, Line 47:
: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
> general approach seems reasonable to me.

I went back and forth on this. JSON (or protobuf, or thrift, or or or...) would certainly make the RPC system far more robust and maintainable. But we'd lose the ability to actually use run_cluster interactively from the command line, because no one wants to write JSON or whatever by hand.

Right now the noun/verb word-based RPC allows for interactivity and that's how I did much of the early testing. It's not robust, but it won't be broken by simple things like e.g. a minicluster dir with a space (it would be broken by unexpected newlines though).

Do you think the pros of using a real serialization format outweigh the loss of interactivity?

--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-HasComments: Yes
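
The port argument in the reply above can be illustrated with a short Python sketch. It is only an illustration of ephemeral ports in general, not code from the patch: the helper name `bind_ephemeral` is hypothetical. Binding to port 0 lets the OS pick a free port, which avoids "port already in use" conflicts, but the chosen number is only known to the server after the bind and so has to be reported to the client over some other channel, such as stdout.

```python
import socket


def bind_ephemeral():
    """Bind a listening TCP socket to an OS-assigned (ephemeral) port.

    Returns the socket and the port number the OS picked.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))  # port 0 means "any free port"
    sock.listen(1)
    port = sock.getsockname()[1]
    return sock, port


if __name__ == "__main__":
    server, port = bind_ephemeral()
    # The server would now have to tell its parent which port it got,
    # for example by printing it on stdout. At that point stdout is
    # already a working control channel.
    print(port)
    server.close()
```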
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Todd Lipcon has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 47:
: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.
> I went back and forth on this.

I'm just worried that, as we add features, the parsing is going to get more complicated than just noun/verb. eg what if I want to run a kerberized minicluster, but with a custom realm, or with two KDCs set up for cross-realm trust in the future, etc?

Maybe we could have the most simple "start a non-kerberized cluster with 3 servers" be done with interactive use, since a developer might want to use this for local playing around, but then make anything more complicated (eg custom flags, more exotic configs) require JSON?

--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-HasComments: Yes
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Alexey Serbin has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

Patch Set 2:

(1 comment)

I think it's a great idea. What about a more machine-oriented interface for the CLI tool? Do you expect it to be transformed into something JSON-like in the near future?

Maybe it's worth running a proxy along with the minicluster and providing something like a REST interface instead of the CLI for the tests?

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 20: authz
nit here and below: I think it should be 'authn' -- the kerberos-related activity is related to authentication, not authorization (and Sentry integration is supposed to take care of the fine-grained authorization in the future)

--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-HasComments: Yes
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Todd Lipcon has posted comments on this change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7853/2//COMMIT_MSG
Commit Message:

PS2, Line 47:
: WIP because, well, it should be pretty obvious. I was able to get through a
: full run of "mvn verify" locally, so I have confidence that this can work.
: But I'd like to solicit feedback on the general approach before spending
: more time applying spit and polish.

general approach seems reasonable to me. The only concern/question I have is about the formatting and compatibility requirements of the CLI requests/responses. Using something like protobuf or JSON would make it easier to ensure that the commands are extensible, potentially self-documenting, and machine parseable. On the other hand, protobuf at least would require more dependencies to be used in the embedding languages.

Given we have protobuf to/from-JSON support, maybe we could use protobuf to "define" the API, but use JSON to serialize it? Then we don't need to worry about parsing and generation and random things like "what if the minicluster base dir path has a space in it", etc.

--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-HasComments: Yes
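
The "base dir path has a space in it" concern raised above can be shown with a small Python sketch. The parser here is a deliberately naive whitespace splitter written for illustration; it is not the parser from the patch, and the command words and the `create_cluster` / `base_dir` field names are hypothetical. The point is only that a structured encoding such as JSON carries field boundaries itself, so the receiver never has to guess them.

```python
import json


def parse_words(line):
    """Naive word-based parser (hypothetical): split a command on whitespace."""
    return line.split()


base_dir = "/tmp/mini cluster"  # note the space in the path

# Word-based request: the path is indistinguishable from two arguments.
word_request = "cluster create " + base_dir
words = parse_words(word_request)

# JSON request: the path is one string value, whatever it contains.
json_request = json.dumps({"create_cluster": {"base_dir": base_dir}})
decoded = json.loads(json_request)

if __name__ == "__main__":
    print(words)
    print(decoded["create_cluster"]["base_dir"])
```

A word-based protocol can of course be made space-safe with quoting or by treating the rest of the line as one argument, but each such rule is extra parsing logic that every client language would have to reimplement.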
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7853

to look at the new patch set (#2).

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

WIP: use C++ ExternalMiniCluster for Java and Python tests

Maintaining Kudu clients across various languages has been an ongoing maintenance burden. Even when the client is just a thin wrapper around another client (e.g. Kudu Python bindings), a great deal of work goes into client testability. In practice, this has meant a bespoke mini cluster implementation for each language.

On the surface this doesn't seem that bad; we just need to spawn some masters and tservers, right? Well, the work quickly adds up:

o While the C++ mini cluster is heavily used and has seen many improvements, the Java mini cluster has not received the same kind of love, and is less robust as a result. KUDU-1976 is a great example of this deficiency.

o With the inclusion of authz came the addition of a "mini KDC", a special daemon for Kerberized mini clusters. It was originally implemented in C++ and ported to Java, but has yet to be ported to the Python client; this is one of the obstacles towards porting full authz support to Python.

o Dan has been prototyping Hive Metastore and Sentry integration for Kudu, the testing of which will require "mini HMS" and possibly "mini Sentry" testing implementations in C++, Java, and eventually, Python.

In sum, good support for non-C++ mini clusters is an ongoing commitment and requires a great deal of work. This work hasn't always been forthcoming, and the non-C++ clusters are deficient as a result.

But it doesn't have to be this way! Here's a thought: what if we reused the C++ mini cluster for tests written in these other languages? We could write a "proxy" application whose job it is to manage the C++ mini cluster and expose a rudimentary API that's easily programmable from Java and Python.

This patch attempts to do just that. It adds a "run_cluster" mode to the Kudu CLI. When invoked, it spawns an ExternalMiniCluster and provides a simple, machine-readable shell over stdin/stdout. The shell responds to commands by manipulating the cluster and its daemons, and kills them when the shell client disconnects.

As a proof of concept, the patch also replaces the bespoke Java mini cluster with callouts to the new shell.

I should add that I like the idea of shipping "run_cluster" into production as part of the CLI, as it helps realize the vision of a single Kudu artifact that can provide Kudu testability for any integrating product.

WIP because, well, it should be pretty obvious. I was able to get through a full run of "mvn verify" locally, so I have confidence that this can work. But I'd like to solicit feedback on the general approach before spending more time applying spit and polish.

Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
---
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
D java/kudu-client/src/test/java/org/apache/kudu/client/MiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
D java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMultipleLeaderFailover.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
D java/kudu-client/src/test/resources/flags
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/tool_action.h
A src/kudu/tools/tool_action_test.cc
M src/kudu/tools/tool_main.cc
15 files changed, 532 insertions(+), 1,211 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/7853/2
--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] WIP: use C++ ExternalMiniCluster for Java and Python tests
Hello Dan Burkert, Todd Lipcon,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/7853

to review the following change.

Change subject: WIP: use C++ ExternalMiniCluster for Java and Python tests
..

WIP: use C++ ExternalMiniCluster for Java and Python tests

Maintaining Kudu clients across various languages has been an ongoing maintenance burden. Even when the client is just a thin wrapper around another client (e.g. Kudu Python bindings), a great deal of work goes into client testability. In practice, this has meant a bespoke mini cluster implementation for each language.

On the surface this doesn't seem that bad; we just need to spawn some masters and tservers, right? Well, the work quickly adds up:

o While the C++ mini cluster is heavily used and has seen many improvements, the Java mini cluster has not received the same kind of love, and is less robust as a result. KUDU-1976 is a great example of this deficiency.

o With the inclusion of authz came the addition of a "mini KDC", a special daemon for Kerberized mini clusters. It was originally implemented in C++ and ported to Java, but has yet to be ported to the Python client; this is one of the obstacles towards porting full authz support to Python.

o Dan has been prototyping Hive Metastore and Sentry integration for Kudu, the testing of which will require "mini HMS" and possibly "mini Sentry" testing implementations in C++, Java, and eventually, Python.

In sum, good support for non-C++ mini clusters is an ongoing commitment and requires a great deal of work. This work hasn't always been forthcoming, and the non-C++ clusters are deficient as a result.

But it doesn't have to be this way! Here's a thought: what if we reused the C++ mini cluster for tests written in these other languages? We could write a "proxy" application whose job it is to manage the C++ mini cluster and expose a rudimentary API that's easily programmable from Java and Python.

This patch attempts to do just that. It adds a "run_cluster" mode to the Kudu CLI. When invoked, it spawns an ExternalMiniCluster and provides a simple, machine-readable shell over stdin/stdout. The shell responds to commands by manipulating the cluster and its daemons, and kills them when the shell client disconnects.

As a proof of concept, the patch also replaces the bespoke Java mini cluster with callouts to the new shell.

I should add that I like the idea of shipping "run_cluster" into production as part of the CLI, as it helps realize the vision of a single Kudu artifact that can provide Kudu testability for any integrating product.

WIP because, well, it should be pretty obvious. I was able to get through a full run of "mvn verify" locally, so I have confidence that this can work. But I'd like to solicit feedback on the general approach before spending more time applying spit and polish.

Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
---
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
D java/kudu-client/src/test/java/org/apache/kudu/client/MiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestClientFailoverSupport.java
D java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKdc.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMultipleLeaderFailover.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
D java/kudu-client/src/test/resources/flags
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tools/CMakeLists.txt
M src/kudu/tools/tool_action.h
A src/kudu/tools/tool_action_test.cc
M src/kudu/tools/tool_main.cc
14 files changed, 532 insertions(+), 1,210 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/7853/1
--
To view, visit http://gerrit.cloudera.org:8080/7853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0e693921ef780dc4a06e536c6b7408f7f0b252f6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Todd Lipcon