> What do you mean by that?

Sorry, poor phrasing. Currently the Beam project has the build path with unit tests (no Docker there) and the project IT environment, which can use Docker. A binary-only approach could potentially be managed without adding a dependency on Docker, but it has other issues, summarised below.

> For Kudu-internal testing I think we could stick to running "kudu minicluster"

Yes.

> ... external use cases, we could switch that to "docker run kudu:minicluster:1.7.0"

I think this makes good sense.

In summary:

1) Fake a Kudu master in Java - difficult unless simplified, not representative if simplified, code maintenance issue
2) Mocking the Kudu client - verbose unless only covering simple scenarios
3) Use mini cluster with binaries - portability challenge of binaries, need to script caching the binaries / use of some repository, unfamiliar build tasks with binary handling (unless built to work with something like Maven), could possibly see linking problems
4) Docker - predictable, adds a dependency, existing Kudu images not "managed" at the moment

For Beam I think I will put most effort into IT, which can use Docker or an existing cluster, and then mock a Java KuduClient for some basic sanity tests for the build path.

On Docker:
- To get current versions [e.g. 1] working I found I had to edit /etc/hosts. I think the mini cluster version with the FakeDNS might avoid that?
- Kudu docs currently encourage the Cloudera Quickstart VM over Docker [2,3]

Do you think the Kudu project could provide an image allowing "docker run kudu:minicluster:1.x.x" as part of the release cycle?

Thanks again,
Tim

[1] https://github.com/MartinWeindel/kudu-docker
[2] https://kudu.apache.org/docs/quickstart.html#quickstart_vm
[3] https://github.com/cloudera/kudu-examples/wiki/Docker-based-tutorial

On Sat, Jun 30, 2018 at 2:22 AM, Todd Lipcon <t...@cloudera.com.invalid> wrote:

> On Fri, Jun 29, 2018 at 1:23 PM, Tim Robertson <timrobertson...@gmail.com> wrote:
>
> > Thanks Mike, Todd - I greatly appreciate the inputs.
> >
> > > How many platforms would need to be supported for it to be viable for Beam?
> >
> > The minimal for it to be considered would probably(!) be ubuntu, centos, osx. Incidentally it was actually the protobuf approach that made me consider this.
> >
> > > What about depending on a docker container than runs the kudu minicluster in "host" networking mode?
> >
> > I've also pondered this a little but like Attila raises it puts a lot of burden for other project developers. Mmmm...
>
> What do you mean by that? For Kudu-internal testing I think we could stick to running "kudu minicluster" as is. For external use cases, we could switch that to "docker run kudu:minicluster:1.7.0" or whatever, and it would auto-download from dockerhub as necessary, right?
>
> > Ismaël (Beam PMC) has suggested I stick to mocking given the complexity of the things I'm exploring.
> >
> > As another idea:
> > I briefly pondered writing a "FakeKudu Java server" - data held in memory, no partitioning, protobuf messaging, handling table metadata, checking schemas on write, predicate and projected columns for scan, faking kerberos (if possible). It didn't seem particularly difficult to do but I fear a maintenance burden for a small audience.
>
> Yea, I think that would be quite a maintenance burden, especially as new features are added over time. I suppose in many cases you could omit things or stub things out, but then the behavior will begin to differ and it won't really be that clear that your tests actually are representative.
>
> > Could utilities in Kudu that help folk test Java clients be of interest to others? - e.g. preconfigured mock objects for various scenarios. If so, I'd be happy to discuss options and offer PRs in Kudu.
> >
> > Thanks,
> > Tim
> >
> > On Fri, Jun 29, 2018 at 9:34 PM, Todd Lipcon <t...@cloudera.com.invalid> wrote:
> >
> > > On Fri, Jun 29, 2018 at 12:31 PM, Mike Percy <mpe...@apache.org> wrote:
> > >
> > > > This is something I've been thinking about and toying with and I'd like to see if we can't get binaries available via Maven for at least one platform (say, RHEL 7). Similar to how protobuf does it.
> > >
> > > What about depending on a docker container than runs the kudu minicluster in "host" networking mode? eg https://github.com/MartinWeindel/kudu-docker is one possibility
> > >
> > > > How many platforms would need to be supported for it to be viable for Beam?
> > > >
> > > > Thanks,
> > > > Mike
> > > >
> > > > On Fri, Jun 29, 2018 at 10:01 AM Tim <timrobertson...@gmail.com> wrote:
> > > >
> > > > > Thanks Attila
> > > > >
> > > > > That’s great feedback and helpful for me to reference as guidance.
> > > > >
> > > > > By “Kudu installation” I was referring to the possibility that an install might set config etc, beyond just having the binary. I got it running on CentOS similar to how you outline now.
> > > > >
> > > > > I too believe mocking makes most sense, especially as we have the IT running as well, but was asked to explore this further. It’s useful to know you’d agree.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Tim
> > > > >
> > > > > > On 29 Jun 2018, at 17:33, Attila Bukor <abu...@cloudera.com> wrote:
> > > > > >
> > > > > > Hi Tim,
> > > > > >
> > > > > > I’m not sure what you mean by relying on actual installations.
> > > > > > If you have the kudu, kudu-master and kudu-tserver binaries at the same location and they can be executed, MiniKuduCluster can be used (“binDir” property should be set to the directory containing the Kudu binaries). You should also look into BaseKuduTest as that will set up the MiniKuduCluster for you and you don’t have to do it manually.
> > > > > >
> > > > > > Extracting the Kudu binaries from an rpm should probably work, but that binds you to CDH as currently Cloudera is the only one that ships Kudu binaries and MacOS builds are not available anywhere afaik. Also, 1.4.0 is around a year old, you might want to use this repository instead (from CDH 5.13 Kudu is part of the CDH):
> > > > > > http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5/RPMS/x86_64/kudu-1.7.0+cdh5.15.0+0-1.cdh5.15.0.p0.52.el7.x86_64.rpm
> > > > > >
> > > > > > As a general suggestion, I would recommend mocking Kudu for unit tests (that’s what a unit test is for after all) and create separate integration tests that actually use Kudu that can be skipped where Kudu is not available. Of course the CI should be set up to be able to provide all necessary integrations for the tests, but a developer wouldn’t have to set up Kudu, or use Docker to run the tests if their change doesn’t affect the Kudu integration.
> > > > > >
> > > > > > Attila
> > > > > >
> > > > > >> On 2018. Jun 29., at 16:42, Tim Robertson <timrobertson...@gmail.com> wrote:
> > > > > >>
> > > > > >> Hi folks,
> > > > > >>
> > > > > >> I've written Java KuduIO for Apache Beam with integration tests making use of Kudu in Docker. It is yet to be committed on Apache Beam.
> > > > > >>
> > > > > >> Rather than mocking Kudu client for unit tests I'd like to explore use of the MiniKuduCluster which "Depends on precompiled kudu, kudu-master, and kudu-tserver binaries".
> > > > > >>
> > > > > >> I'd need unit tests to run on the main linux distros and OS X.
> > > > > >>
> > > > > >> For the linux distros, would an approach where I extract the binaries from the packages [1] work please? Or does the MiniKuduCluster rely on actual installations? I am pretty weak on C builds and linked libraries etc (Java guy, sorry).
> > > > > >>
> > > > > >> For CentOS I'm exploring this for example:
> > > > > >> rpm2cpio ./kudu-1.4.0+cdh5.12.2+0-1.cdh5.12.2.p0.8.el7.x86_64.rpm | cpio -idmv
> > > > > >>
> > > > > >> I haven't explored OS X options yet.
> > > > > >>
> > > > > >> Any advice here would greatly be appreciated to save me going down a dead end.
> > > > > >>
> > > > > >> Many thanks,
> > > > > >> Tim
> > > > > >>
> > > > > >> [1] http://kudu.apache.org/docs/installation.html#install_packages
> > >
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
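
For anyone reading this thread later, here is a rough sketch of the split the thread converges on: an in-memory stand-in for the build path, plus integration tests that skip themselves when no Kudu is reachable. It is not Kudu or Beam code. `RowSink`, `InMemoryRowSink`, `KuduTestSupport` and the `kuduMasterAddresses` property name are all hypothetical names made up for illustration, and it uses only the JDK so that it does not guess at the real KuduClient API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical seam between pipeline code and Kudu; not a Kudu or Beam API.
interface RowSink {
  void write(Map<String, Object> row);
}

// In-memory stand-in for the build path, where no Kudu binaries or Docker are assumed.
final class InMemoryRowSink implements RowSink {
  private final List<Map<String, Object>> rows = new ArrayList<>();

  @Override
  public void write(Map<String, Object> row) {
    rows.add(row);
  }

  // Read-only view of everything written so far, for assertions in unit tests.
  List<Map<String, Object>> rows() {
    return Collections.unmodifiableList(rows);
  }
}

final class KuduTestSupport {
  // Hypothetical property name, chosen for this sketch only.
  static final String MASTERS_PROPERTY = "kuduMasterAddresses";

  private KuduTestSupport() {}

  // Returns the configured master addresses, or empty when the property is
  // missing or blank, so an integration test can skip itself instead of failing.
  static Optional<String> configuredMasters(Map<String, String> props) {
    String value = props.get(MASTERS_PROPERTY);
    if (value == null || value.trim().isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(value.trim());
  }

  // Code under test: writes every row to whichever sink it is given.
  static int writeAll(RowSink sink, List<Map<String, Object>> rows) {
    int written = 0;
    for (Map<String, Object> row : rows) {
      sink.write(row);
      written++;
    }
    return written;
  }
}
```

On the build path a test would pass an `InMemoryRowSink`; an IT would first call `configuredMasters` and skip (for example via a JUnit assumption) when it returns empty, which is one way to get the "can be skipped where Kudu is not available" behaviour Attila describes. A real sink backed by the Kudu client is deliberately left out of the sketch.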