> What do you mean by that?
Sorry, poor phrasing - currently the Beam project has the build path with
unit tests (no Docker there) and the project IT environment which can use
Docker.
A binary-only approach could potentially be managed without adding a
dependency on Docker, but it has other issues, summarised below.

> For Kudu-internal testing I think we could stick to running "kudu
> minicluster"
Yes.

> ... external use cases, we could switch that to "docker run
> kudu:minicluster:1.7.0"
I think this makes good sense.


In summary:

1) Fake a Kudu master in Java - difficult unless simplified, not
representative if simplified, code maintenance issue
2) Mocking the Kudu client - verbose unless only covering simple scenarios
3) Use mini cluster with binaries - portability challenge of binaries, need
to script caching the binaries / use of some repository, unfamiliar build
tasks with binary handling (unless built to work with something like
Maven), could possibly see linking problems
4) Docker - predictable, adds a dependency, existing Kudu images not
"managed" at the moment

For Beam I think I will put most effort into IT which can use Docker or an
existing cluster and then mock a Java KuduClient for some basic sanity
tests for the build path.
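
As a rough sketch of what I mean by the basic sanity tests on the build path:
the IO code would talk to a small interface rather than to KuduClient
directly, and the unit tests would swap in an in-memory stand-in. All names
below (RowSink, InMemoryRowSink, EventWriter) are hypothetical and for
illustration only; they are not part of Beam or the Kudu client, and a real
version would more likely mock the Kudu client classes themselves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical seam: the IO code depends on this, not on KuduClient directly.
interface RowSink {
    void write(String table, Map<String, Object> row);
}

// In-memory stand-in for build-path unit tests; needs no Kudu binaries or Docker.
final class InMemoryRowSink implements RowSink {
    private final Map<String, List<Map<String, Object>>> tables = new HashMap<>();

    @Override
    public void write(String table, Map<String, Object> row) {
        // Copy the row so later mutation by the caller cannot alter what was stored.
        tables.computeIfAbsent(table, t -> new ArrayList<>()).add(new HashMap<>(row));
    }

    // Rows written to the given table so far (empty if the table was never used).
    List<Map<String, Object>> rows(String table) {
        return tables.getOrDefault(table, new ArrayList<>());
    }
}

// Hypothetical code under test: skips rows without an "id" and forwards the rest.
final class EventWriter {
    private final RowSink sink;

    EventWriter(RowSink sink) {
        this.sink = sink;
    }

    // Returns how many rows were forwarded to the sink.
    int writeAll(String table, List<Map<String, Object>> rows) {
        int written = 0;
        for (Map<String, Object> row : rows) {
            if (row.get("id") == null) {
                continue; // assumed rule for this sketch: a row needs a key
            }
            sink.write(table, row);
            written++;
        }
        return written;
    }
}
```

The integration tests would then plug a real Kudu-backed implementation into
the same seam, so only the IT environment needs Docker or an existing cluster.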

On Docker:
- to get current versions [e.g. 1] working I found I had to edit
/etc/hosts. I think the mini cluster version with the FakeDNS might avoid
that?
- Kudu docs currently encourage the Cloudera Quickstart VM over Docker [2,3]

Do you think the Kudu project could provide an image allowing "docker run
kudu:minicluster:1.x.x" as part of the release cycle?

Thanks again,
Tim

[1] https://github.com/MartinWeindel/kudu-docker
[2] https://kudu.apache.org/docs/quickstart.html#quickstart_vm
[3] https://github.com/cloudera/kudu-examples/wiki/Docker-based-tutorial

On Sat, Jun 30, 2018 at 2:22 AM, Todd Lipcon <t...@cloudera.com.invalid>
wrote:

> On Fri, Jun 29, 2018 at 1:23 PM, Tim Robertson <timrobertson...@gmail.com>
> wrote:
>
> > Thanks Mike, Todd - I greatly appreciate the inputs.
> >
> > > How many platforms would need to be supported for it to be viable for
> > > Beam?
> > The minimum for it to be considered would probably(!) be ubuntu, centos,
> > osx. Incidentally it was actually the protobuf approach that made me
> > consider this.
> >
> > > What about depending on a docker container that runs the kudu
> > > minicluster in "host" networking mode?
> > I've also pondered this a little but like Attila raises it puts a lot of
> > burden for other project developers. Mmmm...
> >
>
> What do you mean by that? For Kudu-internal testing I think we could stick
> to running "kudu minicluster" as is. For external use cases, we could
> switch that to "docker run kudu:minicluster:1.7.0" or whatever, and it
> would auto-download from dockerhub as necessary, right?
>
>
> >
> > Ismaël (Beam PMC) has suggested I stick to mocking given the complexity
> > of the things I'm exploring.
> >
> > As another idea:
> > I briefly pondered writing a "FakeKudu Java server" - data held in
> > memory, no partitioning, protobuf messaging, handling table metadata,
> > checking schemas on write, predicate and projected columns for scan,
> > faking kerberos (if possible). It didn't seem particularly difficult to
> > do but I fear a maintenance burden for a small audience.
> >
> >
> Yea, I think that would be quite a maintenance burden, especially as new
> features are added over time. I suppose in many cases you could omit things
> or stub things out, but then the behavior will begin to differ and it won't
> really be that clear that your tests actually are representative.
>
>
> > Could utilities in Kudu that help folk test Java clients be of interest
> > to others? - e.g. preconfigured mock objects for various scenarios. If
> > so, I'd be happy to discuss options and offer PRs in Kudu.
> >
> > Thanks,
> > Tim
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Fri, Jun 29, 2018 at 9:34 PM, Todd Lipcon <t...@cloudera.com.invalid>
> > wrote:
> >
> > > On Fri, Jun 29, 2018 at 12:31 PM, Mike Percy <mpe...@apache.org>
> > > wrote:
> > >
> > > > This is something I've been thinking about and toying with and I'd
> > > > like to see if we can't get binaries available via Maven for at least
> > > > one platform (say, RHEL 7). Similar to how protobuf does it.
> > > >
> > >
> > > What about depending on a docker container that runs the kudu
> > > minicluster in "host" networking mode? eg
> > > https://github.com/MartinWeindel/kudu-docker is one possibility
> > >
> > >
> > > > How many platforms would need to be supported for it to be viable for
> > > > Beam?
> > > >
> > > > Thanks,
> > > > Mike
> > > >
> > > > On Fri, Jun 29, 2018 at 10:01 AM Tim <timrobertson...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks Attila
> > > > >
> > > > > That’s great feedback and helpful for me to reference as guidance.
> > > > >
> > > > > By “Kudu installation” I was referring to the possibility that an
> > > > > install might set config etc, beyond just having the binary. I got
> > > > > it running on CentOS similar to how you outline now.
> > > > >
> > > > > I too believe mocking makes most sense, especially as we have the
> > > > > IT running as well, but was asked to explore this further. It’s
> > > > > useful to know you’d agree.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Tim
> > > > >
> > > > > > On 29 Jun 2018, at 17:33, Attila Bukor <abu...@cloudera.com>
> > > > > > wrote:
> > > > > >
> > > > > > Hi Tim,
> > > > > >
> > > > > > > I’m not sure what you mean by relying on actual installations.
> > > > > > > If you have the kudu, kudu-master and kudu-tserver binaries at
> > > > > > > the same location and they can be executed, MiniKuduCluster can
> > > > > > > be used (“binDir” property should be set to the directory
> > > > > > > containing the Kudu binaries). You should also look into
> > > > > > > BaseKuduTest as that will set up the MiniKuduCluster for you
> > > > > > > and you don’t have to do it manually.
> > > > > >
> > > > > > > Extracting the Kudu binaries from an rpm should probably work,
> > > > > > > but that binds you to CDH as currently Cloudera is the only one
> > > > > > > that ships Kudu binaries and MacOS builds are not available
> > > > > > > anywhere afaik. Also, 1.4.0 is around a year old, you might
> > > > > > > want to use this repository instead (from CDH 5.13 Kudu is part
> > > > > > > of the CDH):
> > > > > > > http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5/RPMS/x86_64/kudu-1.7.0+cdh5.15.0+0-1.cdh5.15.0.p0.52.el7.x86_64.rpm
> > > > > >
> > > > > > > As a general suggestion, I would recommend mocking Kudu for
> > > > > > > unit tests (that’s what a unit test is for after all) and
> > > > > > > creating separate integration tests that actually use Kudu
> > > > > > > that can be skipped where Kudu is not available. Of course the
> > > > > > > CI should be set up to be able to provide all necessary
> > > > > > > integrations for the tests, but a developer wouldn’t have to
> > > > > > > set up Kudu, or use Docker to run the tests if their change
> > > > > > > doesn’t affect the Kudu integration.
> > > > > >
> > > > > > Attila
> > > > > >
> > > > > > >> On 2018. Jun 29., at 16:42, Tim Robertson
> > > > > > >> <timrobertson...@gmail.com> wrote:
> > > > > >>
> > > > > >> Hi folks,
> > > > > >>
> > > > > > >> I've written Java KuduIO for Apache Beam with integration tests
> > > > > > >> making use of Kudu in Docker.  It is yet to be committed on
> > > > > > >> Apache Beam.
> > > > > >>
> > > > > > >> Rather than mocking Kudu client for unit tests I'd like to
> > > > > > >> explore use of the MiniKuduCluster which "Depends on
> > > > > > >> precompiled kudu, kudu-master, and kudu-tserver binaries".
> > > > > >>
> > > > > >> I'd need unit tests to run on the main linux distros and OS X.
> > > > > >>
> > > > > > >> For the linux distros, would an approach where I extract the
> > > > > > >> binaries from the packages [1] work please? Or does the
> > > > > > >> MiniKuduCluster rely on actual installations? I am pretty weak
> > > > > > >> on C builds and linked libraries etc (Java guy, sorry).
> > > > > >>
> > > > > >> For CentOS I'm exploring this for example:
> > > > > > >>  rpm2cpio ./kudu-1.4.0+cdh5.12.2+0-1.cdh5.12.2.p0.8.el7.x86_64.rpm | cpio -idmv
> > > > > >>
> > > > > >> I haven't explored OS X options yet.
> > > > > >>
> > > > > > >> Any advice here would greatly be appreciated to save me going
> > > > > > >> down a dead end.
> > > > > >>
> > > > > >> Many thanks,
> > > > > >> Tim
> > > > > >>
> > > > > >>
> > > > > > >> [1] http://kudu.apache.org/docs/installation.html#install_packages
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
> > >
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
