Yeah, I was thinking more along the lines of your second paragraph. I would use the proposed client jar to develop against the MiniAccumuloCluster (typically the StandaloneMiniAccumuloCluster for me) and then deploy that code to run against a real cluster. I would like to flesh that use case out a little more. Do you think it has to be another jar on top of the client jar?

On Fri, Jan 5, 2018 at 4:31 PM Josh Elser <josh.el...@gmail.com> wrote:

> MAC, in its common state, is probably not something we'd want to include
> in this proposed tarball. The reasoning being that MAC (and related
> classes) aren't something that people would need on your "Hadoop
> Cluster" to talk to Accumulo. It's something that can just be obtained
> via Maven.
>
> However, if you're more referring to MAC as the generic
> "AccumuloCluster" interface (an attempt to make running tests against
> MAC and a real Accumulo cluster transparent --
> StandaloneAccumuloCluster), then I could see some JAR that we'd include
> which would contain the necessary classes (on top of
> accumulo-client.jar) for users to run code seamlessly against a
> traditional MAC or the StandaloneAccumuloCluster.
>
> On 1/5/18 4:22 PM, Michael Wall wrote:
> > I like the idea of a client jar that has fewer dependencies. Josh, where
> > are you thinking the MiniAccumuloCluster fits in here?
> >
> > On Fri, Jan 5, 2018 at 3:57 PM Christopher <ctubb...@apache.org> wrote:
> >
> >> On Fri, Jan 5, 2018 at 10:30 AM Keith Turner <ke...@deenlo.com> wrote:
> >>
> >>> On Thu, Jan 4, 2018 at 7:43 PM, Christopher <ctubb...@apache.org> wrote:
> >>>> tl;dr : I would prefer not to add another tarball as part of our "official"
> >>>
> >>> I am not opposed to replacing the current single tarball with client
> >>> and server tarballs. What I find appealing about this is if the
> >>> client tarball has fewer deps.
> >>>
> >>> However I think a lot of thought should be put into the scripts if
> >>> this is done. For example the client tar and server tar should
> >>> probably not both have accumulo commands that do different things.
> >>>
> >>
> >> Agreed on Keith's point about the scripts and it requiring some
> >> consideration.
> >>
> >>>> releases, but I'd be in favor of blog instructions, a script, or a build
> >>>> profile, which users could read/execute/activate to create a client-centric
> >>>> package.
> >>>>
> >>>> I've long believed that supporting different downstream packaging scenarios
> >>>> should be prioritized over upstream binary packaging. I have argued in
> >>>
> >>> This "downstream" packaging could be done within the Apache Accumulo
> >>> project also. Like accumulo-docker. Creating other packaging
> >>> projects within Accumulo is something to consider.
> >>>
> >>
> >> +1; When I say "downstream", it's a role, not an entity. The point is that
> >> it's a distinct activity. accumulo-docker is a perfect example of a
> >> "downstream packaging" project maintained by the upstream community. I find
> >> it frustrating sometimes when supporting users that they can't tell the
> >> difference between what is "Accumulo" and what is "this specific
> >> packaging/configuration/deployment of Accumulo", because we don't make
> >> those lines clear. I think we can draw these lines a bit more clearly.
> >>
> >>>> favor of removing our current tarball entirely, while supporting efforts to
> >>>
> >>> Apache Accumulo needs some sort of tarball that makes it easy to run
> >>> the code on a cluster, otherwise how can we test Accumulo on a cluster
> >>> for releases?
> >>>
> >>
> >> A binary tarball may be the best for this, but it's little more than the
> >> jars in Maven Central and a few text files. It could be trivially replaced
> >> with a simple script and manifest; it could also be replaced with an RPM, a
> >> docker image, or any number of things. A tarball is just one type of
> >> packaging for Accumulo's binaries.
> >>
> >> In any case, I wasn't talking about removing the ability to produce a
> >> binary tarball from source. Only removing it from our release artifacts and
> >> downloads. It is not a popular opinion, but I still think it's reasonable,
> >> with both pros and cons.
> >>
> >>>> enable downstream packaging by modularizing the server code, supporting a
> >>>> client-API jar (future work), and decoupling code from launch scripts. I
> >>>> think we should continue to do these kinds of improvements to support
> >>>> different packaging scenarios downstream, but I'd prefer to avoid
> >>>> additional "official" binary releases.
> >>>
> >>> I agree, I think if the Accumulo Java code made fewer assumptions about
> >>> its runtime env it would result in code that is easier to maintain and
> >>> package for different environments.
> >>>
> >>> In Fluo we have recently done a lot of work in order to support
> >>> Docker, Mesos, and Kubernetes. This work has really cleaned up the
> >>> core Fluo code making it easier to run in any environment.
> >>>
> >>> I suspect pulling the Accumulo tarball into a separate git repo and
> >>> out of the main repo may help highlight some of the assumptions
> >>> Accumulo Java code makes about the environment.
> >>>
> >>
> >> This is basically what the assemble module is now. It's why I moved the bin
> >> and conf directories into it, and have made its dependencies optional so
> >> they wouldn't be resolved transitively, and why I made the assembly plugin
> >> gather up the libs instead of the dependency plugin which used to drop them
> >> in a lib directory at the root of the source checkout. This module is the
> >> "downstream packaging" for the current "all-in-one" binary tarball package.
> >>
> >>> I think these cleanup issues are related to what Josh is suggesting,
> >>> but are not prerequisites. So it makes sense to discuss them at this
> >>> point, but I don't think they should block work on two tarballs if
> >>> that seems like a good idea.
> >>>
> >>
> >> Agreed. That discussion can be deferred. Much depends on how it is to be
> >> split up.
> >>
> >>>>
> >>>> Rather than provide additional packages, I'd prefer to work with downstream
> >>>> to make the source more "packagable" to suit the needs of these downstream
> >>>> vendor/community packagers. One way we can do that here is by either
> >>>> documenting what would be needed in a client-centric package, or by
> >>>> providing a script or build profile to create it from source, so that your
> >>>> $dayjob or any other downstream packager doesn't have to figure that out
> >>>> from scratch.
> >>>>
> >>>> On Thu, Jan 4, 2018 at 7:17 PM Josh Elser <josh.el...@gmail.com> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> $dayjob presented me with a request to break up the current tarball into
> >>>>> two: one suitable for "users" and another for the Accumulo services. The
> >>>>> ultimate goal is to make upgrade scenarios a bit easier by having client
> >>>>> and server centric packaging.
> >>>>>
> >>>>> The "client" tarball would be something suitable for most users
> >>>>> providing the ability to do things like:
> >>>>>
> >>>>> * Launch a Java app against Accumulo
> >>>>> * Launch a MapReduce job against Accumulo
> >>>>> * Launch the Accumulo shell
> >>>>>
> >>>>> Essentially, the client tarball is just a pared down version of our
> >>>>> "current" tarball and the server-tarball is likely equivalent to our
> >>>>> "current" tarball (given that we have little code which would be
> >>>>> considered client-only).
> >>>>>
> >>>>> Obviously, there are many ways to go about this. If there is buy-in from
> >>>>> other folks, adding some new assembly descriptors and making it a part
> >>>>> of the Maven build (perhaps, optionally generated) would be the easiest
> >>>>> in terms of maintenance. However, I don't want to push for that if it's
> >>>>> just going to be ignored by folks. I'll be creating something to support
> >>>>> this one way or another.
> >>>>>
> >>>>> Any thoughts/opinions? Would this have any value to other folks?
> >>>>>
> >>>>> - Josh