+1. I remember having this discussion in June, because I wanted to be able to run the minicluster as a single-node Accumulo using the start package.
I like this approach better. 1.6.0 provides a main method for firing up
the minicluster and having the dependencies in the pom will allow
developers to fire it up without needing Hadoop/Zookeeper installed.

ACCUMULO-1405 <https://issues.apache.org/jira/browse/ACCUMULO-1405>

On Tue, Sep 24, 2013 at 12:48 PM, Josh Elser <josh.el...@gmail.com> wrote:
> On Tue, Sep 24, 2013 at 12:31 PM, Keith Turner <ke...@deenlo.com> wrote:
> > On Tue, Sep 24, 2013 at 11:57 AM, Josh Elser <josh.el...@gmail.com> wrote:
> >
> >> I'm curious to hear what people think on this.
> >>
> >> I'm a really big fan of spinning up a minicluster instance to do some
> >> "more real" testing of software as I write it.
> >>
> >> With 1.5.0, it's a bit more painful because I have to add a bunch more
> >> dependencies to my project (which previously would only have to depend
> >> on the accumulo-minicluster artifact). The list includes, but is
> >> likely not limited to, commons-io, commons-configuration,
> >> hadoop-client, zookeeper, log4j, slf4j-api, slf4j-log4j12.
> >>
> >> Best as I understand it, the intent of this was that Hadoop will
> >> typically provide these artifacts at runtime, and therefore Accumulo
> >> doesn't need to re-bundle them itself which I'd agree with (not
> >> getting into that whole issue about the Hadoop "ecosystem"). However,
> >> I would think that the minicluster should have non-provided scope
> >> dependencies declared on these, as there is no Hadoop installation --
> >>
> >
> > Would this require declaring dependencies on a particular version of hadoop
> > in the minicluster pom? Or could the minicluster pom have profiles for
> > different hadoop versions? I do not know enough about maven to know if you
> > can use profiles declared in a dependency (e.g. if a user depends on
> > minicluster, can they activate profiles in it?)
>
> The actual dependency in minicluster is against Apache Hadoop but
> that's besides the point.
>
> By marking the hadoop-client dependency as provided that means that
> Hadoop's dependencies are *not* included at runtime (because hadoop is
> provided, and, as such, so are its dependencies). In other words, this
> is completely beside the point of what's actually included in a
> distribution of Hadoop when you download and install it.
>
> Apache Hadoop has dependencies we need to run minicluster. By marking
> the hadoop-client artifact as 'provided', we do not get its
> dependencies and the minicluster fails to run. I think this is easy
> enough to work around by overriding the dependencies we need to run
> the minicluster in the minicluster module (e.g. make the hadoop-client
> not 'provided' in the minicluster module). Thus, as we add more things
> to the minicluster that require other libraries, we control the
> dependency mgmt instead of forcing that onto the user.
>
> >
> >
> >> there's just the minicluster. As such, this would alleviate users from
> >> having to dig into our dependency management or trial&error to figure
> >> out what "extra" dependencies they have to include in their project to
> >> actually make it work
> >>
> >> Thoughts?
> >>
> >> - Josh
> >>
>
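
For reference, the user-side workaround described in the thread (declaring the "extra" dependencies by hand because hadoop-client is 'provided' in the minicluster pom) might look roughly like the fragment below. This is only a sketch: the ${...} version properties are hypothetical placeholders that the project would have to define itself, the test scope is an assumption based on the testing use case mentioned above, and the artifact list is the one given in the thread, which is described there as possibly incomplete.

```xml
<!-- Sketch only: all ${...} version properties are hypothetical placeholders,
     not values taken from the thread. Test scope is an assumption. -->
<dependencies>
  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-minicluster</artifactId>
    <version>${accumulo.version}</version>
    <scope>test</scope>
  </dependency>

  <!-- Declared explicitly because a 'provided' dependency of minicluster
       is not passed on transitively to the depending project. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.zookeeper</groupId>
    <artifactId>zookeeper</artifactId>
    <version>${zookeeper.version}</version>
    <scope>test</scope>
  </dependency>

  <!-- Remaining artifacts named in the thread. -->
  <dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>${commons-io.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>commons-configuration</groupId>
    <artifactId>commons-configuration</artifactId>
    <version>${commons-configuration.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>${log4j.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>${slf4j.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>${slf4j.version}</version>
    <scope>test</scope>
  </dependency>
</dependencies>
```

If the minicluster module declared these with a non-provided scope itself, as proposed above, a depending project would presumably need only the first entry.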