Does that mean package everything else? What about ZooKeeper?

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Mon, May 12, 2014 at 3:38 PM, Joey Echeverria <[email protected]> wrote:
> +1 to only depending on Hadoop client jars.
>
>
> --
> Joey Echeverria
> Chief Architect
> Cloudera Government Solutions
>
>
> On Sun, May 11, 2014 at 6:07 PM, Christopher <[email protected]> wrote:
>> In general, I think this is reasonable... especially because Hadoop
>> Client stabilizes things a bit. On the other hand, things get really
>> complicated with dependencies in the pom (somewhat complicated), and
>> packaged dependencies (more complicated), when we're talking about
>> supporting both Hadoop 1 and Hadoop 2. I know some of us want to drop
>> Hadoop 1 support in 2.0.0, and I think this is one more good reason to
>> do that.
>>
>> Another data point that I think is going to complicate things a (very)
>> tiny bit: the work on ACCUMULO-2589 includes things like: drop the
>> dependencies on Hadoop from the API. But, we're likely to still have a
>> dependency on guava (there was a suggestion to use guava's @Beta
>> annotations in the API). Maybe this is fine.... because the packaging
>> considerations for the binary tarball are not the same as the API
>> module dependencies (though they'll have to be compatible), but it's
>> something to consider.
>>
>> --
>> Christopher L Tubbs II
>> http://gravatar.com/ctubbsii
>>
>>
>> On Sun, May 11, 2014 at 4:45 PM, Sean Busbey <[email protected]> wrote:
>>> ACCUMULO-2786 has brought up the issue of what dependencies we bring with
>>> Accumulo rather than depend on the environment providing[1].
>>>
>>> Christopher explains our extant reasoning thus
>>>
>>>> The precedent has been: if vanilla Apache Hadoop provides it in its bin
>>>> tarball, we don't need to.
>>>
>>> I'd like us to move to packaging any dependencies that aren't brought in by
>>> Hadoop Client.
>>>
>>> 1) Our existing practice developed before Hadoop Client existed, so we
>>> essentially *had* to have all of the Hadoop related deps on our classpath.
>>> For versions where we default to Hadoop 2, we can improve things.
>>>
>>> 2) We should encourage users to follow good practice by minimizing the
>>> number of jars added to the classpath.
>>>
>>> 3) We have to still include the jars found in Hadoop Client because we use
>>> hadoop.
>>>
>>> 4) Limiting the dependencies we rely on external sources to provide allows
>>> us to update more of our dependencies to current versions.
>>>
>>> 5) Minimizing the number of jars we rely on from external sources reduces
>>> the chances that they change out from under us (and thus reduces the number
>>> of external factors we have to remain cognizant of)
>>>
>>> 6) Minimizing the classpath reduces the chances of having multiple
>>> different versions of the same library present.
>>>
>>> I'd also like for us to *not* package any of the jars brought in by Hadoop
>>> Client. Due to the additional work it would take to downgrade our version
>>> of guava, I'd like to wait to do that.
>>>
>>> [1]: https://issues.apache.org/jira/browse/ACCUMULO-2786
>>>
>>> --
>>> Sean
