Re: [DISCUSS] packaging our dependencies

Christopher Mon, 12 May 2014 16:31:32 -0700

Does that mean package everything else?
What about ZooKeeper?

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii



On Mon, May 12, 2014 at 3:38 PM, Joey Echeverria <[email protected]> wrote:
> +1 to only depending on Hadoop client jars.
>
>
> --
> Joey Echeverria
> Chief Architect
> Cloudera Government Solutions
>
>
> On Sun, May 11, 2014 at 6:07 PM, Christopher <[email protected]> wrote:
>> In general, I think this is reasonable... especially because Hadoop
>> Client stabilizes things a bit. On the other hand, things get really
>> complicated with dependencies in the pom (somewhat complicated), and
>> packaged dependencies (more complicated), when we're talking about
>> supporting both Hadoop 1 and Hadoop 2. I know some of us want to drop
>> Hadoop 1 support in 2.0.0, and I think this is one more good reason to
>> do that.
>>
>> Another data point that I think is going to complicate things a (very)
>> tiny bit: the work on ACCUMULO-2589 includes things like dropping the
>> dependencies on Hadoop from the API. But we're likely to still have a
>> dependency on guava (there was a suggestion to use guava's @Beta
>> annotations in the API). Maybe this is fine, because the packaging
>> considerations for the binary tarball are not the same as the API
>> module dependencies (though they'll have to be compatible), but it's
>> something to consider.
>>
>> --
>> Christopher L Tubbs II
>> http://gravatar.com/ctubbsii
>>
>>
>> On Sun, May 11, 2014 at 4:45 PM, Sean Busbey <[email protected]> wrote:
>>> ACCUMULO-2786 has brought up the issue of what dependencies we bring with
>>> Accumulo rather than depend on the environment providing[1].
>>>
>>> Christopher explains our extant reasoning thus:
>>>
>>>> The precedent has been: if vanilla Apache Hadoop provides it in its bin
>>>> tarball, we don't need to.
>>>
>>> I'd like us to move to packaging any dependencies that aren't brought in by
>>> Hadoop Client.
>>>
>>> 1) Our existing practice developed before Hadoop Client existed, so we
>>> essentially *had* to have all of the Hadoop related deps on our classpath.
>>> For versions where we default to Hadoop 2, we can improve things.
>>>
>>> 2) We should encourage users to follow good practice by minimizing the
>>> number of jars added to the classpath.
>>>
>>> 3) We still have to include the jars found in Hadoop Client because we use
>>> Hadoop.
>>>
>>> 4) Limiting the dependencies we rely on external sources to provide allows
>>> us to update more of our dependencies to current versions.
>>>
>>> 5) Minimizing the number of jars we rely on from external sources reduces
>>> the chances that they change out from under us (and thus reduces the number
>>> of external factors we have to remain cognizant of).
>>>
>>> 6) Minimizing the classpath reduces the chances of having multiple
>>> different versions of the same library present.
>>>
>>> I'd also like for us to *not* package any of the jars brought in by Hadoop
>>> Client. Due to the additional work it would take to downgrade our version
>>> of guava, I'd like to wait to do that.
>>>
>>> [1]: https://issues.apache.org/jira/browse/ACCUMULO-2786
>>>
>>> --
>>> Sean
