+Bigtop

On Wed, May 16, 2012 at 12:50 PM, Alejandro Abdelnur <t...@cloudera.com> wrote:
> A while ago I raised this issue in Pig
>
> This is an issue that most if not all projects (hbase, pig, sqoop,
> hive, oozie,...) based on Hadoop will face.
>
> It would be great if all these projects come up with a consistent way
> of doing this.
>
> Any idea how to tackle it? Starting the discussion on all dev aliases?

This is something we've pondered in Bigtop. Our current thinking
is that while it is probably OK to lean on the "leaf-node" (think Pig,
Hive, to some extent HBase) projects to at least take Hadoop
compatibility into account, the full problem is going to combinatorially
explode pretty soon.

Take Hive as an example -- for that project just taking care of Hadoop
is not enough: if there are incompatibilities between HBase releases,
Hive needs to publish an HxB matrix of artifacts, where H is the # of
incompatible Hadoop versions and B is the # of incompatible HBase
versions. And that doesn't take into account the fact that Hive might
be interested in publishing different artifacts to begin with (think
-security artifacts in HBase).
This gets pretty ugly pretty quickly.

Oh, and don't forget that somebody has to test all of the above.
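
To make the HxB growth concrete, here is a rough sketch that enumerates
the builds a downstream project would have to publish and test. The
version labels and the artifact_matrix helper are made up for
illustration; they are not real release names or any project's tooling.

```python
from itertools import product


def artifact_matrix(hadoop_versions, hbase_versions, flavors=("default",)):
    """List every (hadoop, hbase, flavor) combination that needs its own artifact."""
    return list(product(hadoop_versions, hbase_versions, flavors))


# Hypothetical, mutually incompatible version lines (placeholders only).
hadoop = ["hadoop-A", "hadoop-B", "hadoop-C"]  # H = 3
hbase = ["hbase-X", "hbase-Y"]                 # B = 2

plain = artifact_matrix(hadoop, hbase)
with_security = artifact_matrix(hadoop, hbase, flavors=("default", "security"))

print(len(plain))          # H * B = 6
print(len(with_security))  # H * B * 2 flavors = 12
```

Every extra incompatible dependency or artifact flavor multiplies the
count again, and each entry is a separate build to test.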

Now, it seems like in Bigtop we're soon going to expose a Maven repo
with all of the Maven artifacts constituting a particular Bigtop "stack". You
could think of it as a transitive closure of all of the deps built against
each other. This, of course, will not tackle the issue of a random combination
of components (we only support the versions of components as
specified in our own BOM for each particular Bigtop release) but it will
provide a pretty stable body of Maven artifacts that are KNOWN (as
in tested) to be compiled against each other.
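
As a loose illustration of the "only what the BOM specifies" rule, the
sketch below checks a requested set of components against a single
stack's BOM. The BOM contents, the version labels and the unsupported
helper are all hypothetical; this is not Bigtop's actual BOM format.

```python
# Hypothetical BOM for one stack release: component -> the one version
# that was built and tested together with the rest of the stack.
BOM = {"hadoop": "H1", "hbase": "B1", "hive": "V1", "pig": "P1"}


def unsupported(requested, bom=BOM):
    """Return the requested components whose version is not the one in the BOM."""
    return {name: ver for name, ver in requested.items() if bom.get(name) != ver}


in_stack = {"hadoop": "H1", "hive": "V1"}  # all versions taken from the stack
mixed = {"hadoop": "H1", "hbase": "B2"}    # hbase version from outside the stack

print(unsupported(in_stack))  # empty dict: everything matches the BOM
print(unsupported(mixed))     # reports the hbase mismatch
```

Anything this check reports falls outside the tested stack, which is
exactly the "random combination" case that is not covered.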

If this sounds interesting and useful for upstream projects -- I'd invite
the continuation of this discussion to happen on bigtop-dev@.

Thanks,
Roman.
