Re: [DISCUSS] Dependency compatibility

Andrey Stepachev Wed, 11 Mar 2015 15:38:22 -0700

Hi,

With initiatives like ODP, it seems we need
infrastructure for building HBase against the major
Hadoop versions and distros. Many projects already
do this (Spark, for example).


So, regardless of what ends up in 1.1, should we think about
hbase-hadoop-2.5 or hbase-hadoop-2.6 builds
(and consequently clean up dependencies so that
they come from Hadoop rather than from our own poms)?

That is just an idea. In the past I worked with a
custom HBase build, and it would be great to have an easy,
drop-in way to change which Hadoop version is used.

Thanks.

On Wed, Mar 11, 2015 at 10:26 PM, Sean Busbey <[email protected]> wrote:

> On Wed, Mar 11, 2015 at 5:19 PM, Enis Söztutar <[email protected]> wrote:
>
> > On Wed, Mar 11, 2015 at 3:07 PM, Sean Busbey <[email protected]> wrote:
> >
> > > On Wed, Mar 11, 2015 at 4:49 PM, Enis Söztutar <[email protected]> wrote:
> > >
> > > > >
> > > > > > It's worth noting that if users follow our ref guide (which says to
> > > > > > use "hadoop jar"), then jobs don't fail. It's only when they attempt
> > > > > > to launch jobs using "hbase com.example.MyDriver" that things fail.
> > > > > >
> > > > > > Additionally, if we stick to telling users that only the "hadoop jar"
> > > > > > version is supported, we can rely on the application classpath
> > > > > > support built into Hadoop 2.6+ to make it so jobs built on us get our
> > > > > > dependency version and not the ones from Hadoop as it changes.
> > > > >
> > > >
> > > > > We have learned that users do not read or follow documentation.
> > > > > And it is a regression if launching a job using the hbase command
> > > > > does not work.
> > > >
> > > >
> > > >
> > > > They do when things break. ;) An additional troubleshooting section
> > > > that shows the error and says "remember to use hadoop jar" would
> > > > nicely help catch searchers.
> > > >
> > > > Furthermore, "hadoop jar" is how you're supposed to launch YARN apps.
> > > > If we say that doing things via the hbase command is acceptable, we're
> > > > opening ourselves up to an expansion of what the hbase command has to
> > > > do. (i.e. perhaps it should detect if the passed class is a YARN
> > > > driver and then use the hadoop jar command? or should it always pass
> > > > through to the hadoop jar command?)
> > >
> >
> > > Traditionally, and in our documentation, HBase-owned MR classes
> > > (CopyTable, Import, etc.) are run with the hbase script, not the
> > > hadoop script. It is still a regression in that sense. Yes, there is
> > > a workaround, but why bother when we can fix this easily?
> >
> >
> All of our current ref guide examples use the "hadoop jar" command:
>
> http://hbase.apache.org/book.html#hbase.mapreduce.classpath
>
> http://hbase.apache.org/book.html#_bundled_hbase_mapreduce_jobs
>
> They only rely on the hbase command to get things that need to be added to
> the hadoop classpath.
>
>
>
> >
> > >
> > >
> > >
> > > > >
> > > > >
> > > > >
> > > > > > > So, my proposal is:
> > > > > > >  - Commit HBASE-13149 to master and 1.1
> > > > > > >  - Either change the dependency compat story for minor versions
> > > > > > >    to false, or add a footnote saying that there may be
> > > > > > >    exceptions because of the reasons listed above.
> > > > > >
> > > > >
> > > > >
> > > > > > If we decide we need to do the jackson version bump, what about the
> > > > > > possibility of moving the code in branch-1 to be version 2.0.0 (and
> > > > > > making master 3.0.0)? We could start the release process once the
> > > > > > changes Andrew needs for Phoenix are in place and get it out the door.
> > > > >
> > > >
> > > > > I don't think this requires a major version bump. As I was
> > > > > mentioning in the other thread, HBase is not upgraded too
> > > > > frequently in production. Again, we do not want to inconvenience
> > > > > the user even further.
> > > >
> > > >
> > > >
> > > > How would this inconvenience users further? Barring the change in
> > > > version numbers, it's the same upgrade they would be doing to move to
> > > > what we're currently calling HBase 1.1. Since version numbers under
> > > > semver signal what we understand about our changeset, it's just us
> > > > acknowledging that we broke some kind of compatibility. A release
> > > > note that calls out the Jackson dependency as the cause for that
> > > > compatibility breakage makes the evaluation easy.
> > >
> >
> > > The problem boils down to a "major versions are cheap" kind of
> > > argument, which has been discussed in the Hadoop context. I do not
> > > buy it, because a major version upgrade implies (though it does not
> > > have to be) a big change. I don't see why we would ever want to bump
> > > our major version when the said library only bumped its minor
> > > version. Jackson could have gone with 2.0 for those changes between
> > > 1.8 and 1.9. Why would we want to promise more than what our
> > > dependencies promise? It is not realistic.
> >
> >
>
> But we _know_ that 1.8 -> 1.9 in Jackson is not really a minor version in
> the semver sense. We _know_ it's a breaking change. Their release notes
> even call out breaking changes, so we know it isn't just a change in
> internals.
>
>
> --
> Sean
>
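
For illustration only, the semver point argued in the quoted thread can be
sketched in a few lines of Python. This is a rough sketch under stated
assumptions, not project policy or tooling: the helper names (parse,
classify_bump, required_bump) are hypothetical, and the rule in
required_bump is just one reading of "a known breaking change means a
major bump", whatever number upstream chose.

```python
# Hypothetical helpers for illustration; not part of HBase, Hadoop or Jackson.

def parse(version):
    """Split 'MAJOR.MINOR[.PATCH]' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def classify_bump(old, new):
    """Name the kind of bump that the version numbers alone declare."""
    o, n = parse(old), parse(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    if n[2] != o[2]:
        return "patch"
    return "none"


def required_bump(declared, known_breaking):
    """Assumed rule: a change known to break compatibility needs a major bump,
    regardless of what the dependency's own version number declares."""
    return "major" if known_breaking else declared


# Jackson 1.8 -> 1.9 is declared as a minor bump by its version number.
declared = classify_bump("1.8", "1.9")
# If we know it breaks compatibility, strict semver asks for a major bump.
needed = required_bump(declared, known_breaking=True)
```

The disagreement in the thread is over whether the declared bump or the
known behaviour should drive HBase's own version number.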



-- 
Andrey.
