Can we side-step some of the issues here by fixing the hbase script for
launching jobs?

On Wed, Mar 11, 2015 at 3:19 PM, Enis Söztutar <[email protected]> wrote:

> On Wed, Mar 11, 2015 at 3:07 PM, Sean Busbey <[email protected]> wrote:
>
> > On Wed, Mar 11, 2015 at 4:49 PM, Enis Söztutar <[email protected]> wrote:
> >
> > > > It's worth noting that if users follow our ref guide (which says to
> > > > use "hadoop jar"), then jobs don't fail. It's only when they attempt
> > > > to launch jobs using "hbase com.example.MyDriver" that things fail.
> > > >
> > > > Additionally, if we stick to telling users that only the "hadoop jar"
> > > > version is supported, we can rely on the application classpath
> > > > support built into Hadoop 2.6+ to make it so jobs built on us get our
> > > > dependency versions and not the ones from Hadoop as it changes.
> > >
> > > We have learned that users do not read or follow documentation. And it
> > > is a regression if launching a job using the hbase command does not
> > > work.
> >
> > They do when things break. ;) An additional troubleshooting section that
> > shows the error and says "remember to use hadoop jar" would nicely help
> > catch searchers.
> >
> > Furthermore, "hadoop jar" is how you're supposed to launch YARN apps. If
> > we say that doing things via the hbase command is acceptable, we're
> > opening ourselves up to an expansion of what the hbase command has to do.
> > (i.e., perhaps it should detect if the passed class is a YARN driver and
> > then use the hadoop jar command? Or should it always pass through to the
> > hadoop jar command?)
>
> Traditionally, and in our documentation, HBase-owned MR classes (CopyTable,
> Import, etc.) are run with the hbase script, not the hadoop script. It is
> still a regression in that sense. Yes, there is a workaround, but why
> bother when we can fix this easily?
>
> > > > > So, my proposal is:
> > > > > - Commit HBASE-13149 to master and 1.1
> > > > > - Either change the dependency compat story for minor versions to
> > > > >   false, or add a footnote saying that there may be exceptions
> > > > >   because of the reasons listed above.
> > > >
> > > > If we decide we need to do the jackson version bump, what about the
> > > > possibility of moving the code in branch-1 to be version 2.0.0 (and
> > > > making master 3.0.0)? We could start the release process once the
> > > > changes Andrew needs for Phoenix are in place and get it out the
> > > > door.
> > >
> > > I don't think this requires a major version bump. As I was mentioning
> > > in the other thread, HBase is not upgraded too frequently in
> > > production. Again, we do not want to inconvenience the user even
> > > further.
> >
> > How would this inconvenience users further? Barring the change in
> > version numbers, it's the same upgrade they would be doing to move to
> > what we're currently calling HBase 1.1. Since version numbers under
> > semver signal what we understand about our changeset, it's just us
> > acknowledging that we broke some kind of compatibility. A release note
> > that calls out the Jackson dependency as the cause for that
> > compatibility breakage makes the evaluation easy.
>
> The problem boils down to the "major versions are cheap" kind of argument,
> which has been discussed in the Hadoop context.
> I do not buy it, because a major version upgrade implies (though it does
> not have to be) a big change. I don't see why we would ever want to bump
> our major version when the library in question only bumped its minor
> version. Jackson could have gone with 2.0 for those changes between 1.8
> and 1.9. Why would we want to promise more than what our dependencies
> promise? It is not realistic.
>
> > In the current state of the code, we'd just need to make some
> > documentation changes, and then the same upgrade paths as for 1.1 should
> > work just fine. Provided we don't take too long getting the release out,
> > I'd expect many users would just upgrade from 0.98 to (the proposed)
> > 2.0.0.
> >
> > (I mentioned the changes Andrew needs only because it's my understanding
> > that those are the driving factor on branch-1 getting to a release, not
> > because I expect them to be breaking.)
> >
> > > > It would do a nice job of desensitizing us to major version
> > > > increments, and we'd be able to document it as a very safe major
> > > > version upgrade since the only breakage is that dependency. We could
> > > > then limit the HBase 1.y line to just 1.0.z and add a FAQ item if
> > > > enough folks ask about why the sudden increment.
> > >
> > > Doing a major version bump just to update one dependency version is
> > > too much, I think.
> >
> > But that's the point of following semver and defining a compatibility
> > document. The sufficient criteria for a major version bump expressly
> > cover updating a single dependency in a non-breaking way.
> >
> > There will be plenty of major version numbers to go through. The thing
> > that trips projects up is feeling like major version releases need to be
> > special. If we want to do that, then we shouldn't use semver. We should
> > define our own versioning standard and make it "Marketing, Major, Minor"
> > instead of "Major, Minor, Patch." (I would prefer we not do this.)
> >
> > > > I'm -1 on the idea of exceptions for our compatibility story. We
> > > > already note that just because we can break something doesn't mean
> > > > we will. That does a good job of pointing out that we recognize
> > > > there's a cost.
> > >
> > > We do not have to corner ourselves with the rules we have set. I can
> > > see how requiring JDK 8 or Hadoop 3, etc., would justify major
> > > versions, but not a dependency library that users might be
> > > transitively depending on. If that is the case, the user is expected
> > > to deal with it.
> >
> > If we want to treat those differently, then we need to update our
> > compatibility document to call out JVM and Hadoop support as a different
> > thing than the rest of our dependency promises. But we should not do
> > this. So long as we are forcing applications that integrate with us to
> > use particular versions of third-party libraries, we make it much harder
> > for them to upgrade when we don't provide stability.
> >
> > --
> > Sean
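[Editor's note: a minimal sketch of the launch styles debated in this thread,
assuming a stock HBase 1.x on Hadoop 2.x install. "com.example.MyDriver" and
CopyTable come from the thread itself; "my-app.jar", "mytable", "backup", and
the <args> placeholder are illustrative only; the HADOOP_CLASSPATH trick with
"hbase mapredcp" is the commonly documented workaround, not something the
thread itself prescribes, and assumes your hbase script version provides the
mapredcp subcommand.]

    # Ref-guide style: let Hadoop launch the driver, with HBase's MR
    # dependencies added to the job classpath by the hbase script.
    # ("hbase mapredcp" prints the minimal MapReduce classpath; "hbase
    # classpath" prints the full one.)
    HADOOP_CLASSPATH=$(hbase mapredcp) \
      hadoop jar my-app.jar com.example.MyDriver <args>

    # The style at issue in this thread: launching a user driver directly
    # through the hbase script. Per the discussion above, this currently
    # fails and is what HBASE-13149 is meant to fix.
    hbase com.example.MyDriver <args>

    # HBase-owned MR tools have traditionally been documented with the hbase
    # script, for example:
    hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=backup mytable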
