Here are some of our thought on the discussions of the past few days with
respect to backwards compatibility.

In general at Twitter we're not necessarily against backwards incompatible
changes per se.











*It depends on the "Return on Pain". While it is hard to quantify the
returns in the abstract, I can try to sketch out which kinds of changes are
the most painful and therefore cause the most friction for us.In rough
order of increasing pain to deal with:a) There is a new upstream (3.x)
release, but it is so backwards incompatible, that we won't be able to
adopt it for the foreseeable future. Even though we don’t adopt it, it
still causes pain. Now development becomes that much harder because we'd
have to get a patch for trunk, a patch for 3.x and a patch for the 2.x
branch. Conversely if patches go into 2.x only, now the releases start
drifting apart. We already have (several dozen) patches in production that
have not yet made it upstream, but are striving to keep this list as short
as possible to reduce the rebase pain and risk.b) Central Daemons (RM, or
pairs of HA NNs) have to be restarted causing a cluster-wide outage. The
work towards work-preserving restart in progress in various areas makes
these kinds of upgrades less painful.c) Server-side requires different
runtime from client-side. We'd have to produce multiple artifacts, but we
could make that work. For example, NN code uses Java 8 features, but
clients can still use Java 7 to submit jobs and read/write HDFS.Now for the
more painful backwards incompatibilities:d) All clients have to recompile
(a token uses protobuf instead of thrift, an interface becomes an abstract
class or vice versa). Not only do these kinds of changes make a rolling
upgrade impossible, more importantly it requires all our clients to
recompile their code and redeploy their production pipelines in a
coordinated fashion. On top of this, we have multiple large production
clusters and clients would have to keep multiple incompatible pipelines
running, because we simply cannot upgrade all clusters in all datacenters
at the same time.e) Customers are forced to restart and can no longer run
with JDK 7 clients because job submission client code or HDFS has started
using JDK 8-only features. Eventually group will reduce, but for at least
another year if not more this will be very painful.f) Even more painful is
when Yarn/MapReduce APIs change so that customers not only have to
recompile, but also have to change hundreds of scripts / flows in order to
deal with the API change. This problem is compounded by other tools in the
Hadoop ecosystem that would have to deal with these changes. There would be
two different versions of Cascading, HBase, Hive, Pig, Spark, Tez, you name
it.g) Without proper classpath isolation, third party dependency changes
(guava, protobuf version, etc) are probably as painful as API changes.h)
HDFS client API get changed in a backwards incompatible way requiring all
clients to change their code, recompile and re-start their services in a
coordinated way. We have tens of thousands of production servers reading
from / writing to Hadoop and cannot have all of these long running clients
restart at the same time. To put these in perspective, despite us being one
of the early adopters of Hadoop 2 in production at the scale of many
thousands of nodes, we are still wrapping up the migration from our last
Hadoop 1 clusters. We have many war stories about many of the above
incompatibilities. As I've tweeted about publicly the gains have been
significant with this migration to Hadoop 2, but the friction has also been
considerable.To get specific about JDK 8, we are intending to move to Java
8. Right now we're letting clients choose to run tasks with JDK 8
optionally, then we'll make it default.We'll switch to the daemons running
with JDK 8. What we're concerned it would then be feasible to use JDK 8
features on the servers side (see c) above).I'm suggesting that if we do
allow backwards incompatible changes, we introduce an upgrade path through
an agreed upon stepping stone release.For example, a protocol changing from
thrift to protobuf can be done in steps. In the stepping-stone release both
would be accepted. in the following release (or two releases later) the
thrift version support is dropped.This would allow for a rolling upgrade,
or even if a cluster-wide restart is needed, at least customers can adopt
to the change at a pace of weeks or months. Once no more (important)
customers are running the thrift client, we could then roll to the next
release.It would be useful to coordinate the backwards incompatibilities so
that not every release becomes a stepping-stone release.*




*Cheers,Joep*




On Mon, Mar 9, 2015 at 6:04 PM, Andrew Wang <andrew.w...@cloudera.com>
wrote:

> Hi Mayank,
>
>
> > 1. We would be moving to Hadoop -3 (Not this year though) however I don't
> > see we can do another JDK upgrade so soon. So the point I am trying to
> make
> > is we should be supporting jdk 7 as well for Hadoop-3.
> >
> > We'll still be releasing 2.x releases for a while, with similar
> featuresets as 3.x. You can keep using 2.x until you feel ready to jump to
> JDK8.
>
>
> > 2. For the sake of JDK 8 and classpath isolation we shouldn't be making
> > another release as those can be supported in Hadoop 2 as well, so what is
> > the motivation of making Hadoop 3 so soon?
> >
> > So already you can run 2.x with JDK8 and some degree of classpath
> isolation, but I've discussed the motivation for a 3.0 on the previous
> thread. We had issues in the JDK6 days with our dependencies not supporting
> JDK6 and thus not releasing security or bug fixes, which in turn put us in
> a bad spot. Classpath isolation we are still discussing, but right now it's
> opt-in and somewhat incomplete, which makes it hard for downstream projects
> to effectively make use of it. The goal for 3.0 is to clean this up and
> have it on by default (or always).
>
> Best,
> Andrew
>

Reply via email to