Editorial: "rollback" should be restricted to exactly the function of returning to the state before an upgrade was started. It's OK to discuss other desirable features along with descriptive names, but "easy rollback," "partial rollback," "version rollback" and the like are all confusing.
Substance: In speaking of version numbers, don't confuse desired behavior (which client can connect to which server) with details of implementation (whether disk formats changed). We want to avoid getting squeezed by the argument that some feature must wait a year because it modifies the disk format but it is too early to change the major version number.

On 27 10 08 13:50, "Sanjay Radia" <[EMAIL PROTECTED]> wrote:

> I have merged the various Hadoop 1.0 Compatibility items that have
> been discussed in this thread and categorized and listed them below.
>
>
> Hadoop 1.0 Compatibility
> ==================
>
> Standard release numbering:
> - Only bug fixes in dot releases (m.x.y): no changes to API, disk
>   format, protocols, config, etc.
> - New features in major (m.0.0) and minor (m.x.0) releases.
>
>
> 1. API Compatibility
> -------------------------
> No need for client recompilation when upgrading across minor releases
> (i.e. from m.x to m.y, where x <= y).
> Classes or methods deprecated in m.x can be removed in (m+1).0.
> Note that this is stronger than what we have been doing in Hadoop 0.x
> releases.
> Motivation: These are the industry-standard compatibility rules for
> major and minor releases.
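A rough sketch of what the rule in item 1 means in code; the class below is invented for illustration and is not a real Hadoop API. A method deprecated in minor release m.x keeps working by delegating to its replacement, and may be deleted no earlier than (m+1).0:

  import java.io.*;

  /** Hypothetical client library class; not a real Hadoop API. */
  public class DistributedStore {

    private static final int DEFAULT_BUFFER_SIZE = 4096;

    /**
     * Old API, present before minor release m.x. Deprecated in m.x; under
     * rule 1 it may be removed no earlier than (m+1).0, so clients compiled
     * against an earlier minor release keep running without recompilation.
     * @deprecated use {@link #open(String, int)} instead
     */
    @Deprecated
    public InputStream open(String path) throws IOException {
      // Delegate to the replacement so behaviour stays identical.
      return open(path, DEFAULT_BUFFER_SIZE);
    }

    /** Replacement API introduced in minor release m.x. */
    public InputStream open(String path, int bufferSize) throws IOException {
      return new BufferedInputStream(new FileInputStream(path), bufferSize);
    }
  }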
> 2. Data Compatibility
> --------------------------
> 2.a HDFS metadata and data can change across minor or major releases,
> but such changes are transparent to user applications. That is, a
> release upgrade must automatically convert the metadata and data as
> needed. Further, a release upgrade must allow a cluster to roll back to
> the older version and its older disk format.
> Motivation: Users expect file systems to preserve data transparently
> across releases.
>
> 2.a-Stronger
> HDFS metadata and data can change across minor or major releases, but
> such changes are transparent to user applications. That is, a release
> upgrade must automatically convert the metadata and data as needed.
> During *minor* releases, disk format changes have to be backward and
> forward compatible; i.e. an older version of Hadoop can be started on a
> newer version of the disk format. Hence rolling back is simple: just
> restart the older version of Hadoop. Major releases allow more
> significant changes to the disk format and have to be only backward
> compatible; however, a major release upgrade must allow a cluster to
> roll back to the older version and its older disk format.
> Motivation: Minor releases are very easy to roll back for an admin.
>
> 2.a-WeakerAutomaticConversion:
> Automatic conversion is supported across a small number of releases. If
> a user wants to jump across multiple releases he may be forced to go
> through a few intermediate releases to get to the final desired release.
>
> 3. Wire Protocol Compatibility
> ----------------------------------------
> We offer no wire compatibility in our 0.x releases today.
> The motivation *isn't* to make our protocols public. Applications will
> not call the protocols directly but through a library (in our case the
> FileSystem class and its implementations). Instead the motivation is
> that customers run multiple clusters and have apps that access data
> across clusters. Customers cannot be expected to update all clusters
> simultaneously.
>
> 3.a Old m.x clients can connect to new m.y servers, where x <= y, but
> the old clients might get reduced functionality or performance. m.x
> clients might not be able to connect to (m+1).z servers.
>
> 3.b New m.y clients must be able to connect to old m.x servers, where
> x < y, but only for old m.x functionality.
> Comment: Generally old API methods continue to use old rpc methods.
> However, it is legal to have new implementations of old API methods call
> new rpc methods, as long as the library transparently handles the
> fallback case for old servers.
>
> 3.c At any major release transition [i.e. from a release m.x to a
> release (m+1).0], a user should be able to read data from the cluster
> running the old version. (Or shall we generalize this to: from m.x to
> (m+i).z?)
>
> Motivation: Data copying across clusters is a common operation for many
> customers. For example, this is routinely done at Yahoo; another use
> case is HADOOP-4058. Today, http (or hftp) provides a
> guaranteed-compatible way of copying data across versions. Clearly one
> cannot force a customer to simultaneously update all its Hadoop clusters
> to a new major release. The above documents this requirement; we can
> satisfy it via the http/hftp mechanism or some other mechanism.
>
> 3.c-Stronger
> Shall we add a stronger requirement for 1.0: wire compatibility across
> major versions? That is, not just for reading but for all operations.
> This can be supported by class loading or other games. Note that we can
> wait until 2.0 to provide this. If Hadoop provided this guarantee then
> it would allow customers to partition their data across clusters without
> risking apps breaking across major releases due to wire incompatibility
> issues.
>
> Motivation: Data copying is a compromise. Customers really want to run
> apps across clusters running different versions. (See item 2.)
>
>
> 4. Intra-Hadoop Service Compatibility
> --------------------------------------------------
> The HDFS service has multiple components (NN, DN, Balancer) that
> communicate amongst themselves. Similarly, the MapReduce service has
> components (JT and TT) that communicate amongst themselves.
> Currently we require that all the components of a service have the same
> build version and hence talk the same wire protocols. This build-version
> checking prevents rolling upgrades. It has the benefit that the admin
> can ensure that the entire cluster has exactly the same build version.
>
> 4.a HDFS and MapReduce require that their respective sub-components have
> the same build version in order to form a cluster.
> [i.e. Maintain the current mechanism.]
>
> 4.a-Stronger: Intra-service wire-protocol compatibility
> [I am listing this here to document it, but I don't think we are ready
> to take this on for Hadoop 1.0. Alternatively, we could require
> intra-service wire compatibility but check the build version until we
> are ready for rolling upgrades.]
>
> Wire protocols between internal Hadoop components are compatible across
> minor versions. Examples are NN-DN, DN-DN, NN-Balancer, etc.
> Old m.x components can talk to new m.y components (x <= y).
> Wire compatibility can break across major versions.
> Motivation: Allow rolling upgrades.
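To make the difference between 4.a and 4.a-Stronger concrete, here is a minimal sketch of the two handshake policies; the class and method names are made up and this is not how Hadoop actually performs the check. Exact build-version matching rejects any mixed cluster, while a check that only pins the major version lets m.x and m.y components coexist during a rolling upgrade:

  /** Illustration only; not Hadoop's actual version/handshake code. */
  public class HandshakePolicy {

    /** Minimal m.x.y triple for the sketch. */
    static final class Version {
      final int major, minor, patch;
      Version(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
      }
    }

    /** 4.a (today): a peer may join only with the identical build version. */
    static boolean acceptPeerExactBuild(String localBuild, String remoteBuild) {
      return localBuild.equals(remoteBuild);
    }

    /**
     * 4.a-Stronger (rolling upgrades): accept any peer from the same major
     * release. Intra-service wire protocols must then stay compatible
     * across minor versions, so m.x and m.y components (x <= y) can
     * coexist while the cluster is upgraded node by node.
     */
    static boolean acceptPeerCompatibleProtocol(Version local, Version remote) {
      return local.major == remote.major;
    }
  }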

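Finally, a sketch of the fallback clause in 3.b's comment: an old API method reimplemented on top of a new rpc, with a transparent fallback when the library is talking to an old m.x server. The interface, exception, and method names below are invented for illustration; they are not the real client protocol.

  import java.io.IOException;

  /** Illustration of the 3.b fallback; none of these names are real Hadoop protocols. */
  public class FsClient {

    /** Stand-in for the namenode's rpc interface. */
    interface NamenodeRpc {
      long[] getBlockSizes(String path) throws IOException;      // old rpc, already in m.x
      BlockInfo[] getBlockInfo(String path) throws IOException;  // new rpc, added in m.y
    }

    /** Stand-in for the richer result type returned by the new rpc. */
    static final class BlockInfo {
      final long size;
      BlockInfo(long size) { this.size = size; }
    }

    /** Thrown (in this sketch) when the server does not implement the requested rpc. */
    static final class UnknownRpcException extends IOException {}

    private final NamenodeRpc namenode;

    FsClient(NamenodeRpc namenode) {
      this.namenode = namenode;
    }

    /**
     * Old public API method, reimplemented in m.y on top of the new rpc.
     * Per 3.b this is legal only because the library falls back
     * transparently when it is connected to an old m.x server.
     */
    public long[] blockSizes(String path) throws IOException {
      try {
        BlockInfo[] infos = namenode.getBlockInfo(path);  // try the new rpc first
        long[] sizes = new long[infos.length];
        for (int i = 0; i < infos.length; i++) {
          sizes[i] = infos[i].size;
        }
        return sizes;
      } catch (UnknownRpcException e) {
        return namenode.getBlockSizes(path);              // old m.x server: fall back
      }
    }
  }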