The "Hbase/HBaseWireCompatibility" page has been changed by Misty:
https://wiki.apache.org/hadoop/Hbase/HBaseWireCompatibility?action=diff&rev1=9&rev2=10

- <<TableOfContents(5)>>
+ The HBase Wiki is in the process of being decommissioned. The info that used 
to be on this page has moved to 
http://hbase.apache.org/book.html#hbase.versioning. Please update your 
bookmarks.
  
- === Glossary ===
- ----
- ||<tableclass="confluenceTable"class="confluenceTh">Term 
||<class="confluenceTh">Definition ||
- ||<class="confluenceTd">Major version ||<class="confluenceTd">First number in 
the version, to the left of the period.  e.g. in version 2.3, the major version 
is "2" ||
- ||<class="confluenceTd">Minor version ||<class="confluenceTd">Second number 
in the version, immediately to the right of the period.  e.g. in version 2.3, 
the minor version is "3" ||
- ||<class="confluenceTd">Compatibility window ||<class="confluenceTd">Range of 
consecutive major versions where compatibility between two entities is 
guaranteed ||
- 
- 
- 
- 
- === Motivation and Goals ===
- ----
- The current lack of a concrete versioning story for HBase is limiting  from 
both an operational and development perspective.  We propose a  "first-pass" 
versioning story (that can be expanded upon later) that  addresses the 
following use cases and concerns:
- 
- '''Operations'''
- 
-  * '''Decouple client applications from HBase''':  HBase clients are  part of 
a separate application and often administered separately from  the HBase 
cluster. Today, the application and cluster must be upgraded  in lockstep.  
Clients should interoperate with HBase RS's and masters  that are running 
different major versions.  This allows for the  following operational 
improvements:
-   * Multiple pods: HBase clients may write to multiple HBase clusters  / pods 
(sharded clusters) and the shards may be upgraded separately.
-   * Application-level replication: HBase installation with active and  
standby clusters should be able to upgrade, and HBase clients can work  with 
both.
-  * '''No downtime for minor version upgrades'''
- 
- '''Development'''
- 
-  * '''Simplified support for bugfixes, upgrades, and testing''' -  no need 
for specialized migration scripts
-  * '''Higher developer cadence in the community''' - can add functionality 
and not worry about breaking version compatibility
- 
- === Requirements ===
- ----
-  * HBase server-server running different '''minor''' versions shall 
interoperate in an extensible manner.
-  * HBase client-server running different '''major''' versions shall 
interoperate in an extensible manner.
-   * For example, in a scenario where client is running with version A  and 
server is running with version B: anything the other side does not  understand 
is ignored, provided defaults for, or otherwise handled in an  appropriate 
manner.
-  * Formats and protocols shall be extensible to allow for new functionality 
such as RPC tracing.
-  * Developers shall be able to augment RPC protocol with '''new''' methods 
within minor and major version upgrades.
-  * Critical path operations (Get/Put) performance shall suffer no more  than 
10% from the current 0.92 version's performance on YCSB load tests  (i.e. 
read/update/scan/insert should individually be no more than 10%  slower).
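
The client-server requirement above (unknown data is ignored, missing data gets a default) can be sketched as follows. This is a minimal illustration, not HBase's actual wire schema: the `KNOWN_FIELDS` registry, the tag numbers, the field names, and the `decode` helper are all hypothetical, and a message is modeled simply as a list of (tag, value) pairs, loosely mirroring how protobuf tags fields.

```python
# Hypothetical field registry, for illustration only (not HBase's real schema).
# Maps a numeric tag to (field name, default value).
KNOWN_FIELDS = {
    1: ("row", None),        # no meaningful default; stays None when absent
    2: ("max_versions", 1),  # default supplied when the sender omits it
}


def decode(message):
    """Decode (tag, value) pairs, tolerating unknown and missing fields.

    Returns the decoded fields plus the pairs this side did not understand,
    so they can be ignored or passed through unchanged.
    """
    # Start from the defaults so fields the sender omitted are still present.
    result = {name: default for name, default in KNOWN_FIELDS.values()}
    unknown = []
    for tag, value in message:
        if tag in KNOWN_FIELDS:
            result[KNOWN_FIELDS[tag][0]] = value
        else:
            # A newer peer sent something this version does not know about.
            unknown.append((tag, value))
    return result, unknown


# An "older" receiver reading a message from a "newer" sender that added tag 99.
fields, skipped = decode([(1, "r1"), (99, "trace-id")])
```

Here the receiver keeps working with the fields it knows, fills in `max_versions` from its default, and sets tag 99 aside instead of failing.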
- 
- === Design ===
- ----
- ==== Wire format ====
- Protobuf vs. Thrift vs. Avro
- 
- We propose to use protobuf for wire format. The primary reason is  that the 
current HBase RPC engine (see HADOOP-7379) supports  protobuf-encoded data, and 
protobuf is relatively more stable than the  alternatives.  In addition, Hadoop 
RPC uses protobuf, and the community  may eventually want Hadoop and HBase to 
share the same RPC.
- 
- We also propose to change the HBase RPC connection header from  Writable to  
protobuf so that the HBase RPC is programming language  agnostic.
- 
- ==== RPC ====
- Currently, the HBase RPC engine does not support async IO or protocol  
negotiation.  These features don't impact compatibility and therefore  can 
evolve separately and are not in scope for this document.
- 
- ==== Interfaces ====
- 
- 
{{http://docs.google.com/a/cloudera.com/leaf?id=0BzYqRa05S66NMDcxMjUyYTMtZWE2Yy00ZmIyLThiMjgtMjJkNGU0NGU5OTg1}}
- 
-  1. Client talks to ZK to find out the location of the master and the root 
region server.
-  1. Client applications talk to RS using '''HRegionInterface''' to read 
from/write to/scan a table, etc.
-  1. Client applications talk to master using '''HMasterInterface''' to 
dynamically create a table, add a column family, and so on.
-  1. Master talks to RS using '''HRegionInterface''' to 
open/close/move/split/flush regions, and so on.
-  1. Master puts data in ZK to store the active master and root region  server 
location, create log splitting tasks, track RS's status, and so  on.
-  1. RS reads data in ZK to track log splitting tasks and update it to  grab a 
task and report status, create a node for the RS so that master  can track the 
status of this RS, track master location  and cluster  status, and so on.
-  1. RS talks to master using '''HMasterRegionInterface''' to report RS load, 
RS fatal errors, and RS start-up.
-  1. Occasionally, RS talks to root region or meta region with 
'''HRegionInterface''' to check the status of a region, create new daughter 
regions in region splitting, and so on.
- 
- === Phasing ===
- ----
- The order of phases is based on priority. They can be done in parallel if 
there are enough resources.
- 
- ==== Phase 0: HBASE-4403: Separate existing APIs into public and private 
interfaces ====
- In order to define which APIs can be changed, we need to separate existing 
APIs into public and private.
- 
- ==== Phase 1: Compatibility between client applications and HBase clusters 
====
- Goal:
- 
-  . To make HBase client applications work properly with HBase clusters of 
different major and minor versions.
- 
- Note: deal with 1, 2, 3 (we get 8 "for free") in the interface graph. These 
tasks can be sub-tasks of 
[[https://issues.apache.org/jira/browse/HBASE-5305|HBASE-5305 Improve 
cross-version compatibility & upgradeability]] or 
[[https://issues.apache.org/jira/browse/HBASE-5306|HBASE-5306 Add support for 
protocol buffer based RPC]].  HBASE-5306 can also include a new RPC engine (the 
latest Hadoop one). This plan focuses on the data encoding/decoding.
- 
- Tasks:
- 
-  * Replace RPC negotiation with extensible PB-based types
-  * Replace root and master address znodes in ZK with PB-enabled types  (goal: 
client's ZK interactions become extensible) (1 in the graph)
-  * Replace existing HRegionInterface calls for read from/write to/scan  a 
table...  with PB-enabled types (goal: client->RS and RS->RS  RPC becomes 
extensible) (2 in the graph)
-  * Replace existing HMasterInterface calls with PB-enabled types (goal: 
client->master RPC becomes extensible) (3 in the graph)
-  * Replace data stored in .META. and -ROOT- tables with PB-enabled  types 
(goal: client can read from old and/or new .META. and -ROOT-  tables) (2 in the 
graph)
- 
- ==== Phase 2: HBase cluster rolling upgrade within same major version ====
- Goal:
- 
-  . To make an HBase cluster able to perform a rolling upgrade within the same major version
- 
- Note: deal with 4, 5, 6, 7 in the interface graph.
- 
- Tasks:
- 
-  * Replace existing HRegionInterface calls for  open/close/move/split/flush 
regions... with PB-enabled types (goal:  master->RS RPC becomes extensible) (4 
in the graph)
-  * Replace Writables used in ZK for communication between RS and  master with 
PB-enabled types (goal: RS and master ZK interactions become  extensible) (5, 6 
in the graph)
-  * Replace existing HMasterRegionInterface calls with PB-enabled types  
(goal: RS->master RPC becomes extensible) (7 in the graph)
-  * Add version information to each server's ZK data (master and RS's)  (goal: 
tracking live version numbers, used for automatic wire-off of new  features in 
persistent data formats until all servers have hit new  version) (5, 6 in the 
graph)
-  * Add version information to RS's on master status UI
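
The version-tracking task above (keeping a new persistent-format feature switched off until every server has reached the new version) might look like the sketch below. It assumes the live version strings have already been read from each server's ZK data. The function names and the dotted version format are assumptions for illustration, not an existing HBase API.

```python
def parse_version(text):
    """Turn a dotted version such as '0.94' into (0, 94) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def feature_enabled(live_versions, required):
    """Return True only when every live server is at or above `required`.

    `live_versions` stands in for the version strings published by the master
    and each RS; an empty list is treated as "unknown", so the feature stays off.
    """
    if not live_versions:
        return False
    needed = parse_version(required)
    return all(parse_version(version) >= needed for version in live_versions)


# Mid-way through a rolling upgrade: one server is still on the old version.
during_upgrade = feature_enabled(["0.94", "0.96"], "0.96")
# After the last server has been restarted on the new version.
after_upgrade = feature_enabled(["0.96", "0.96"], "0.96")
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as "0.100" sorting before "0.94".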
- 
- === Open questions ===
- ----
- '''Technical'''
- 
-  . - How does ZK security and HBase RPC security play into this? Should be 
orthogonal?
-  . - Should pluggable encodings (thrift/avro/pb/writable) be in scope?
-  . - Should async IO servers and clients be in scope or not?
- 
- '''Policy'''
- 
-  . - What is the policy for existing versions (89, 90, 92, 94)? Do we 
support them, or require one major upgrade before they get this story?
-  . - Developers should be able to remove deprecated methods or arguments to  
maintain flexibility, but can't do that within the compatibility window.  What 
should be our compatibility window? 2 years (roughly 4  major versions)?
-  . - What is the ZK version interoperability story?
-  . - What is the HDFS version interoperability story?
-  . - Should architectural-level changes require a major version bump?
- 
- === Appendix ===
- ----
- ==== Future work (out of scope of this document) ====
-  * Possible to extend RPC with meta-data that can enable new functionality 
like RPC tracing
-  * Unify this with Hadoop RPC
-  * Online rolling upgrade of single cluster between major versions:  Today, 
major version upgrades of a single cluster require downtime to  upgrade all 
services in lockstep, while some minor version updates can  be applied via 
the rolling-restart script.  HBase should remain  available through this 
process.
-  * Partial rollout: HBase clusters should allow for some nodes to  "try" a 
newer version for testing purposes.  Today, this is a manual  process and 
possible only within minor versions. (likely possible, would  like to not 
exclude this possibility).
-  * Cluster configuration changes: HBase should remain available as  
configuration changes (hbase-site.xml) or hotfixes are applied. Today,  
rolling-restart script can be used to perform this operation.
-  * Replication across different versions
-  * Disaster recovery: Operators should be able to smoke test a new  version 
during the rolling upgrade before turning on the new features  for general use. 
If anything goes wrong during the rolling upgrade, operators should be able to 
roll back.
-  * ZK wire compatibility: is necessary for RPCs between different  versions 
of HBase and ZK.  Currently ZK supports backward compatibility  for one version 
only. Different versions of HBase could support  different ZK versions.
-  * HDFS wire compatibility
-  * Data format changes may prevent minor or major version roll-back.
-  * Security RPC data compression/encryption changes may prevent minor or 
major version roll-back
-  * Persistent Data is stored in version specific formats in HDFS (xml    
configs, regioninfo, tableinfo).  Some of these data encodings and    formats 
are directly exposed; for example, ZK is not exposed as an API.
- 
- === References ===
- ----
-  . Dapper: http://research.google.com/pubs/pub36356.html
-  . Cross version upgrade and compatibility: 
https://issues.apache.org/jira/browse/HBASE-5305
-  . Add protobuf based RPC to HBase: 
https://issues.apache.org/jira/browse/HBASE-5306
-  . Redo IPC/RPC: https://issues.apache.org/jira/browse/HBASE-2182
-  . HDFS wire compatibility: 
[[https://issues.apache.org/jira/browse/HADOOP-7347|HADOOP-7347]]
-  . HDFS client wire compatibility: 
[[https://issues.apache.org/jira/browse/HDFS-2060|HDFS-2060]]
-  . HDFS data protocol wire compatibility: 
[[https://issues.apache.org/jira/browse/HDFS-2058|HDFS-2058]]
-  . Use protobuf objects in existing IPC: 
[[https://issues.apache.org/jira/browse/HADOOP-7379|HADOOP-7379]]
- 
- === Meeting notes ===
- 
- * [[HBaseWireCompatibility20120221]]
- 