Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hbase/HBaseWireCompatibility" page has been changed by JimmyXiang:
http://wiki.apache.org/hadoop/Hbase/HBaseWireCompatibility?action=diff&rev1=1&rev2=2

+ <<TableOfContents(5)>>
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Glossary|Glossary]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-MotivationandGoals|Motivation and Goals]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Requirements|Requirements]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Design|Design]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Wireformat|Wire format]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-RPC|RPC]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Interfaces|Interfaces]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Phasing|Phasing]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Phase0:HBASE4403:SeparateexistingAPIsintopublicandprivateinterfaces|Phase 0: HBASE-4403: Separate existing APIs into public and private interfaces]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Phase1:CompatibilitybetweenclientapplicationsandHBaseclusters|Phase 1: Compatibility between client applications and HBase clusters]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Phase2:HBaseclusterrollingupgradewithinsamemajorversion|Phase 2: HBase cluster rolling upgrade within same major version]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Openquestions|Open questions]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Appendix|Appendix]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-Futurework(outofscopeofthisdocument)|Future work (out of scope of this document)]]
- * [[https://wiki.cloudera.com/display/engineering/HBase+wire+compatibility+plan#HBasewirecompatibilityplan-References|References]]
=== Glossary ===
----
- ||<tableclass="confluenceTable"class="confluenceTh">Term||<class="confluenceTh">Definition||
+ ||<tableclass="confluenceTable"class="confluenceTh">Term ||<class="confluenceTh">Definition ||
- ||<class="confluenceTd">Major version||<class="confluenceTd">First number in the version, to the left of the period. e.g. in version 2.3, the major version is "2"||
+ ||<class="confluenceTd">Major version ||<class="confluenceTd">First number in the version, to the left of the period. e.g. in version 2.3, the major version is "2" ||
- ||<class="confluenceTd">Minor version||<class="confluenceTd">Second number in the version, immediately to the right of the period. e.g. in version 2.3, the minor version is "3"||
+ ||<class="confluenceTd">Minor version ||<class="confluenceTd">Second number in the version, immediately to the right of the period. e.g. in version 2.3, the minor version is "3" ||
- ||<class="confluenceTd">Compatibility window||<class="confluenceTd">Range of consecutive major versions where compatibility between two entities is guaranteed||
+ ||<class="confluenceTd">Compatibility window ||<class="confluenceTd">Range of consecutive major versions where compatibility between two entities is guaranteed ||
+
+
=== Motivation and Goals ===
@@ -30, +18 @@
'''Operations'''
- * * '''Decouple client applications from HBase''': HBase clients are part of a separate application and often administrated separately from the HBase cluster. Today, the application and cluster must be upgraded in lockstep. Clients should interoperate with HBase RS's and masters that are running different major versions. This allows for the following operational improvements:
+ * '''Decouple client applications from HBase''': HBase clients are part of a separate application and often administrated separately from the HBase cluster. Today, the application and cluster must be upgraded in lockstep. Clients should interoperate with HBase RS's and masters that are running different major versions. This allows for the following operational improvements:
- * Multiple pods: HBase clients may write to multiple HBase clusters / pods (sharded clusters) and the shards may be upgraded separately.
+ * Multiple pods: HBase clients may write to multiple HBase clusters / pods (sharded clusters) and the shards may be upgraded separately.
- * Application-level replication: HBase installation with active and standby clusters should be able to upgrade, and HBase clients can work with both.
+ * Application-level replication: HBase installation with active and standby clusters should be able to upgrade, and HBase clients can work with both.
- * '''No downtime for minor version upgrades'''
+ * '''No downtime for minor version upgrades'''
'''Development'''
- * * '''Simplified support for bugfixes, upgrades, and testing''' - no need for specialized migration scripts
+ * '''Simplified support for bugfixes, upgrades, and testing''' - no need for specialized migration scripts
- * '''Higher developer cadence in the community''' - can add functionality and not worry about breaking version compatibility
+ * '''Higher developer cadence in the community''' - can add functionality and not worry about breaking version compatibility
=== Requirements ===
----
@@ -51, +39 @@
=== Design ===
----
- ===== Wire format =====
+ ==== Wire format ====
Protobuf vs. Thrift vs. Avro
We propose to use protobuf for wire format. The primary reason is that the current HBase RPC engine (see HADOOP-7379) supports protobuf-encoded data, and protobuf is relatively more stable than the alternatives. In addition, Hadoop RPC uses protobuf, and the community may eventually want Hadoop and HBase to share the same RPC.
We also propose to change the HBase RPC connection header from Writable to protobuf so that the HBase RPC is programming language agnostic.
- ===== RPC =====
+ ==== RPC ====
Currently, the HBase RPC engine does not support async IO or protocol negotiation. These features don't impact compatibility and therefore can evolve separately and are not in scope for this document.
- ===== Interfaces =====
+ ==== Interfaces ====
- {{https://wiki.cloudera.com/download/temp/graphviz8261645797990359928.png?contentType=image/png&delete=true}} 1. Client talks to ZK to find out the location of the master and the root region server.
+ {{https://wiki.cloudera.com/download/temp/graphviz8261645797990359928.png?contentType=image/png&delete=true}}
+
+ 1. Client talks to ZK to find out the location of the master and the root region server.
1. Client applications talk to RS using '''HRegionInterface''' to read from/write to/scan a table, etc.
1. Client applications talk to master using '''HMasterInterface''' to dynamically create a table, add a column family, and so on.
1. Master talks to RS using '''HRegionInterface''' to open/close/move/split/flush regions, and so on.
@@ -75, +65 @@
----
The order of phases is based on priority. They can be done in parallel if there are enough resources.
- ===== Phase 0: HBASE-4403: Separate existing APIs into public and private interfaces =====
+ ==== Phase 0: HBASE-4403: Separate existing APIs into public and private interfaces ====
In order to define which APIs can be changed, we need to separate existing APIs into public and private.
- ===== Phase 1: Compatibility between client applications and HBase clusters =====
+ ==== Phase 1: Compatibility between client applications and HBase clusters ====
Goal:
+
- To make HBase client applications work properly with HBase clusters of different major and minor versions.
+ . To make HBase client applications work properly with HBase clusters of different major and minor versions.
Note: deal with 1, 2, 3 (we get 8 "for free") in the interface graph.
@@ -92, +83 @@
* Replace existing HMasterInterface calls with PB-enabled types (goal: client->master RPC becomes extensible) (3 in the graph)
* Replace data stored in .META. and -ROOT- tables with PB-enabled types (goal: client can read from old and/or new .META. and -ROOT- tables) (2 in the graph)
- ===== Phase 2: HBase cluster rolling upgrade within same major version =====
+ ==== Phase 2: HBase cluster rolling upgrade within same major version ====
Goal:
+
- To make an HBase cluster able to roll upgrade within the same major version
+ . To make an HBase cluster able to roll upgrade within the same major version
Note: deal with 4, 5, 6, 7 in the interface graph.
@@ -109, +101 @@
=== Open questions ===
----
'''Technical'''
+
- - How does ZK security and HBase RPC security play into this -- (should be orthogonal, but please make this clearer).
+ . - How does ZK security and HBase RPC security play into this? Should be orthogonal?
- - Should pluggable encodings (thrift/avro/pb/writable) be in scope?
+ . - Should pluggable encodings (thrift/avro/pb/writable) be in scope?
- - Should async IO servers and clients be in scope or not?
+ . - Should async IO servers and clients be in scope or not?
'''Policy'''
+
- - What is the policy for existing versions (89, 90, 92, 94) -- do we support them or require one major upgrade before they get this story?
+ . - What is the policy for existing versions (89, 90, 92, 94) -- do we support them or require one major upgrade before they get this story?
- - Developers should be able to remove deprecated methods or arguments to maintain flexibility, but can't do that within the compatibility window. What should be our compatibility window? 2 years (roughly 4 major versions)?
+ . - Developers should be able to remove deprecated methods or arguments to maintain flexibility, but can't do that within the compatibility window. What should be our compatibility window? 2 years (roughly 4 major versions)?
- - What is the ZK version interoperability story?
+ . - What is the ZK version interoperability story?
+ . - What is the HDFS version interoperability story?
- - Should architectural-level changes require a major version bump?
+ . - Should architectural-level changes require a major version bump?
=== Appendix ===
----
- ===== Future work (out of scope of this document) =====
+ ==== Future work (out of scope of this document) ====
* Possible to extend RPC with meta-data that can enable new functionality like RPC tracing
* Unify this with Hadoop RPC
* Online rolling upgrade of single cluster between major versions: Today, major version upgrades of a single cluster require downtime to upgrade all services in lockstep, while some minor version updates can be upgraded via the rolling-restart script. HBase should remain available through this process.
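
The glossary terms (major version, minor version, compatibility window) and the open question about a window of roughly 4 major versions can be made concrete with a small sketch. Everything here is an illustrative assumption and not part of the page or of HBase: the names `Version`, `parse_version` and `within_window` are hypothetical, and the rule "a window of N consecutive major versions means the major numbers differ by less than N" is only one possible reading of the glossary definition.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical holder for the two numbers named in the glossary."""
    major: int  # first number, left of the period
    minor: int  # second number, immediately right of the period


def parse_version(text: str) -> Version:
    """Split 'major.minor' (e.g. '2.3'); any further components are ignored."""
    major, minor = text.split(".")[:2]
    return Version(int(major), int(minor))


def within_window(client: str, server: str, window: int) -> bool:
    """Assumed policy: compatible when both major versions fit inside a
    range of `window` consecutive major versions."""
    c, s = parse_version(client), parse_version(server)
    return abs(c.major - s.major) < window
```

Under this assumed policy, a client at 2.3 and a cluster at 5.0 fall inside a window of 4 major versions (2, 3, 4, 5), while a cluster at 6.0 falls outside it. Minor versions do not affect the check.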
@@ -137, +132 @@
=== References ===
----
- Dapper: http://research.google.com/pubs/pub36356.html
+ . Dapper: http://research.google.com/pubs/pub36356.html
- Cross version upgrade and compatibility: https://issues.apache.org/jira/browse/HBASE-5305
+ . Cross version upgrade and compatibility: https://issues.apache.org/jira/browse/HBASE-5305
- Redo IPC/RPC: https://issues.apache.org/jira/browse/HBASE-2182
+ . Redo IPC/RPC: https://issues.apache.org/jira/browse/HBASE-2182
- HDFS wire compatibility: [[https://issues.apache.org/jira/browse/HADOOP-7347|HADOOP-7347]]
+ . HDFS wire compatibility: [[https://issues.apache.org/jira/browse/HADOOP-7347|HADOOP-7347]]
- HDFS client wire compatibility: [[https://issues.apache.org/jira/browse/HDFS-2060|HDFS-2060]]
+ . HDFS client wire compatibility: [[https://issues.apache.org/jira/browse/HDFS-2060|HDFS-2060]]
- HDFS data protocol wire compatibility: [[https://issues.apache.org/jira/browse/HDFS-2058|HDFS-2058]]
+ . HDFS data protocol wire compatibility: [[https://issues.apache.org/jira/browse/HDFS-2058|HDFS-2058]]
- Use protobuf objects in existing IPC: [[https://issues.apache.org/jira/browse/HADOOP-7379|HADOOP-7379]]
+ . Use protobuf objects in existing IPC: [[https://issues.apache.org/jira/browse/HADOOP-7379|HADOOP-7379]]