If you are paying for CDH then just upgrade via Cloudera Manager. If you are not paying for it then I think you will find it a huge problem.
The upgrade may have to be done in two steps, first to a version 6 release and then to a newer version, to get to a suitable HBase/Hadoop version. We are currently on CDH 6.3.2, but the HBase is an extremely useless version (2.1.0), and we are not in the business of generating income from the data, so we cannot justify the exorbitant cost per node that Cloudera are asking for later versions.

-----Original Message-----
From: Bryan Beaudreault <[email protected]>
Sent: Wednesday, May 19, 2021 2:49 PM
To: [email protected]
Subject: Upgrading cdh5.16.2 to apache hbase 2.4 using replication

We are running about 40 HBase clusters, with over 5000 regionservers total. These are all running cdh5.16.2. We also have thousands of clients (from APIs to Kafka workers to Hadoop jobs, etc.) hitting these various clusters, also running cdh5.16.2.

We are starting to plan an upgrade to HBase 2.x and Hadoop 3.x. I've read through the docs on https://hbase.apache.org/book.html#_upgrade_paths, and am starting to plan our approach. More than a few seconds of downtime is not an option, but a rolling upgrade also seems risky (if not impossible for our version).

One thought I had is whether replication is compatible between these two versions. If so, we would probably consider swapping onto upgraded clusters using backup/restore + replication. If we were to go this route, we'd probably want to consider bi-directional replication so that we can roll back to the old cluster if there's a regression.

Does anyone have any experience with this approach? Is the replication protocol compatible across these versions? Any concerns, tips, or other considerations to keep in mind? We do the backup/restore + replication approach pretty regularly to move tables between clusters.

Thanks!
