My main thought is that a migration like this should use bulk loading,
which should be relatively easy given you already use MR
(HFileOutputFormat2). Bulk loading alone doesn't solve the region-splitting
problem, though. With Admin.splitRegion, you can specify a split point, and
you can use that to iteratively add a bunch of regions wherever you need
them in the keyspace. Yes, it's two at a time, but it should still be quick
enough in the grand scheme of a large migration. Adding a splitRegion
method that takes byte[][] for multiple split points would be a nice UX
improvement, but not strictly necessary.
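
To make the iterative approach concrete, here is a rough sketch (untested;
it assumes the HBase 2.x Admin API, and the table name, tenant prefix, and
configurations are placeholders): collect the source table's region start
keys that fall under the tenant's rowkey prefix, then split the target
table at each of those points, waiting for each split to complete before
issuing the next one.

import java.util.List;
import java.util.stream.Collectors;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionInfo;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitForTenant {
  public static void main(String[] args) throws Exception {
    // Placeholder identifiers for illustration only.
    TableName table = TableName.valueOf("MY_PHOENIX_TABLE");
    byte[] tenantPrefix = Bytes.toBytes("tenant123");

    Configuration sourceConf = HBaseConfiguration.create(); // points at source cluster
    Configuration targetConf = HBaseConfiguration.create(); // points at target cluster

    try (Connection source = ConnectionFactory.createConnection(sourceConf);
         Connection target = ConnectionFactory.createConnection(targetConf);
         Admin sourceAdmin = source.getAdmin();
         Admin targetAdmin = target.getAdmin()) {

      // Source-side region start keys inside the tenant's rowkey range
      // become the split points for the target table.
      List<byte[]> splitPoints = sourceAdmin.getRegions(table).stream()
          .map(RegionInfo::getStartKey)
          .filter(k -> k.length > 0 && Bytes.startsWith(k, tenantPrefix))
          .sorted(Bytes.BYTES_COMPARATOR)
          .collect(Collectors.toList());

      for (byte[] splitPoint : splitPoints) {
        int before = targetAdmin.getRegions(table).size();
        // Splits whichever target region currently contains splitPoint.
        targetAdmin.split(table, splitPoint);
        // Splits are asynchronous; wait for this one to finish so we never
        // ask the same parent region to split twice concurrently. A real
        // tool would add a timeout and error handling here.
        while (targetAdmin.getRegions(table).size() <= before) {
          Thread.sleep(1000);
        }
        System.out.println("Split at " + Bytes.toStringBinary(splitPoint));
      }
    }
  }
}

Once the target table has the boundaries in place, the HFiles your existing
HFileOutputFormat2 job writes can be handed to the bulk-load tooling
(BulkLoadHFiles, or LoadIncrementalHFiles on older versions) instead of
going through the normal write path.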

On Fri, Jan 26, 2024 at 12:10 PM Rushabh Shah
<rushabh.s...@salesforce.com.invalid> wrote:

> Hi Everyone,
> At my workplace, we use HBase + Phoenix to run our customer workloads. Most
> of our Phoenix tables are multi-tenant, and we store the tenantID as the
> leading part of the rowkey. Each tenant belongs to only one HBase cluster.
> Due to capacity planning, hardware refresh cycles, and most recently our
> move to public cloud, we have to migrate tenants from one HBase cluster
> (the source cluster) to another HBase cluster (the target cluster).
> Normally we migrate a large number of tenants (tens of thousands) at a
> time, and hence we have to copy a huge amount of data (TBs) from multiple
> source clusters to a single target cluster. We have an internal tool that
> uses the MapReduce framework to copy the data. Since these tenants have no
> presence on the target cluster (note that the table is NOT empty, since it
> already holds data for other tenants), their data starts out in a single
> region, and only through the organic split process does it get distributed
> across different regions and regionservers. But organic splitting takes a
> long time, and because the MR framework is distributed, the copy causes
> hotspotting on the target cluster that often lasts for days. This leads to
> availability issues, with CPU and/or disk saturation on the regionservers
> ingesting the data, and it also triggers a lot of replication alerts (age
> of last ship, LogQueue size) that go on for days.
>
> In order to handle the huge influx of data, we should ideally pre-split
> the table on the target based on the split structure present on the source
> cluster. If we pre-split and create empty regions with the right region
> boundaries, it will help distribute the load across different regions and
> regionservers and will prevent hotspotting.
>
> Problems with the above approach:
> 1. Currently we allow pre-splitting only while creating a new table. But
> in our production environment, the table already exists for other tenants,
> so we would like to pre-split an existing table for new tenants.
> 2. Currently we split a given region into just two daughter regions. But
> if we have the split-point information from the source cluster and the
> data for the to-be-migrated tenant is spread across 100 regions on the
> source side, we would ideally like to create 100 empty regions on the
> target cluster.
>
> Trying to get early feedback from the community. Do you all think this is a
> good idea? Open to other suggestions also.
>
>
> Thank you,
> Rushabh.
>
