Hi Everyone,

At my workplace, we use HBase + Phoenix to run our customer workloads. Most of our Phoenix tables are multi-tenant and we store the tenantID as the leading part of the rowkey. Each tenant belongs to only one HBase cluster. Due to capacity planning, hardware refresh cycles and, most recently, move-to-public-cloud initiatives, we have to migrate tenants from one HBase cluster (source cluster) to another HBase cluster (target cluster). Normally we migrate a lot of tenants (in the 10s of thousands) at a time, so we have to copy a huge amount of data (in TBs) from multiple source clusters to a single target cluster. We have an internal tool which uses the MapReduce framework to copy the data.

Since these tenants have no presence on the target cluster (note that the table is NOT empty, since it holds data for other tenants), their data starts out in a single region and only gets distributed across regions and regionservers through the organic split process. That organic splitting takes a long time, and combined with the distributed nature of the MR framework it causes hotspotting on the target cluster which often lasts for days. This leads to availability issues where CPU and/or disk get saturated on the regionservers ingesting the data, and it also triggers a lot of replication-related alerts (Age of last ship, LogQueue size) which go on for days.
In order to handle this huge influx of data, we should ideally pre-split the table on the target cluster based on the split structure present on the source cluster. If we pre-split and create empty regions with the right region boundaries, that will distribute the load across different regions and regionservers and prevent hotspotting.

Problems with the above approach:
1. Currently we allow pre-splitting only while creating a new table. In our production environment the table already exists (it holds data for other tenants), so we would like to be able to pre-split an existing table for new tenants.
2. Currently we split a given region into just 2 daughter regions. If we have the split-point information from the source cluster and the data for the to-be-migrated tenant is spread across 100 regions on the source side, we would ideally like to create 100 empty regions on the target cluster in one shot.

Trying to get early feedback from the community. Do you all think this is a good idea? Open to other suggestions as well. To make the idea concrete, I have appended a rough sketch below of how far we can get with today's client API.

Thank you,
Rushabh.
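
For concreteness, here is a minimal sketch of what we can do today with the existing client API, assuming made-up cluster configs, table name and tenant prefix: read the source cluster's region start keys that fall inside the tenant's key range, then replay them one by one as split points on the existing target table via Admin.split(). Note that this still splits only one region into two daughters per call and is asynchronous, which is exactly the limitation in point 2 above.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.util.Bytes;

public class TenantPreSplitSketch {

  public static void main(String[] args) throws IOException {
    // Hypothetical configs / names -- in reality these point at the source and
    // target clusters (different hbase.zookeeper.quorum etc.) and come from our tooling.
    Configuration sourceConf = HBaseConfiguration.create();
    Configuration targetConf = HBaseConfiguration.create();
    TableName table = TableName.valueOf("MY_PHOENIX_TABLE");
    byte[] tenantPrefix = Bytes.toBytes("tenant123"); // tenantID is the leading part of the rowkey

    List<byte[]> splitPoints = new ArrayList<>();

    // 1. Collect the source cluster's region boundaries that fall inside the tenant's key range.
    try (Connection sourceConn = ConnectionFactory.createConnection(sourceConf);
         RegionLocator locator = sourceConn.getRegionLocator(table)) {
      for (byte[] startKey : locator.getStartKeys()) {
        if (startKey.length > 0 && Bytes.startsWith(startKey, tenantPrefix)) {
          splitPoints.add(startKey);
        }
      }
    }

    // 2. Replay those boundaries as split points on the already existing, non-empty
    //    target table. Admin.split() only produces two daughters per call and runs
    //    asynchronously, so a real tool has to wait for each split to finish before
    //    issuing the next one -- which is why this is slow and why a bulk
    //    "split into N regions" operation would help.
    try (Connection targetConn = ConnectionFactory.createConnection(targetConf);
         Admin admin = targetConn.getAdmin()) {
      for (byte[] splitPoint : splitPoints) {
        admin.split(table, splitPoint);
        // In practice: poll until the split completes before moving on.
      }
    }
  }
}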