Rahul Kumar created PHOENIX-7799:
------------------------------------
Summary: Coalesce splits by region server to avoid hotspotting
from concurrent mappers
Key: PHOENIX-7799
URL: https://issues.apache.org/jira/browse/PHOENIX-7799
Project: Phoenix
Issue Type: Sub-task
Affects Versions: 5.1.3, 5.2.0
Reporter: Rahul Kumar
PhoenixSyncTableTool creates one MapReduce InputSplit per HBase region, and
each split is processed by its own mapper task. When several regions reside on
the same RegionServer, their mappers hit that server concurrently, leading to
hotspotting and resource contention (the noisy-neighbor problem). This degrades
performance and can cause timeouts.
Implement locality-aware split coalescing that groups all InputSplits from the
same RegionServer into a single coalesced split. A single mapper then processes
that server's regions sequentially, so the job no longer issues concurrent
requests to the same server. The feature will be controlled by the configuration
property phoenix.sync.table.split.coalescing (default: false). For a table with
100 regions across 5 RegionServers, each hosting at least one region, this
reduces the mapper count from 100 to 5, removing contention among the job's own
mappers while maintaining data locality.
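
The grouping step described above can be illustrated with a minimal,
dependency-free sketch. It is not the PhoenixSyncTableTool implementation:
RegionSplit, SplitCoalescer, coalesceByServer and exampleSplits are
hypothetical names introduced here, and the sketch assumes each split reports
exactly one hosting server. A real implementation would presumably wrap the
grouped splits in a composite InputSplit and read server locations from HBase.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-region InputSplit (not a Phoenix or Hadoop class).
final class RegionSplit {
    final String regionName;
    final String serverHost; // assumption: one hosting RegionServer per split

    RegionSplit(String regionName, String serverHost) {
        this.regionName = regionName;
        this.serverHost = serverHost;
    }
}

// Hypothetical helper that groups per-region splits by hosting server.
final class SplitCoalescer {
    private SplitCoalescer() {}

    // Returns one entry per server; each value lists that server's region splits
    // in their original order, to be processed sequentially by a single mapper.
    static Map<String, List<RegionSplit>> coalesceByServer(List<RegionSplit> splits) {
        Map<String, List<RegionSplit>> byServer = new LinkedHashMap<>();
        for (RegionSplit split : splits) {
            byServer.computeIfAbsent(split.serverHost, host -> new ArrayList<>()).add(split);
        }
        return byServer;
    }

    // Builds synthetic splits with regions assigned to servers round-robin.
    static List<RegionSplit> exampleSplits(int regions, int servers) {
        List<RegionSplit> splits = new ArrayList<>();
        for (int i = 0; i < regions; i++) {
            splits.add(new RegionSplit("region-" + i, "rs-" + (i % servers)));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<RegionSplit> splits = exampleSplits(100, 5);
        Map<String, List<RegionSplit>> coalesced = coalesceByServer(splits);
        System.out.println(splits.size() + " splits -> " + coalesced.size() + " coalesced splits");
    }
}
```

With the 100-region, 5-server example from the description, the sketch yields
5 groups of 20 regions each, i.e. one mapper per RegionServer.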