Rahul Kumar created PHOENIX-7799:
------------------------------------

             Summary: Coalesce splits by region server to avoid hotspotting from concurrent mappers
                 Key: PHOENIX-7799
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7799
             Project: Phoenix
          Issue Type: Sub-task
    Affects Versions: 5.1.3, 5.2.0
            Reporter: Rahul Kumar


PhoenixSyncTableTool creates one MapReduce InputSplit per HBase region, causing 
each split to spawn its own mapper task. When multiple regions reside on the 
same RegionServer, concurrent mappers hit the same server simultaneously, 
leading to hotspotting and resource contention (noisy neighbor problem). This 
degrades performance and can cause timeouts.

Implement locality-aware split coalescing that groups all InputSplits from the 
same RegionServer into a single coalesced split. Each RegionServer is then 
handled by exactly one mapper, which processes that server's regions 
sequentially, so the job no longer sends concurrent mapper requests to the same 
server. The feature will be controlled by the configuration property 
phoenix.sync.table.split.coalescing (default: false). For a table with 100 
regions across 5 RegionServers, this reduces the mapper count from 100 to 5 
while maintaining data locality.
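
The grouping step described above could be sketched roughly as follows. This is 
an illustration only, not the PhoenixSyncTableTool implementation: RegionSplit 
and coalesceByServer are hypothetical names, and a real version would work on 
Hadoop InputSplit objects (using getLocations() for the host) rather than on 
this stand-in class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical.
final class SplitCoalescingSketch {

    // Stand-in for a per-region InputSplit: a region name plus the host name
    // of the RegionServer assumed to be serving that region.
    static final class RegionSplit {
        final String regionName;
        final String regionServer;

        RegionSplit(String regionName, String regionServer) {
            this.regionName = regionName;
            this.regionServer = regionServer;
        }
    }

    // Groups per-region splits by RegionServer, preserving input order.
    // Each map entry models one coalesced split, i.e. the work of one mapper.
    static Map<String, List<RegionSplit>> coalesceByServer(List<RegionSplit> splits) {
        Map<String, List<RegionSplit>> byServer = new LinkedHashMap<>();
        for (RegionSplit split : splits) {
            byServer.computeIfAbsent(split.regionServer, k -> new ArrayList<>()).add(split);
        }
        return byServer;
    }

    public static void main(String[] args) {
        // 100 regions spread round-robin over 5 servers, matching the example
        // in the description.
        List<RegionSplit> splits = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            splits.add(new RegionSplit("region-" + i, "rs-" + (i % 5)));
        }
        Map<String, List<RegionSplit>> coalesced = coalesceByServer(splits);
        System.out.println("mappers before: " + splits.size());
        System.out.println("mappers after:  " + coalesced.size());
    }
}
```

In this sketch the mapper that receives a coalesced split would iterate over its 
regions one at a time, which is what keeps requests to a given server 
sequential.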



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
