Vincent Poon created PHOENIX-4704:
-------------------------------------

             Summary: Presplit index tables when building asynchronously
                 Key: PHOENIX-4704
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4704
             Project: Phoenix
          Issue Type: Improvement
            Reporter: Vincent Poon


For large data tables with many regions, if we build the index asynchronously 
using the IndexTool, the index table will initially face a hotspot, as all data 
region mappers attempt to write to the sole new index region.  This can 
potentially lead to the index getting disabled if writes to the index table 
time out during this hotspotting.

We can add an optional step to the IndexTool (or perhaps activate it based on 
the number of regions in the data table) that first runs an MR job to gather 
stats on the indexed column values, and then attempts to presplit the index 
table before we run the actual index build MR job.
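
As a rough illustration of the presplit step, the sketch below picks split 
points from a sample of index row keys by taking evenly spaced quantiles. It 
is not existing IndexTool code: the class name SplitPointSketch, the method 
chooseSplitPoints, and the idea of feeding it a flat list of sampled keys are 
all assumptions made for this example. In practice the sample would come from 
the stats-gathering MR job, and the resulting keys would be handed to whatever 
mechanism creates the index table with split keys.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Phoenix: derives presplit keys from a sample.
class SplitPointSketch {

    // Unsigned lexicographic byte comparison, the ordering HBase uses for row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Returns at most numRegions - 1 strictly increasing split keys, chosen at
    // evenly spaced positions in the sorted sample. Duplicate and empty keys
    // are skipped, so a skewed sample yields fewer splits.
    static List<byte[]> chooseSplitPoints(List<byte[]> sampledKeys, int numRegions) {
        List<byte[]> sorted = new ArrayList<>(sampledKeys);
        sorted.sort(SplitPointSketch::compareUnsigned);

        List<byte[]> splits = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            int idx = (int) ((long) i * sorted.size() / numRegions);
            if (idx >= sorted.size()) {
                break; // only reachable when the sample is empty
            }
            byte[] candidate = sorted.get(idx);
            if (candidate.length == 0) {
                continue; // an empty key cannot serve as a region boundary
            }
            if (splits.isEmpty()
                    || compareUnsigned(splits.get(splits.size() - 1), candidate) < 0) {
                splits.add(candidate);
            }
        }
        return splits;
    }
}
```

The target region count is left to the caller; one plausible choice (again an 
assumption) is to derive it from the number of regions in the data table, 
which matches the activation condition suggested above.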



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
