Geoffrey Jacoby created PHOENIX-5027:
----------------------------------------

             Summary: PhoenixIndexImportDirectMapper retried mappers can 
succeed without inserting all index data
                 Key: PHOENIX-5027
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5027
             Project: Phoenix
          Issue Type: Bug
            Reporter: Geoffrey Jacoby


On two recent occasions I've rebuilt a large global immutable index by doing a 
DROP/CREATE and ended up with missing index data, though it doesn't happen 
every time. Here's what happened:

1. PhoenixMRJobSubmitter correctly detects the index rebuild is necessary, and 
invokes IndexTool.
2. IndexTool enqueues a MapReduce job using PhoenixIndexImportDirectMapper.
3. Some mappers fail because of timeouts due to heavy splitting on the new 
index table.
4. Those mappers are retried and succeed. The MR job as a whole completes 
successfully.
5. RowCounter and IndexScrutinyTool show millions of rows are missing from the 
index, with keys that imply they were part of the failed mappers.
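
The check in step 5 (matching missing keys to the failed mappers) can be sketched roughly as below. This is only an illustration, not the tooling used for this report: the split list, the mapper ids ("m0", "m1", ...) and the sample keys are hypothetical, and it assumes each mapper covered a contiguous, non-overlapping key range.

```python
from bisect import bisect_right


def attribute_missing_keys(missing_keys, splits, failed_split_ids):
    """Count missing keys per input split and total those in failed splits.

    splits: list of (split_id, start_key, end_key), start inclusive and end
    exclusive, sorted by start_key and non-overlapping (hypothetical layout).
    """
    starts = [start for _, start, _ in splits]
    per_split = {}
    for key in missing_keys:
        # Find the last split whose start key is <= key.
        i = bisect_right(starts, key) - 1
        if i < 0:
            continue  # key sorts before the first split
        split_id, _, end = splits[i]
        if key < end:
            per_split[split_id] = per_split.get(split_id, 0) + 1
    in_failed = sum(n for sid, n in per_split.items() if sid in failed_split_ids)
    return per_split, in_failed


# Hypothetical data: three mapper splits, of which "m1" failed and was retried.
splits = [("m0", "a", "h"), ("m1", "h", "p"), ("m2", "p", "z")]
missing = ["hx", "ja", "kb", "q1"]
per_split, in_failed = attribute_missing_keys(missing, splits, {"m1"})
print(per_split, in_failed)
```

If most missing keys land in splits whose first attempt failed, that supports the suspicion that the retry path is involved.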

Aside from the timestamp glitch I pointed out in PHOENIX-5018, the code in 
PhoenixIndexImportDirectMapper _looks_ idempotent on a rerun, so I've been 
struggling to find the cause of the missing index data. 
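
To make "idempotent on a rerun" concrete, here is a toy model of the expectation, not the actual mapper code. It assumes index writes are plain upserts keyed by (indexed value, data row key), and uses a dict as a stand-in for the index table; `run_mapper` and `fail_after` are made-up names for illustration.

```python
def run_mapper(index_table, data_rows, fail_after=None):
    """Upsert one index row per data row.

    fail_after mimics a timeout: the attempt stops after that many rows
    and reports failure, leaving its partial writes in place.
    """
    for n, (pk, value) in enumerate(data_rows):
        if fail_after is not None and n >= fail_after:
            return False
        # Upsert: writing the same index key twice leaves a single row.
        index_table[(value, pk)] = b""
    return True


rows = [("r1", "x"), ("r2", "y"), ("r3", "x")]
table = {}

first_ok = run_mapper(table, rows, fail_after=2)  # first attempt times out
retry_ok = run_mapper(table, rows)                # retry replays the whole split

expected = {(value, pk): b"" for pk, value in rows}
print(first_ok, retry_ok, table == expected)
```

Under this model a failed attempt followed by a full retry always ends with every index row present, which is why the missing data is surprising.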



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
