[ https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717359#comment-14717359 ]
James Taylor commented on PHOENIX-2154:
---------------------------------------

Nice work, [~rvaleti]. One question: when is the call to TableRecordWriter.close(TaskAttemptContext context) made? After all mappers have completed, or after each mapper completes? If the former, then we're good.

Regarding the need for a run-foreground option, I'm -0 on it. It's easy enough to have a while/sleep loop that checks whether the index state has been updated to active. I just want to make sure our unit tests exercise the real code path that would be used in production.

For IndexToolIT.testSecondaryIndex(), we should check that the secondary index is valid with respect to the data table. [~tdsilva] has some code that does that - you can basically join the index to the data table and confirm that all the values match.

> Failure of one mapper should not affect other mappers in MR index build
> -----------------------------------------------------------------------
>
>         Key: PHOENIX-2154
>         URL: https://issues.apache.org/jira/browse/PHOENIX-2154
>     Project: Phoenix
>  Issue Type: Bug
>    Reporter: James Taylor
>    Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch,
>              PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch,
>              PHOENIX-2154-_HBase_Frontdoor_API_v1.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done
> in the event of the failure of one of the other mappers. The initial
> population of an index is based on a snapshot in time, so new rows added
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so
> there's really no need to dedup. However, the index rows will have a
> different row key than the data table, so I'm not sure how the HFiles are
> split. Will they potentially overlap, and is this an issue?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
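
For reference, the while/sleep loop mentioned in the comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name IndexStatePoller, the method waitForState, and the state strings "BUILDING" and "ACTIVE" are hypothetical and are not taken from the patch or from Phoenix. How the current index state is actually read (for example, a query against Phoenix metadata) is left to the caller-supplied Supplier and is not shown.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical helper, not part of Phoenix: polls a caller-supplied state
// lookup until it reports the target state or a timeout expires.
final class IndexStatePoller {
    private IndexStatePoller() {}

    /**
     * Calls stateSupplier repeatedly, sleeping pollIntervalMs between calls,
     * until it returns targetState or timeoutMs has elapsed.
     * Returns true if the target state was observed, false on timeout
     * or if the waiting thread was interrupted.
     */
    static boolean waitForState(Supplier<String> stateSupplier, String targetState,
                                long timeoutMs, long pollIntervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (targetState.equals(stateSupplier.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the target state
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated state sequence standing in for a real metadata lookup.
        Iterator<String> states = Arrays.asList("BUILDING", "BUILDING", "ACTIVE").iterator();
        boolean active = waitForState(states::next, "ACTIVE", 5_000, 10);
        System.out.println("index active: " + active);
    }
}
```

A test could call waitForState after submitting the MR job and fail if it returns false, which keeps the test on the same asynchronous path that production would use rather than relying on a separate foreground mode.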