[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720593#comment-14720593
 ] 

James Taylor commented on PHOENIX-2154:
---------------------------------------

If we don't write the dummy key value in the mapper.cleanup() method, what 
happens? Will the reducer run over all the KeyValues we generated during the 
map phase? Does writing that dummy key value prevent this? Seems somewhat 
weird, but I'm +1 if it works (and is necessary).

[~tdsilva] - would you mind reviewing too?

> Failure of one mapper should not affect other mappers in MR index build
> -----------------------------------------------------------------------
>
>                 Key: PHOENIX-2154
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2154
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Ravi Kishore Valeti
>         Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows added 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap, and if so, is that an issue?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
