[jira] [Commented] (PHOENIX-1609) MR job to populate index tables

Jan Fernando (JIRA) Wed, 18 Feb 2015 09:22:15 -0800

    [ https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326206#comment-14326206 ]


Jan Fernando commented on PHOENIX-1609:
---------------------------------------

+1 on the ASYNC keyword. For our use cases we have a good sense of data size at 
the outset and want to be explicit about which approach we are taking.

This is important as it lets us plan both the feature and the rollout around 
a single index creation path. If the index build fails, it fails, and we need 
to plan for and handle that (e.g. how our feature behaves, how we retry). I 
wouldn't want index creation to fall back automatically to the current approach 
if the async M/R approach failed, as this could cause surprises on the 
cluster. I'd rather it be a bit more manual, as this makes it easier to 
manage and reason about different failure scenarios.

> MR job to populate index tables 
> --------------------------------
>
>                 Key: PHOENIX-1609
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1609
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: maghamravikiran
>            Assignee: maghamravikiran
>         Attachments: 0001-PHOENIX_1609.patch
>
>
> Often, we need to create new indexes on master tables long after the data 
> exists in them.  It would be good to have a simple MR job, provided by 
> Phoenix, that users can run to bring indexes in sync with the 
> master table. 
> Users can invoke the MR job using the following command:
> hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt INDEX_TABLE -columns a,b,c
> Is this ideal? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
