[
https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343350#comment-14343350
]
Gabriel Reid commented on PHOENIX-1609:
---------------------------------------
Patch looks pretty good to me. Just a few minor things I noticed:
* It looks like CsvToKeyValueMapper#loadPreUpsertProcessor and
PhoenixConfigurationUtil#loadPreUpsertProcessor are copies of each other, so
that can be reduced to a single implementation
* I noticed that the string separator in the ColumnInfo class is changed --
just curious, why is that?
* There appear to be two nearly identical copies of
QueryUtil#constructUpsertStatement, although one takes a hint parameter. I
think the non-hint version could just delegate to the version with a hint, and
that way we can reduce code duplication
* The number of reducers is unnecessarily set to 0 in IndexTool -- this can be
removed. It'll be overwritten by the HBase setup of the job anyhow, but having
that call there to explicitly set the number of reducers to 0 gives the
impression that it's supposed to be doing something
* There are some long option names in IndexTool that contain spaces (e.g. Data
table, Index Table). These parameters are meant to be supplied using the
--long-parameter-name notation, so I'm not sure what will happen when they
contain spaces, but I don't think it'll be good. These should probably be
data-table, index-table, etc.
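
To illustrate the delegation suggested for QueryUtil#constructUpsertStatement, here is a minimal sketch. It is not the code from the patch: the class name UpsertSketch, the use of plain String column names, and the String hint parameter are all simplifications assumed for illustration, and the real method signatures may differ. The point is only that the non-hint overload forwards to the hint overload, so the statement is built in one place.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; not the actual QueryUtil code from the patch.
public class UpsertSketch {

    // Non-hint overload: delegates so the statement-building logic lives in one place.
    public static String constructUpsertStatement(String tableName, List<String> columns) {
        return constructUpsertStatement(tableName, columns, null);
    }

    // Hint overload: the single implementation. A null or empty hint means "no hint".
    public static String constructUpsertStatement(String tableName, List<String> columns, String hint) {
        if (columns == null || columns.isEmpty()) {
            throw new IllegalArgumentException("At least one column must be provided");
        }
        // Assumed hint placement: directly after the UPSERT keyword.
        String hintClause = (hint == null || hint.isEmpty()) ? "" : "/*+ " + hint + " */ ";
        // One bind placeholder per column.
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT " + hintClause + "INTO " + tableName
                + " (" + String.join(", ", columns) + ") VALUES (" + placeholders + ")";
    }
}
```

With this shape, any later change to how the statement is assembled only has to be made in the three-argument version.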
> MR job to populate index tables
> --------------------------------
>
> Key: PHOENIX-1609
> URL: https://issues.apache.org/jira/browse/PHOENIX-1609
> Project: Phoenix
> Issue Type: New Feature
> Reporter: maghamravikiran
> Assignee: maghamravikiran
> Attachments: 0001-PHOENIX-1609-4.0.patch,
> 0001-PHOENIX-1609-wip.patch, 0001-PHOENIX_1609.patch
>
>
> Often, we need to create new indexes on master tables long after the data
> exists on the master tables. It would be good to have a simple MR job,
> provided by Phoenix, that users can call to bring indexes in sync with the
> master table.
> Users can invoke the MR job using the following command:
> hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt
> INDEX_TABLE -columns a,b,c
> Is this ideal?