[ https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343350#comment-14343350 ]
Gabriel Reid commented on PHOENIX-1609:
---------------------------------------

Patch looks pretty good to me, just a few pretty minor things I noticed:

* It looks like CsvToKeyValueMapper#loadPreUpsertProcessor and PhoenixConfigurationUtil#loadPreUpsertProcessor are copies of each other, so that can be reduced to a single implementation
* I noticed that the string separator in the ColumnInfo class is changed -- just curious, why is that?
* There appear to be two nearly identical copies of QueryUtil#constructUpsertStatement, although one takes a hint parameter. I think the non-hint version could just delegate to the version with a hint, and that way we can reduce code duplication
* The number of reducers is unnecessarily set to 0 in IndexTool -- this can be removed. It'll be overwritten by the HBase setup of the job anyhow, but having that call there to explicitly set the number of reducers to 0 gives the impression that it's supposed to be doing something
* There are some long option names in IndexTool that contain spaces (e.g. Data table, Index Table). These parameters are meant to be supplied using the --long-parameter-name notation, so I'm not sure what will happen when they contain spaces, but I don't think it'll be good. These should probably be data-table, index-table, etc.

> MR job to populate index tables
> --------------------------------
>
>                 Key: PHOENIX-1609
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1609
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: maghamravikiran
>            Assignee: maghamravikiran
>         Attachments: 0001-PHOENIX-1609-4.0.patch,
>                      0001-PHOENIX-1609-wip.patch, 0001-PHOENIX_1609.patch
>
>
> Often, we need to create new indexes on master tables way after the data
> exists on the master tables. It would be good to have a simple MR job given
> by the phoenix code that users can call to have indexes in sync with the
> master table.
> Users can invoke the MR job using the following command
> hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt
> INDEX_TABLE -columns a,b,c
> Is this ideal?
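
To illustrate the third point in the comment above (having the non-hint version of constructUpsertStatement delegate to the hint version), here is a rough sketch. It is not the Phoenix source: the class name UpsertStatements, the List<String> and String parameter types, and the exact SQL layout (including where the hint comment is placed) are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the two QueryUtil#constructUpsertStatement overloads.
// Names, parameter types and the generated SQL layout are assumptions, not Phoenix code.
final class UpsertStatements {

    private UpsertStatements() {}

    // Non-hint overload: delegates rather than repeating the string building.
    static String constructUpsertStatement(String tableName, List<String> columns) {
        return constructUpsertStatement(tableName, columns, null);
    }

    // Hint overload: the only place where the statement is assembled.
    static String constructUpsertStatement(String tableName, List<String> columns, String hint) {
        if (columns == null || columns.isEmpty()) {
            throw new IllegalArgumentException("At least one column must be provided");
        }
        // A null or empty hint produces no hint clause at all.
        String hintClause = (hint == null || hint.isEmpty()) ? "" : "/*+ " + hint + " */ ";
        // One "?" bind placeholder per column.
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT " + hintClause + "INTO " + tableName
                + " (" + String.join(", ", columns) + ") VALUES (" + placeholders + ")";
    }
}
```

With this shape, any later change to the statement format only has to be made in the three-argument method, and the two-argument method cannot drift out of sync with it.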