[ https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323633#comment-14323633 ]
Serhiy Bilousov commented on PHOENIX-1609:
------------------------------------------

I understand why using an MR job would be particularly beneficial, especially on big datasets, but would it make Phoenix require MR to be installed and configured (in other words, depend on MR)? In our case we do not use MR at all and would like to keep it that way.

My 1c would be to at least make it optional, so that in cases like ours we would not have to bring MR to our cluster. It would also be much more attractive if it were possible to just specify a jar to run to build such an index, so that it could be Spark (on YARN), for example. (Just a thought.)

> MR job to populate index tables
> --------------------------------
>
>                 Key: PHOENIX-1609
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1609
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: maghamravikiran
>            Assignee: maghamravikiran
>         Attachments: 0001-PHOENIX_1609.patch
>
>
> Often, we need to create new indexes on master tables way after the data
> exists on the master tables. It would be good to have a simple MR job given
> by the phoenix code that users can call to have indexes in sync with the
> master table.
> Users can invoke the MR job using the following command
> hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt INDEX_TABLE -columns a,b,c
> Is this ideal?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)