[ https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326252#comment-14326252 ]
Lars Hofhansl commented on PHOENIX-1609:
----------------------------------------

Yeah, most of the blocks are in place!

I would start with not having Phoenix trigger the M/R or Spark job. That would require additional (tricky?) setup, and one might not realize one needs that until an index is created.
Of course in the long run it would be *far* more convenient if Phoenix did it all automatically.

Using M/R is only one way to seed an index. Folks might want to write all kinds of jobs to seed an index (maybe even from external data).
Maybe we can later add a location of a script (or a jar as was suggested above) to the index creation command. Failure handling would be tricky, I suppose.

So it seems the only thing that is really missing is creating an index in an unfinished state and letting an external tool finish the job asynchronously.

> MR job to populate index tables
> --------------------------------
>
> Key: PHOENIX-1609
> URL: https://issues.apache.org/jira/browse/PHOENIX-1609
> Project: Phoenix
> Issue Type: New Feature
> Reporter: maghamravikiran
> Assignee: maghamravikiran
> Attachments: 0001-PHOENIX_1609.patch
>
>
> Often, we need to create new indexes on master tables way after the data
> exists on the master tables. It would be good to have a simple MR job given
> by the phoenix code that users can call to have indexes in sync with the
> master table.
> Users can invoke the MR job using the following command
> hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt INDEX_TABLE -columns a,b,c
> Is this ideal?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)