[ https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654557#comment-14654557 ]
Thomas D'Silva commented on PHOENIX-1609:
-----------------------------------------

[~maghamraviki...@gmail.com] [~jamestaylor]

I am trying to compare the performance of the map reduce index build with the regular UPSERT SELECT based index build. On a 1 billion row table with 19 columns, the regular index build takes 8.5 hours, while the map reduce index build takes ~23 hours. Do you know of any special config settings I could use to speed up the MR index build?

> MR job to populate index tables
> --------------------------------
>
>                 Key: PHOENIX-1609
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1609
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: maghamravikiran
>            Assignee: maghamravikiran
>             Fix For: 5.0.0, 4.4.0
>
>         Attachments: 0001-PHOENIX-1609-4.0.patch,
> 0001-PHOENIX-1609-4.0.patch, 0001-PHOENIX-1609-wip.patch,
> 0001-PHOENIX_1609.patch, PHOENIX-1609-master.patch
>
>
> Often, we need to create new indexes on master tables way after the data
> exists on the master tables. It would be good to have a simple MR job given
> by the phoenix code that users can call to have indexes in sync with the
> master table.
> Users can invoke the MR job using the following command
> hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt
> INDEX_TABLE -columns a,b,c
> Is this ideal?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)