[ https://issues.apache.org/jira/browse/PHOENIX-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342846#comment-15342846 ]
Josh Elser commented on PHOENIX-2209:
-------------------------------------

Pinged Rajesh in private chat to let him know that there's a compilation issue on 4.x-HBase-0.98. He's looking at it.

> Building Local Index Asynchronously via IndexTool fails to populate index table
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2209
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2209
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.5.0
>         Environment: CDH: 5.4.4
> HBase: 1.0.0
> Phoenix: 4.5.0 (https://github.com/SiftScience/phoenix/tree/4.5-HBase-1.0) with hacks for CDH compatibility.
>            Reporter: Keren Gu
>            Assignee: Rajeshbabu Chintaguntla
>              Labels: IndexTool, LocalIndex, index
>             Fix For: 4.8.0
>
>         Attachments: PHOENIX-2209.patch, PHOENIX-2209_v2.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Using the Asynchronous Index population tool to create a local index (of 1 column) on tables with 10 columns, and 65M, 250M, 340M, and 1.3B rows respectively.
> Table schema as follows (with generic column names):
> {quote}
> CREATE TABLE PH_SOJU_SHORT (
> id INT PRIMARY KEY,
> c2 VARCHAR NULL,
> c3 VARCHAR NULL,
> c4 VARCHAR NULL,
> c5 VARCHAR NULL,
> c6 VARCHAR NULL,
> c7 DOUBLE NULL,
> c8 VARCHAR NULL,
> c9 VARCHAR NULL,
> c10 BIGINT NULL
> )
> {quote}
> Example command used (for the 65M row table):
> {quote}
> 0: jdbc:phoenix:localhost> create local index LC_INDEX_SOJU_EVAL_FN on PH_SOJU_SHORT(C4) async;
> {quote}
> And the MR job was started with the command:
> {quote}
> $ hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table PH_SOJU_SHORT --index-table LC_INDEX_SOJU_EVAL_FN --output-path LC_INDEX_SOJU_EVAL_FN_HFILE
> {quote}
> The IndexTool MR jobs finished in 18min, 77min, 77min, and 2hr 34min respectively, but all index tables were empty.
> For the table with 65M rows, IndexTool had 12 mappers and reducers. MR Counters show Map input and output records = 65M, Reduce input and output records = 65M. PhoenixJobCounters input and output records are all 65M.
> IndexTool Reducer Log tail:
> {quote}
> ...
> 2015-08-25 00:26:44,687 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 32 segments left of total size: 22805636866 bytes
> 2015-08-25 00:26:44,693 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 2015-08-25 00:26:44,765 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> 2015-08-25 00:26:44,908 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
> 2015-08-25 00:26:45,060 INFO [main] org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:36:43,880 INFO [main] org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2: Writer=hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/_temporary/attempt_1440094483400_5974_r_000000_0/0/496b926ad624438fa08626ac213d0f92, wrote=10737418236
> 2015-08-25 00:36:45,967 INFO [main] org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:38:43,095 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1440094483400_5974_r_000000_0 is done. And is in the process of committing
> 2015-08-25 00:38:43,123 INFO [main] org.apache.hadoop.mapred.Task: Task attempt_1440094483400_5974_r_000000_0 is allowed to commit now
> 2015-08-25 00:38:43,132 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1440094483400_5974_r_000000_0' to hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/task_1440094483400_5974_r_000000
> 2015-08-25 00:38:43,158 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1440094483400_5974_r_000000_0' done.
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)