[ https://issues.apache.org/jira/browse/HADOOP-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554455 ]
stack commented on HADOOP-2488:
-------------------------------

Hmm... seems like the checksum size defaults to 512 in the 0.15 branch, so that ain't it.

I did a clean checkout of trunk and spent more time on timings, and my original claim that TRUNK is 3 times slower is incorrect: it's *just* ~40% slower.

0.15 branch

{code}
[EMAIL PROTECTED] hbase]$ for i in 1 2 3 ; do ./bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation mapfile 1; done
07/12/26 19:08:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
07/12/26 19:08:11 INFO hbase.PerformanceEvaluation: Writing 100000 rows to hdfs://X.X.X:9123/user/?/performanceevaluation.mapfile
07/12/26 19:08:36 INFO hbase.PerformanceEvaluation: Writing 100000 records took 24159ms (Note: generation of keys and values is done inline and has been seen to consume significant time: e.g. ~30% of cpu time
07/12/26 19:08:36 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
07/12/26 19:08:52 INFO hbase.PerformanceEvaluation: Read 10000
07/12/26 19:09:05 INFO hbase.PerformanceEvaluation: Read 20000
07/12/26 19:09:17 INFO hbase.PerformanceEvaluation: Read 30000
07/12/26 19:09:30 INFO hbase.PerformanceEvaluation: Read 40000
07/12/26 19:09:42 INFO hbase.PerformanceEvaluation: Read 50000
07/12/26 19:09:55 INFO hbase.PerformanceEvaluation: Read 60000
07/12/26 19:10:08 INFO hbase.PerformanceEvaluation: Read 70000
07/12/26 19:10:20 INFO hbase.PerformanceEvaluation: Read 80000
07/12/26 19:10:33 INFO hbase.PerformanceEvaluation: Read 90000
07/12/26 19:10:45 INFO hbase.PerformanceEvaluation: Reading 100000 random records took 129836ms (Note: generation of random key is done in line and takes a significant amount of cpu time: e.g 10-15%
07/12/26 19:10:45 INFO hbase.PerformanceEvaluation: Reading 100000 rows sequentially
07/12/26 19:10:47 INFO hbase.PerformanceEvaluation: Reading 100000 records serially took 1717ms
07/12/26 19:10:48 WARN util.NativeCodeLoader:
Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
07/12/26 19:10:48 INFO hbase.PerformanceEvaluation: Writing 100000 rows to hdfs://X.X.X:9123/user/?/performanceevaluation.mapfile
07/12/26 19:11:12 INFO hbase.PerformanceEvaluation: Writing 100000 records took 23859ms (Note: generation of keys and values is done inline and has been seen to consume significant time: e.g. ~30% of cpu time
07/12/26 19:11:12 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
07/12/26 19:11:25 INFO hbase.PerformanceEvaluation: Read 10000
07/12/26 19:11:39 INFO hbase.PerformanceEvaluation: Read 20000
07/12/26 19:11:51 INFO hbase.PerformanceEvaluation: Read 30000
07/12/26 19:12:03 INFO hbase.PerformanceEvaluation: Read 40000
07/12/26 19:12:16 INFO hbase.PerformanceEvaluation: Read 50000
07/12/26 19:12:29 INFO hbase.PerformanceEvaluation: Read 60000
07/12/26 19:12:41 INFO hbase.PerformanceEvaluation: Read 70000
07/12/26 19:12:54 INFO hbase.PerformanceEvaluation: Read 80000
07/12/26 19:13:06 INFO hbase.PerformanceEvaluation: Read 90000
07/12/26 19:13:19 INFO hbase.PerformanceEvaluation: Reading 100000 random records took 126779ms (Note: generation of random key is done in line and takes a significant amount of cpu time: e.g 10-15%
07/12/26 19:13:19 INFO hbase.PerformanceEvaluation: Reading 100000 rows sequentially
07/12/26 19:13:20 INFO hbase.PerformanceEvaluation: Reading 100000 records serially took 1769ms
07/12/26 19:13:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
07/12/26 19:13:21 INFO hbase.PerformanceEvaluation: Writing 100000 rows to hdfs://X.X.X:9123/user/?/performanceevaluation.mapfile
07/12/26 19:13:45 INFO hbase.PerformanceEvaluation: Writing 100000 records took 23679ms (Note: generation of keys and values is done inline and has been seen to consume significant time: e.g.
~30% of cpu time
07/12/26 19:13:45 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
07/12/26 19:13:58 INFO hbase.PerformanceEvaluation: Read 10000
07/12/26 19:14:11 INFO hbase.PerformanceEvaluation: Read 20000
07/12/26 19:14:23 INFO hbase.PerformanceEvaluation: Read 30000
07/12/26 19:14:36 INFO hbase.PerformanceEvaluation: Read 40000
07/12/26 19:14:48 INFO hbase.PerformanceEvaluation: Read 50000
07/12/26 19:15:01 INFO hbase.PerformanceEvaluation: Read 60000
07/12/26 19:15:13 INFO hbase.PerformanceEvaluation: Read 70000
07/12/26 19:15:26 INFO hbase.PerformanceEvaluation: Read 80000
07/12/26 19:15:38 INFO hbase.PerformanceEvaluation: Read 90000
07/12/26 19:15:51 INFO hbase.PerformanceEvaluation: Reading 100000 random records took 125860ms (Note: generation of random key is done in line and takes a significant amount of cpu time: e.g 10-15%
07/12/26 19:15:51 INFO hbase.PerformanceEvaluation: Reading 100000 rows sequentially
07/12/26 19:15:52 INFO hbase.PerformanceEvaluation: Reading 100000 records serially took 1836ms
{code}

TRUNK

{code}
[EMAIL PROTECTED] hbase]$ for i in 1 2 3 ; do ./bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation mapfile 1; done
07/12/26 18:50:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
07/12/26 18:50:17 INFO hbase.PerformanceEvaluation: Writing 100000 rows to hdfs://X.X.X:9123/user/?/performanceevaluation.mapfile
07/12/26 18:50:41 INFO hbase.PerformanceEvaluation: Writing 100000 records took 24191ms (Note: generation of keys and values is done inline and has been seen to consume significant time: e.g.
~30% of cpu time
07/12/26 18:50:41 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
07/12/26 18:51:02 INFO hbase.PerformanceEvaluation: Read 10000
07/12/26 18:51:20 INFO hbase.PerformanceEvaluation: Read 20000
07/12/26 18:51:38 INFO hbase.PerformanceEvaluation: Read 30000
07/12/26 18:51:55 INFO hbase.PerformanceEvaluation: Read 40000
07/12/26 18:52:13 INFO hbase.PerformanceEvaluation: Read 50000
07/12/26 18:52:29 INFO hbase.PerformanceEvaluation: Read 60000
07/12/26 18:52:47 INFO hbase.PerformanceEvaluation: Read 70000
07/12/26 18:53:04 INFO hbase.PerformanceEvaluation: Read 80000
07/12/26 18:53:21 INFO hbase.PerformanceEvaluation: Read 90000
07/12/26 18:53:39 INFO hbase.PerformanceEvaluation: Reading 100000 random records took 177692ms (Note: generation of random key is done in line and takes a significant amount of cpu time: e.g 10-15%
07/12/26 18:53:39 INFO hbase.PerformanceEvaluation: Reading 100000 rows sequentially
07/12/26 18:53:41 INFO hbase.PerformanceEvaluation: Reading 100000 records serially took 1749ms
07/12/26 18:53:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
07/12/26 18:53:42 INFO hbase.PerformanceEvaluation: Writing 100000 rows to hdfs://X.X.X:9123/user/?/performanceevaluation.mapfile
07/12/26 18:54:06 INFO hbase.PerformanceEvaluation: Writing 100000 records took 23873ms (Note: generation of keys and values is done inline and has been seen to consume significant time: e.g.
~30% of cpu time
07/12/26 18:54:06 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
07/12/26 18:54:25 INFO hbase.PerformanceEvaluation: Read 10000
07/12/26 18:54:44 INFO hbase.PerformanceEvaluation: Read 20000
07/12/26 18:55:01 INFO hbase.PerformanceEvaluation: Read 30000
07/12/26 18:55:17 INFO hbase.PerformanceEvaluation: Read 40000
07/12/26 18:55:33 INFO hbase.PerformanceEvaluation: Read 50000
07/12/26 18:55:51 INFO hbase.PerformanceEvaluation: Read 60000
07/12/26 18:56:08 INFO hbase.PerformanceEvaluation: Read 70000
07/12/26 18:56:25 INFO hbase.PerformanceEvaluation: Read 80000
07/12/26 18:56:41 INFO hbase.PerformanceEvaluation: Read 90000
07/12/26 18:56:57 INFO hbase.PerformanceEvaluation: Reading 100000 random records took 171167ms (Note: generation of random key is done in line and takes a significant amount of cpu time: e.g 10-15%
07/12/26 18:56:57 INFO hbase.PerformanceEvaluation: Reading 100000 rows sequentially
07/12/26 18:56:59 INFO hbase.PerformanceEvaluation: Reading 100000 records serially took 1789ms
07/12/26 18:57:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
07/12/26 18:57:00 INFO hbase.PerformanceEvaluation: Writing 100000 rows to hdfs://X.X.X:9123/user/?/performanceevaluation.mapfile
07/12/26 18:57:25 INFO hbase.PerformanceEvaluation: Writing 100000 records took 24076ms (Note: generation of keys and values is done inline and has been seen to consume significant time: e.g.
~30% of cpu time
07/12/26 18:57:25 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
07/12/26 18:57:43 INFO hbase.PerformanceEvaluation: Read 10000
07/12/26 18:58:01 INFO hbase.PerformanceEvaluation: Read 20000
07/12/26 18:58:18 INFO hbase.PerformanceEvaluation: Read 30000
07/12/26 18:58:35 INFO hbase.PerformanceEvaluation: Read 40000
07/12/26 18:58:52 INFO hbase.PerformanceEvaluation: Read 50000
07/12/26 18:59:09 INFO hbase.PerformanceEvaluation: Read 60000
07/12/26 18:59:26 INFO hbase.PerformanceEvaluation: Read 70000
07/12/26 18:59:42 INFO hbase.PerformanceEvaluation: Read 80000
07/12/26 19:00:00 INFO hbase.PerformanceEvaluation: Read 90000
07/12/26 19:00:16 INFO hbase.PerformanceEvaluation: Reading 100000 random records took 171260ms (Note: generation of random key is done in line and takes a significant amount of cpu time: e.g 10-15%
07/12/26 19:00:16 INFO hbase.PerformanceEvaluation: Reading 100000 rows sequentially
07/12/26 19:00:18 INFO hbase.PerformanceEvaluation: Reading 100000 records serially took 1764ms
{code}

> Random reads in mapfile are 3times slower in TRUNK than they are in 0.15.x
> --------------------------------------------------------------------------
>
> Key: HADOOP-2488
> URL: https://issues.apache.org/jira/browse/HADOOP-2488
> Project: Hadoop
> Issue Type: Bug
> Reporter: stack
>
> Opening a mapfile on hdfs and then doing 100k random reads inside the open
> file takes 3 times longer to complete in TRUNK than same test run on 0.15.x.
> Random read performance is important to hbase.
> Serial reads and writes perform about the same in TRUNK and 0.15.x.
> Below are 3 runs done against 0.15 of the mapfile test that can be found in
> hbase followed by 2 runs done against TRUNK:
> 0.15 branch
> {code}
> [EMAIL PROTECTED] hbase]$ for i in 1 2 3; do ./bin/hbase
> org.apache.hadoop.hbase.PerformanceEvaluation mapfile 1; done
> 07/12/24 18:34:50 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform...
using builtin-java classes where applicable
> 07/12/24 18:34:50 INFO hbase.PerformanceEvaluation: Writing 100000 rows to
> hdfs://XXXX:9123/user/?/performanceevaluation.mapfile
> 07/12/24 18:35:13 INFO hbase.PerformanceEvaluation: Writing 100000 records
> took 23644ms (Note: generation of keys and values is done inline and has been
> seen to consume significant time: e.g. ~30% of cpu time
> 07/12/24 18:35:13 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
> ....
> 07/12/24 18:37:23 INFO hbase.PerformanceEvaluation: Reading 100000 random
> records took 129879ms (Note: generation of random key is done in line and
> takes a significant amount of cpu time: e.g 10-15%
> 07/12/24 18:37:23 INFO hbase.PerformanceEvaluation: Reading 100000 rows
> sequentially
> 07/12/24 18:37:25 INFO hbase.PerformanceEvaluation: Reading 100000 records
> serially took 1832ms
> 07/12/24 18:37:26 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 07/12/24 18:37:26 INFO hbase.PerformanceEvaluation: Writing 100000 rows to
> hdfs://XXXXX:9123/user/?/performanceevaluation.mapfile
> 07/12/24 18:37:50 INFO hbase.PerformanceEvaluation: Writing 100000 records
> took 24188ms (Note: generation of keys and values is done inline and has been
> seen to consume significant time: e.g. ~30% of cpu time
> 07/12/24 18:37:50 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
> ...
> 07/12/24 18:39:58 INFO hbase.PerformanceEvaluation: Reading 100000 random
> records took 127879ms (Note: generation of random key is done in line and
> takes a significant amount of cpu time: e.g 10-15%
> 07/12/24 18:39:58 INFO hbase.PerformanceEvaluation: Reading 100000 rows
> sequentially
> 07/12/24 18:40:00 INFO hbase.PerformanceEvaluation: Reading 100000 records
> serially took 1787ms
> 07/12/24 18:40:01 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform...
using builtin-java classes where applicable
> 07/12/24 18:40:01 INFO hbase.PerformanceEvaluation: Writing 100000 rows to
> hdfs://XXXX:9123/user/?/performanceevaluation.mapfile
> 07/12/24 18:40:24 INFO hbase.PerformanceEvaluation: Writing 100000 records
> took 23832ms (Note: generation of keys and values is done inline and has been
> seen to consume significant time: e.g. ~30% of cpu time
> 07/12/24 18:40:24 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
> ..
> 07/12/24 18:42:31 INFO hbase.PerformanceEvaluation: Reading 100000 random
> records took 126954ms (Note: generation of random key is done in line and
> takes a significant amount of cpu time: e.g 10-15%
> 07/12/24 18:42:31 INFO hbase.PerformanceEvaluation: Reading 100000 rows
> sequentially
> 07/12/24 18:42:33 INFO hbase.PerformanceEvaluation: Reading 100000 records
> serially took 1766ms
> 07/12/24 17:24:25 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 07/12/24 17:24:25 INFO hbase.PerformanceEvaluation: Writing 100000 rows to
> hdfs://XXXXX:9123/user/?/performanceevaluation.mapfile
> 07/12/24 17:24:49 INFO hbase.PerformanceEvaluation: Writing 100000 records
> took 24181ms (Note: generation of keys and values is done inline and has been
> seen to consume significant time: e.g. ~30% of cpu time
> 07/12/24 17:24:49 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
> ...
> 07/12/24 17:26:59 INFO hbase.PerformanceEvaluation: Reading 100000 random
> records took 129564ms (Note: generation of random key is done in line and
> takes a significant amount of cpu time: e.g 10-15%
> 07/12/24 17:26:59 INFO hbase.PerformanceEvaluation: Reading 100000 rows
> sequentially
> 07/12/24 17:27:00 INFO hbase.PerformanceEvaluation: Reading 100000 records
> serially took 1793ms
> {code}
> TRUNK
> {code}
> [EMAIL PROTECTED] hbase]$ ./bin/hbase
> org.apache.hadoop.hbase.PerformanceEvaluation mapfile 1
> 07/12/24 18:02:26 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 07/12/24 18:02:26 INFO hbase.PerformanceEvaluation: Writing 100000 rows to
> hdfs://XXXX3/user/?/performanceevaluation.mapfile
> 07/12/24 18:02:51 INFO hbase.PerformanceEvaluation: Writing 100000 records
> took 25500ms (Note: generation of keys and values is done inline and has been
> seen to consume significant time: e.g. ~30% of cpu time
> 07/12/24 18:02:51 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
> ....
> 07/12/24 18:11:11 INFO hbase.PerformanceEvaluation: Reading 100000 random
> records took 500306ms (Note: generation of random key is done in line and
> takes a significant amount of cpu time: e.g 10-15%
> 07/12/24 18:11:11 INFO hbase.PerformanceEvaluation: Reading 100000 rows
> sequentially
> 07/12/24 18:11:13 INFO hbase.PerformanceEvaluation: Reading 100000 records
> serially took 1940ms
> [EMAIL PROTECTED] hbase]$ ./bin/hbase
> org.apache.hadoop.hbase.PerformanceEvaluation mapfile 1
> 07/12/24 18:12:46 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform...
using builtin-java classes where applicable
> 07/12/24 18:12:46 INFO hbase.PerformanceEvaluation: Writing 100000 rows to
> hdfs://XXXX/user/?/performanceevaluation.mapfile
> 07/12/24 18:13:12 INFO hbase.PerformanceEvaluation: Writing 100000 records
> took 25593ms (Note: generation of keys and values is done inline and has been
> seen to consume significant time: e.g. ~30% of cpu time
> 07/12/24 18:13:12 INFO hbase.PerformanceEvaluation: Reading 100000 random rows
> ...
> 07/12/24 18:22:16 INFO hbase.PerformanceEvaluation: Reading 100000 random
> records took 543992ms (Note: generation of random key is done in line and
> takes a significant amount of cpu time: e.g 10-15%
> 07/12/24 18:22:16 INFO hbase.PerformanceEvaluation: Reading 100000 rows
> sequentially
> 07/12/24 18:22:18 INFO hbase.PerformanceEvaluation: Reading 100000 records
> serially took 1786ms
> {code}
> Above was done on a small hdfs cluster of 4 machines with each a dfs node.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
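
For anyone wanting to double-check the slowdown figures quoted in this message, here is a rough sketch in Python that averages the "Reading 100000 random records took ...ms" timings and takes the TRUNK / 0.15 ratio. The numbers are copied from the logs in this message; the variable names and the `slowdown` helper are made up for illustration and are not part of hbase or hadoop. Comparing plain means is an assumption about how the comparison was made, not necessarily how the figures in the comment were arrived at.

```python
from statistics import mean

# Random-read timings in milliseconds, copied from the
# "Reading 100000 random records took ...ms" log lines in this message.
RERUN_015 = [129836, 126779, 125860]             # 07/12/26 runs, 0.15 branch
RERUN_TRUNK = [177692, 171167, 171260]           # 07/12/26 runs, TRUNK
ORIGINAL_015 = [129879, 127879, 126954, 129564]  # 07/12/24 runs, 0.15 branch
ORIGINAL_TRUNK = [500306, 543992]                # 07/12/24 runs, TRUNK


def slowdown(baseline_ms, candidate_ms):
    """Return mean(candidate) / mean(baseline); values above 1.0 mean slower."""
    return mean(candidate_ms) / mean(baseline_ms)


if __name__ == "__main__":
    for label, base, cand in [
        ("07/12/26 rerun", RERUN_015, RERUN_TRUNK),
        ("07/12/24 original", ORIGINAL_015, ORIGINAL_TRUNK),
    ]:
        ratio = slowdown(base, cand)
        print(f"{label}: TRUNK/0.15 = {ratio:.2f} ({(ratio - 1) * 100:.0f}% slower)")
```

Only the random-read lines are used because the serial read and write timings in the logs are about the same on both branches.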