[ https://issues.apache.org/jira/browse/HADOOP-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540922 ]
Clint Morgan commented on HADOOP-2161:
--------------------------------------

I am having trouble getting the problem I raised in the last comment to occur in the stock PerformanceEvaluation. However, if I drop the number of rows in the test by an order of magnitude, by changing line 82 to

    private static final int ONE_GB = 1024 * 1024 * 100;

I can see the problem. Around the 50,000th read, the read times begin to rise gradually from 5 ms, reaching about 200 ms by the 60,000th read. If I take a trace during this period, I see the abnormal behavior with getClosest() and next() that I described in the comment above. So something fishy is still going on, possibly in HStoreFile.HalfMapFileReader, related to having two mapfiles.

> getRow() is orders of magnitude slower than get(), even on rows with one
> column
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-2161
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2161
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>    Affects Versions: 0.16.0
>        Environment: latest from trunk
>           Reporter: Clint Morgan
>        Attachments: HADOOP-2161-2.patch, PerformanceEvaluation-patch.txt
>
>
> HTable.getRow(Text) is several orders of magnitude slower than
> HTable.get(Text, Text), even on rows with a single column.
> This problem can be observed with the attached patch to
> PerformanceEvaluation.java, which changes SequentialRead to use getRow()
> and prints the time for each read.
> The test can then be run with:
> bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialRead 1
> On my laptop, the original test (using get()) produces reads on the order of
> 5-20 ms. Using getRow(), the reads take 50-2000 ms.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
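[Editor's sketch] The per-read timing methodology described above (time each sequential read and print it, so that a gradual latency climb like 5 ms → 200 ms is visible) can be illustrated with a minimal, self-contained Java harness. The `Reader` interface and `ReadTimer` class here are hypothetical stand-ins; the actual patch modifies PerformanceEvaluation.java and calls HTable.getRow().

```java
// Minimal sketch of per-read latency measurement, as done by the
// patched PerformanceEvaluation. "Reader" is a hypothetical stand-in
// for a call such as HTable.getRow(row).
public class ReadTimer {

    /** Hypothetical stand-in for one HBase read operation. */
    interface Reader {
        void read(int row);
    }

    /**
     * Times each of {@code rows} sequential reads and returns the
     * per-read latency in milliseconds, so a gradual rise over the
     * course of the run (the symptom reported above) is visible.
     */
    static long[] timeReads(Reader reader, int rows) {
        long[] millis = new long[rows];
        for (int i = 0; i < rows; i++) {
            long start = System.nanoTime();
            reader.read(i);
            millis[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return millis;
    }

    public static void main(String[] args) {
        // No-op reader stands in for the real HTable.getRow() call.
        long[] t = timeReads(row -> { }, 10);
        System.out.println("reads timed: " + t.length);
    }
}
```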