[
https://issues.apache.org/jira/browse/HADOOP-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540927
]
stack commented on HADOOP-2161:
-------------------------------
I'm currently looking at this too. One issue I've just found is that rows are
sorted lexicographically, but the rows in PE are numbers, so as strings
"4500" sorts before "46". The MapFiles are messed up. I'm going to zero-pad
the row keys to see what that does. (By the way, thanks for digging in here
and finding that we did not break after we'd left the row.)
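The ordering problem can be reproduced outside HBase. The following is a minimal sketch in plain Java, not code from PerformanceEvaluation: the class name, the zeroPad helper, and the width of 10 digits are all assumptions made for illustration. It shows that numeric keys stored as strings sort lexicographically, and that left-padding them with zeros to a fixed width makes the string order match the numeric order.

```java
import java.util.Arrays;

public class RowKeyOrdering {
    // Hypothetical helper (not part of HBase): left-pad a non-negative
    // row number with zeros so every key has the same width.
    static String zeroPad(int row, int width) {
        return String.format("%0" + width + "d", row);
    }

    public static void main(String[] args) {
        // Unpadded keys compare character by character, so "4500" < "46".
        String[] raw = {"46", "4500", "5"};
        Arrays.sort(raw);
        System.out.println(Arrays.toString(raw));

        // Fixed-width keys compare in the same order as the numbers.
        // The width of 10 is an assumption; it must cover the largest key.
        String[] padded = {zeroPad(46, 10), zeroPad(4500, 10), zeroPad(5, 10)};
        Arrays.sort(padded);
        System.out.println(Arrays.toString(padded));
    }
}
```

The padded width has to be chosen up front and be large enough for the highest row number, since mixing widths reintroduces the same ordering problem.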
> getRow() is orders of magnitudes slower than get(), even on rows with one
> column
> --------------------------------------------------------------------------------
>
> Key: HADOOP-2161
> URL: https://issues.apache.org/jira/browse/HADOOP-2161
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/hbase
> Affects Versions: 0.16.0
> Environment: latest from trunk
> Reporter: Clint Morgan
> Attachments: HADOOP-2161-2.patch, PerformanceEvaluation-patch.txt
>
>
> HTable.getRow(Text) is several orders of magnitude slower than
> HTable.get(Text, Text), even on rows with a single column.
> This problem can be observed by the attached patch of
> PerformanceEvaluation.java which changes SequentialRead to use getRow,
> and prints out the time for each read.
> The test can then be run with:
> bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialRead 1
> On my laptop, the original test (using get()) produces reads on the order of
> 5-20 milliseconds. Using getRow(), the reads take 50-2000 ms.
>