[
https://issues.apache.org/jira/browse/HBASE-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676457#action_12676457
]
Jean-Daniel Cryans commented on HBASE-61:
-----------------------------------------
Some tests I did:
Unit tests on my Ubuntu desktop:
{code}
BUILD SUCCESSFUL
Total time: 25 minutes 39 seconds
{code}
11-node cluster (2.0GHz CPU, 1GB RAM, 2*80GB HDD JBOD PATA)
PE (PerformanceEvaluation) run from the master node:
HFile
{code}
Finished sequentialWrite in 484020ms at offset 0 for 1048576 rows 2166 rows/sec
HBase was restarted and I waited for all compactions to finish
Finished scan in 166626ms at offset 0 for 1048576 rows 6293 rows/sec
Finished randomRead in 2711788ms at offset 0 for 1048576 rows 387 rows/sec
{code}
MapFile
{code}
Finished sequentialWrite in 496937ms at offset 0 for 1048576 rows 2110 rows/sec
HBase was restarted and I waited for all compactions to finish
Finished scan in 153011ms at offset 0 for 1048576 rows 6853 rows/sec
Finished randomRead in 4270211ms at offset 0 for 1048576 rows 246 rows/sec
{code}
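So, on this setup and compared to MapFile, HFile random reads are way up
(387 vs 246 rows/sec, roughly 57% faster), sequential writes are a bit up
(2166 vs 2110 rows/sec, about 3%) and scans are a bit down (6293 vs 6853
rows/sec, about 8%). IMO this is good to commit, provided the issues raised
by Stack are addressed in other JIRAs.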
+1
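
For anyone who wants to redo the comparison, here is a minimal sketch of the arithmetic behind the numbers above. The elapsed times and row count are copied from the PE output in this comment; the function names and the dictionary layout are hypothetical, made up for illustration. Rounding to the nearest integer is an assumption that happens to reproduce the logged rows/sec figures; it is not taken from the PE source.

```python
# Elapsed times (ms) copied from the PE output in this comment.
ROWS = 1048576

ELAPSED_MS = {
    "HFile": {"sequentialWrite": 484020, "scan": 166626, "randomRead": 2711788},
    "MapFile": {"sequentialWrite": 496937, "scan": 153011, "randomRead": 4270211},
}


def rows_per_sec(rows, elapsed_ms):
    """Throughput in rows/sec, rounded to the nearest integer (assumed rounding)."""
    return round(rows * 1000 / elapsed_ms)


def percent_change(new, old):
    """Relative change of new over old, in percent (positive means faster)."""
    return (new - old) / old * 100.0


def compare(candidate="HFile", baseline="MapFile"):
    """Return {test: (candidate rows/sec, baseline rows/sec, change in %)}."""
    result = {}
    for test, ms in ELAPSED_MS[candidate].items():
        new = rows_per_sec(ROWS, ms)
        old = rows_per_sec(ROWS, ELAPSED_MS[baseline][test])
        result[test] = (new, old, percent_change(new, old))
    return result


if __name__ == "__main__":
    for test, (new, old, pct) in compare().items():
        print(f"{test}: HFile {new} rows/sec vs MapFile {old} rows/sec ({pct:+.1f}%)")
```

These are single runs on one cluster, so the small write and scan differences may well be within run-to-run noise; only the random read gap is large enough to read much into.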
> [hbase] Create an HBase-specific MapFile implementation
> -------------------------------------------------------
>
> Key: HBASE-61
> URL: https://issues.apache.org/jira/browse/HBASE-61
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: io
> Reporter: Bryan Duxbury
> Assignee: ryan rawson
> Priority: Blocker
> Fix For: 0.20.0
>
> Attachments: cpucalltreetfile.html, HBASE-83.patch, hfile.patch,
> hfile2.patch, hfile3.patch, longestkey.patch, tfile.patch, tfile3.patch
>
>
> Today, HBase uses the Hadoop MapFile class to store data persistently to
> disk. This is convenient, as it's already done (and maintained by other
> people :). However, it's beginning to look like there might be possible
> performance benefits to be had from doing an HBase-specific implementation of
> MapFile that incorporated some precise features.
> This issue should serve as a place to track discussion about what features
> might be included in such an implementation.