[ https://issues.apache.org/jira/browse/HBASE-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002364#comment-13002364 ]
ryan rawson commented on HBASE-3514:
------------------------------------

A test failure caused me to dig into this, and there are 2 problems:
- the handling of the ByteBuffer causes us to cache bad data. (easy fix)
- the resulting buffer is not being trimmed, which is causing us to overuse RAM on block caching.

The latter one IS a big deal: it means that our block cache can be expected to be 25% larger than necessary, which is NOT good. We should consider trimming the resulting buffer, but this involves an array copy, possibly removing all optimizations added by this patch!

> Speedup HFile.Writer append
> ---------------------------
>
>                 Key: HBASE-3514
>                 URL: https://issues.apache.org/jira/browse/HBASE-3514
>             Project: HBase
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.90.0
>            Reporter: Matteo Bertozzi
>            Priority: Minor
>         Attachments: HBASE-3514-append-0.90-v2.patch, HBASE-3514-append-0.90-v2b.patch, HBASE-3514-append-0.90-v3.patch, HBASE-3514-append-0.90.patch, HBASE-3514-append-trunk-v2.patch, HBASE-3514-append-trunk-v2b.patch, HBASE-3514-append-trunk-v3.patch, HBASE-3514-append.patch, HBASE-3514-metaBlock-bsearch.patch
>
>
> Remove double writes when block cache is specified, by using only the ByteArrayDataStream.
> baos is flushed with the compress stream on finishBlock.
> On my machines HFilePerformanceEvaluation SequentialWriteBenchmark goes from 4000ms to 2500ms.
> Running SequentialWriteBenchmark for 1000000 rows took 4247ms.
> Running SequentialWriteBenchmark for 1000000 rows took 4512ms.
> Running SequentialWriteBenchmark for 1000000 rows took 4498ms.
> Running SequentialWriteBenchmark for 1000000 rows took 2697ms.
> Running SequentialWriteBenchmark for 1000000 rows took 2770ms.
> Running SequentialWriteBenchmark for 1000000 rows took 2721ms.
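
The trimming trade-off raised in the comment can be illustrated with a minimal Java sketch. This is not code from the patch: the class `BufferTrimSketch`, the helper `trim`, and the buffer sizes are hypothetical, chosen only so that the backing array is 25% larger than the valid data. It assumes a heap-backed `ByteBuffer` wrapped around an over-allocated array, which keeps the whole array reachable for as long as the buffer is cached.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BufferTrimSketch {
    // Hypothetical helper: copy only the valid bytes (position..limit)
    // into a right-sized array. The copy is the cost the comment warns about.
    public static ByteBuffer trim(ByteBuffer buf) {
        int start = buf.arrayOffset() + buf.position();
        int end = buf.arrayOffset() + buf.limit();
        byte[] exact = Arrays.copyOfRange(buf.array(), start, end);
        return ByteBuffer.wrap(exact);
    }

    public static void main(String[] args) {
        // Made-up sizes: an 80 KB stream buffer holding 64 KB of block data.
        byte[] backing = new byte[80 * 1024];
        int used = 64 * 1024;

        // Wrapping without a copy is cheap, but the cache entry
        // still pins the full 80 KB backing array.
        ByteBuffer cached = ByteBuffer.wrap(backing, 0, used);
        ByteBuffer trimmed = trim(cached);

        System.out.println("untrimmed backing array: " + cached.array().length);
        System.out.println("trimmed backing array:   " + trimmed.array().length);
    }
}
```

With these assumed sizes, the untrimmed entry retains 81920 bytes while the trimmed one retains 65536, at the price of one extra array copy per cached block.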