[ 
https://issues.apache.org/jira/browse/HBASE-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000572#comment-13000572
 ] 

ryan rawson commented on HBASE-3514:
------------------------------------

I looked at the unit test, and the situation was pretty broken: we were 
creating column families with BLOCKSIZE = Integer.MAX_VALUE (which is WAY 
wrong).

I think we should revert to an earlier version of the patch without the 
directWrite code, and use the ByteBufferOutputStream class located in the 
hbase.ipc package (feel free to move it to hbase.util).  It does not have the 
same resizing bugs that ByteArrayOutputStream does.
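
For illustration only, here is a minimal sketch of the general idea behind a 
ByteBuffer-backed output stream that grows on demand. This is NOT the actual 
hbase.ipc ByteBufferOutputStream; the class name GrowableBufferStream and its 
methods are hypothetical, and the doubling policy is an assumption rather than 
what HBase does.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch, not the HBase class: an OutputStream backed by a
// ByteBuffer that reallocates (doubling) when a write would overflow it.
class GrowableBufferStream extends OutputStream {
    private ByteBuffer buf;

    GrowableBufferStream(int initialCapacity) {
        buf = ByteBuffer.allocate(Math.max(1, initialCapacity));
    }

    // Make sure at least `extra` more bytes fit, copying into a larger buffer if needed.
    private void ensureCapacity(int extra) {
        if (buf.remaining() >= extra) {
            return;
        }
        long needed = (long) buf.position() + extra;
        long newCap = buf.capacity();
        while (newCap < needed) {
            newCap *= 2; // assumed growth policy: double until the write fits
        }
        if (newCap > Integer.MAX_VALUE - 8) {
            newCap = Integer.MAX_VALUE - 8;
            if (needed > newCap) {
                throw new OutOfMemoryError("buffer would exceed maximum array size");
            }
        }
        ByteBuffer bigger = ByteBuffer.allocate((int) newCap);
        buf.flip();      // switch the old buffer to read mode
        bigger.put(buf); // copy the bytes written so far
        buf = bigger;
    }

    @Override
    public void write(int b) {
        ensureCapacity(1);
        buf.put((byte) b);
    }

    @Override
    public void write(byte[] src, int off, int len) {
        ensureCapacity(len);
        buf.put(src, off, len);
    }

    // Number of bytes written so far.
    int size() {
        return buf.position();
    }

    // Read-only-style view of the written bytes; shares storage, no copy is made.
    ByteBuffer view() {
        ByteBuffer dup = buf.duplicate();
        dup.flip();
        return dup;
    }
}
```

One point of a design like this is that view() exposes the written bytes 
without copying them, whereas ByteArrayOutputStream.toByteArray() always 
returns a fresh copy.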

Another issue is what happens when blocks are really big. This will normally 
be due to really large single KVs, in which case we will require another 
allocation of a given buffer. Given that there is no size restriction on block 
caching, I think this isn't an issue for this patch, and we can think about it 
elsewhere.

> Speedup HFile.Writer append
> ---------------------------
>
>                 Key: HBASE-3514
>                 URL: https://issues.apache.org/jira/browse/HBASE-3514
>             Project: HBase
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.90.0
>            Reporter: Matteo Bertozzi
>            Priority: Minor
>         Attachments: HBASE-3514-append-0.90-v2.patch, 
> HBASE-3514-append-0.90-v3.patch, HBASE-3514-append-0.90.patch, 
> HBASE-3514-append-trunk-v2.patch, HBASE-3514-append-trunk-v3.patch, 
> HBASE-3514-append.patch, HBASE-3514-metaBlock-bsearch.patch
>
>
> Remove double writes when block cache is specified, by using only the 
> ByteArrayDataStream.
> baos is flushed with the compress stream on finishBlock.
> On my machines HFilePerformanceEvaluation SequentialWriteBenchmark goes 
> from 4000ms to 2500ms.
> Running SequentialWriteBenchmark for 1000000 rows took 4247ms.
> Running SequentialWriteBenchmark for 1000000 rows took 4512ms.
> Running SequentialWriteBenchmark for 1000000 rows took 4498ms.
> Running SequentialWriteBenchmark for 1000000 rows took 2697ms.
> Running SequentialWriteBenchmark for 1000000 rows took 2770ms.
> Running SequentialWriteBenchmark for 1000000 rows took 2721ms.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
