[
https://issues.apache.org/jira/browse/HBASE-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631769#action_12631769
]
Jean-Daniel Cryans commented on HBASE-748:
------------------------------------------
I gave more thought to st^ack's idea of buffering the edits and I think it
would be nice to implement it. This is how I see it.
We keep an ArrayList of RowUpdates in HTable so that we have a cache per table.
It should be of a configurable maximum size in bytes. Maybe a default of 64M?
It should also be configurable when creating an HTable.
The RowUpdate class should be able to give us the total size of the
BatchOperations it contains. That should be fairly easy to do by asking each BO
for its value's length.
We can compute the size of the RowUpdate either at commit time or after each
put. I would prefer after each put, so we skip iterating over the operations
at commit time.
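
A minimal sketch of the per-put bookkeeping, assuming simplified stand-ins for
RowUpdate and BatchOperation. The names SketchRowUpdate, SketchOperation and
getSizeInBytes() are hypothetical and only illustrate the idea; they are not
the existing client classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a BatchOperation: a column name and its value. */
class SketchOperation {
    final String column;
    final byte[] value;

    SketchOperation(String column, byte[] value) {
        this.column = column;
        this.value = value;
    }
}

/** Hypothetical stand-in for RowUpdate that tracks its size as puts arrive. */
class SketchRowUpdate {
    private final String row;
    private final List<SketchOperation> operations = new ArrayList<>();
    private long sizeInBytes = 0;

    SketchRowUpdate(String row) {
        this.row = row;
    }

    /** Adds an operation and updates the running size, so no later iteration is needed. */
    void put(String column, byte[] value) {
        operations.add(new SketchOperation(column, value));
        sizeInBytes += value.length;
    }

    /** Sum of the value lengths of all operations added so far. */
    long getSizeInBytes() {
        return sizeInBytes;
    }

    String getRow() {
        return row;
    }
}
```

Only the value lengths are counted here, as suggested above; row and column
names are ignored in this sketch.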
In the case of auto-flushing, I see two ways to detect that the buffer is full:
either at commit time or in a separate thread, the way the Flusher currently
works.
The first is very easy to implement but blocks the commits. The second is
harder to implement but doesn't block the commits. I think that for 0.19.0 we
could implement the first one.
The other case is that auto-flushing is disabled and then it is the user's
responsibility to call something like HTable.flushEdits().
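
The commit-time variant could look roughly like the sketch below. EditBuffer,
Edit, pendingEdits() and the Consumer standing in for the RPC are all
hypothetical; this is not the HTable code, only an illustration of the two
modes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical client-side edit buffer; not the actual HTable code. */
class EditBuffer {
    /** Hypothetical buffered edit: a row key plus the byte size of its values. */
    static final class Edit {
        final String row;
        final long sizeInBytes;

        Edit(String row, long sizeInBytes) {
            this.row = row;
            this.sizeInBytes = sizeInBytes;
        }
    }

    private final List<Edit> buffer = new ArrayList<>();
    private final long maxSizeInBytes;
    private final boolean autoFlush;
    private final Consumer<List<Edit>> sender; // stands in for the RPC to the servers
    private long currentSizeInBytes = 0;

    EditBuffer(long maxSizeInBytes, boolean autoFlush, Consumer<List<Edit>> sender) {
        this.maxSizeInBytes = maxSizeInBytes;
        this.autoFlush = autoFlush;
        this.sender = sender;
    }

    /** Buffers the edit; with auto-flushing on, a full buffer is sent right here, blocking the caller. */
    void commit(Edit edit) {
        buffer.add(edit);
        currentSizeInBytes += edit.sizeInBytes;
        if (autoFlush && currentSizeInBytes >= maxSizeInBytes) {
            flushEdits();
        }
    }

    /** Sends everything buffered so far; the user calls this when auto-flushing is off. */
    void flushEdits() {
        if (buffer.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(buffer));
        buffer.clear();
        currentSizeInBytes = 0;
    }

    int pendingEdits() {
        return buffer.size();
    }
}
```

With auto-flushing disabled, nothing is sent until flushEdits() is called, no
matter how large the buffer grows.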
Any comments?
> Add an efficient way to batch update many rows
> ----------------------------------------------
>
> Key: HBASE-748
> URL: https://issues.apache.org/jira/browse/HBASE-748
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.1.3, 0.2.0
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.19.0
>
>
> HBASE-747 introduced a simple way to batch update many rows. The goal of this
> issue is to have an enhanced version that will send many rows in a single RPC
> to each region server. To do this, the client code will have to figure out
> which rows go to which server, group them accordingly, and then send them.
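
The grouping step described in the issue summary above can be sketched as
follows. RowGrouper and the locator function are hypothetical; the locator
stands in for whatever lookup maps a row key to the region server hosting it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper that buckets row keys by the server that hosts them. */
class RowGrouper {
    /**
     * Groups rows by server address so that each group can go out in one RPC.
     * The locator is an assumed lookup from row key to server address.
     */
    static Map<String, List<String>> groupByServer(
            List<String> rows, Function<String, String> locator) {
        Map<String, List<String>> byServer = new HashMap<>();
        for (String row : rows) {
            String server = locator.apply(row);
            byServer.computeIfAbsent(server, k -> new ArrayList<>()).add(row);
        }
        return byServer;
    }
}
```

Each entry of the returned map would then correspond to one RPC to one region
server.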