[ 
https://issues.apache.org/jira/browse/HBASE-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999243#comment-14999243
 ] 

Alex Araujo commented on HBASE-14791:
-------------------------------------

The patch for HBASE-12728 moved buffering of Puts from HTable into 
BufferedMutator for 1.0+. Since BufferedMutator is not Put-specific, it also 
made TableOutputFormat and MultiTableOutputFormat use BufferedMutator for all 
Mutation types.

In 0.98 there is no buffering of Deletes in HTable or elsewhere as far as I can 
tell. Essentially, we'd need to implement a basic BufferedMutator and use it 
for both OutputFormat types. The downside is that we would be duplicating some 
of the buffering code in HTable.
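
To make the idea concrete, here is a rough sketch of what such a basic buffered 
writer could look like. It is not taken from any patch and is not the HBase 
API: SimpleBufferedWriter, BatchSender, and maxBuffered are hypothetical names, 
and the sketch flushes on a mutation count, whereas a real implementation would 
more likely flush on accumulated heap size as the Put write buffer does. The 
sender callback stands in for whatever batched call the OutputFormat would make.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback that ships one batch to the cluster in a single call.
interface BatchSender<M> {
    void send(List<M> batch);
}

// Hypothetical minimal buffer for any mutation type (Put, Delete, ...).
// Not thread-safe; assumes one writer per task, as in a RecordWriter.
class SimpleBufferedWriter<M> {
    private final int maxBuffered;
    private final BatchSender<M> sender;
    private final List<M> buffer = new ArrayList<M>();

    SimpleBufferedWriter(int maxBuffered, BatchSender<M> sender) {
        if (maxBuffered < 1) {
            throw new IllegalArgumentException("maxBuffered must be >= 1");
        }
        this.maxBuffered = maxBuffered;
        this.sender = sender;
    }

    // Queue a mutation; send the whole buffer once it reaches the threshold.
    void mutate(M mutation) {
        buffer.add(mutation);
        if (buffer.size() >= maxBuffered) {
            flush();
        }
    }

    // Send everything buffered so far as one batch; no-op when empty.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sender.send(new ArrayList<M>(buffer)); // copy so the sender owns its list
        buffer.clear();
    }

    // Must be called when the writer is closed so the tail is not lost.
    void close() {
        flush();
    }
}
```

With a threshold of N, copying D delete markers would cost roughly D/N round 
trips instead of D, which is the behavior the 1.0+ code gets from 
BufferedMutator.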

> [0.98] CopyTable is extremely slow when moving delete markers
> -------------------------------------------------------------
>
>                 Key: HBASE-14791
>                 URL: https://issues.apache.org/jira/browse/HBASE-14791
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.16
>            Reporter: Lars Hofhansl
>            Assignee: Alex Araujo
>
> We found that some of our CopyTable jobs run for many hours, even when there 
> isn't that much data to copy.
> [~vik.karma] did his magic and found that the issue is with copying delete 
> markers (we use raw mode to also move deletes across).
> Looking at the code in 0.98, it's immediately obvious that deletes (unlike 
> puts) are not batched and hence are sent to the other side one by one, 
> causing a network RTT for each delete marker.
> Looks like in trunk it's doing the right thing (using BufferedMutators for 
> all mutations in TableOutputFormat). So likely only a 0.98 (and 1.0, 1.1, 
> 1.2?) issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
