[ 
https://issues.apache.org/jira/browse/HBASE-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999943#comment-14999943
 ] 

Lars Hofhansl commented on HBASE-14791:
---------------------------------------

One tricky aspect of this is that we generally want to keep the order of the 
deletes w.r.t. puts as much as possible.
If we have one buffering mechanism for puts and another for deletes, that is 
hard to maintain.

For correctness it is enough to ensure that deletes are shipped after the puts; 
I'm not sure that's easy to do, though.
Then again, in cases where we want to ship the deletes, there had better be an 
appropriate setup on the receiving side to keep delete markers around correctly, 
otherwise it makes no sense to ship them... so maybe not an issue at all?
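
To illustrate the ordering concern, here is a minimal sketch of a single buffer 
that holds puts and deletes together and ships them in arrival order. It is a 
toy model, not HBase code: OrderedMutationBuffer, Mutation, Kind and FakeTable 
are hypothetical names made up for this example, and the fake table resolves 
conflicts purely by arrival order, which is a simplification of how HBase 
handles timestamps and delete markers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these types are HBase classes. */
class OrderedMutationBuffer {
    enum Kind { PUT, DELETE }

    /** A mutation against one row; value is ignored for DELETE. */
    static final class Mutation {
        final Kind kind;
        final String row;
        final String value;

        Mutation(Kind kind, String row, String value) {
            this.kind = kind;
            this.row = row;
            this.value = value;
        }
    }

    /** Stand-in for the receiving table: applies a batch in order, counts round trips. */
    static final class FakeTable {
        final Map<String, String> rows = new HashMap<>();
        int roundTrips = 0;

        void applyBatch(List<Mutation> batch) {
            roundTrips++; // one simulated network round trip per batch
            for (Mutation m : batch) {
                if (m.kind == Kind.PUT) {
                    rows.put(m.row, m.value);
                } else {
                    rows.remove(m.row);
                }
            }
        }
    }

    private final FakeTable table;
    private final int batchSize;
    private final List<Mutation> pending = new ArrayList<>();

    OrderedMutationBuffer(FakeTable table, int batchSize) {
        this.table = table;
        this.batchSize = batchSize;
    }

    /** Puts and deletes share one queue, so their relative order is kept. */
    void mutate(Mutation m) {
        pending.add(m);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Ships everything buffered so far as a single batch. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        table.applyBatch(new ArrayList<>(pending));
        pending.clear();
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        OrderedMutationBuffer buffer = new OrderedMutationBuffer(table, 100);
        buffer.mutate(new Mutation(Kind.PUT, "row1", "v1"));
        buffer.mutate(new Mutation(Kind.DELETE, "row1", null));
        buffer.mutate(new Mutation(Kind.PUT, "row2", "v2"));
        buffer.flush();
        System.out.println(table.rows + " in " + table.roundTrips + " round trip(s)");
    }
}
```

With two separate buffers, the delete for row1 could be flushed before or after 
its put depending on which buffer fills first; with one shared buffer that 
question does not arise, and the deletes still get batched.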


> [0.98] CopyTable is extremely slow when moving delete markers
> -------------------------------------------------------------
>
>                 Key: HBASE-14791
>                 URL: https://issues.apache.org/jira/browse/HBASE-14791
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.16
>            Reporter: Lars Hofhansl
>            Assignee: Alex Araujo
>
> We found that some of our copy table jobs run for many hours, even when there 
> isn't that much data to copy.
> [~vik.karma] did his magic and found that the issue is with copying delete 
> markers (we use raw mode to also move deletes across).
> Looking at the code in 0.98 it's immediately obvious that deletes (unlike 
> puts) are not batched and hence sent to the other side one by one, causing a 
> network RTT for each delete marker.
> Looks like in trunk it's doing the right thing (using BufferedMutators for 
> all mutations in TableOutputFormat). So likely only a 0.98 (and 1.0, 1.1, 
> 1.2?) issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
