[jira] [Comment Edited] (HBASE-14937) Make rpc call timeout for replication adaptive

Andrew Purtell (JIRA) Fri, 18 Dec 2015 08:56:38 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064193#comment-15064193
 ]


Andrew Purtell edited comment on HBASE-14937 at 12/18/15 4:55 PM:
------------------------------------------------------------------

When replication is down, say because of a network partition or temporary issue 
on one cluster, RPC calls cannot succeed and will time out. However, once the 
network or cluster is back in operation we want replication activity to resume 
as quickly as possible. Does this change prevent timely restart of replication 
activity? Won't we potentially be waiting for a long time for the current call 
to time out before probing with another attempt? Would the time we might wait 
unnecessarily increase as the duration of the outage increases, making a long 
outage a really, really long outage?
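
To make the concern concrete, here is a minimal sketch of one way an adaptive 
timeout could be bounded. It is not taken from the attached patch. The class 
name, the doubling policy, and the cap are all assumptions made for 
illustration. The point is only that with a cap, the extra wait after the peer 
recovers is limited to the cap rather than growing with the length of the 
outage.

```java
// Hypothetical sketch, NOT the HBASE-14937 patch: a capped adaptive timeout.
// The class name, the doubling policy, and the cap are illustrative assumptions.
final class AdaptiveTimeoutSketch {
    private final long baseMs;  // timeout used when calls are succeeding
    private final long maxMs;   // upper bound, so the wait cannot grow without limit
    private long currentMs;

    AdaptiveTimeoutSketch(long baseMs, long maxMs) {
        if (baseMs <= 0 || maxMs < baseMs) {
            throw new IllegalArgumentException("require 0 < baseMs <= maxMs");
        }
        this.baseMs = baseMs;
        this.maxMs = maxMs;
        this.currentMs = baseMs;
    }

    /** Timeout to apply to the next replication RPC. */
    long currentTimeoutMs() {
        return currentMs;
    }

    /** Called when a call timed out: double the timeout, but never exceed the cap. */
    void onTimeout() {
        // Compare against maxMs / 2 first so the doubling cannot overflow.
        currentMs = (currentMs > maxMs / 2) ? maxMs : currentMs * 2;
    }

    /** Called when a call succeeded: fall back to the base timeout. */
    void onSuccess() {
        currentMs = baseMs;
    }
}
```

With a base of 1000 ms and a cap of 8000 ms, repeated timeouts give 2000, 4000, 
8000, 8000, and so on, so a call issued just before the peer comes back would 
hold up the next probe by at most the cap, however long the outage lasted.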


was (Author: apurtell):
When replication is down, say because of a network partition or temporary issue 
on one cluster, RPC calls can of course time out. Once the network or cluster 
is back in operation we want replication activity to resume as quickly as 
possible. Does this change prevent timely restart of replication activity? 
Won't we potentially be waiting for a long time for the current call to time out 
before probing with another? Would the time we might wait unnecessarily 
increase as the duration of the outage increases, making a long outage a really, 
really long outage?

> Make rpc call timeout for replication adaptive
> ----------------------------------------------
>
>                 Key: HBASE-14937
>                 URL: https://issues.apache.org/jira/browse/HBASE-14937
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ashish Singhi
>            Assignee: Ashish Singhi
>              Labels: replication
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: HBASE-14937.patch
>
>
> When replication to the peer cluster is disabled while a lot of writes are 
> happening in the active cluster, and replication to the peer is later 
> enabled, replication requests to the peer cluster may time out.
> This is possible after HBASE-13153, and it can also happen when a large 
> amount of WAL data is still pending replication.
> The approach to this problem will be discussed in the comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
