[ https://issues.apache.org/jira/browse/KUDU-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893174#comment-16893174 ]
Adar Dembo commented on KUDU-2192:
----------------------------------

[~kwho] is there anything left to do for this JIRA?

> KRPC should have a timer to close stuck connections
> ---------------------------------------------------
>
>                 Key: KUDU-2192
>                 URL: https://issues.apache.org/jira/browse/KUDU-2192
>             Project: Kudu
>          Issue Type: Improvement
>          Components: rpc
>            Reporter: Michael Ho
>            Assignee: Michael Ho
>            Priority: Major
>
> If the remote host goes down or its network gets unplugged, all pending RPCs
> to that host will be stuck if there is no timeout specified. While those RPCs
> which have finished sending their payloads or those which haven't started
> sending payloads can be cancelled quickly, those in mid-transmission (i.e. an
> RPC at the front of the outbound queue with part of its payload sent already)
> cannot be cancelled until the payload has been completely sent. Therefore,
> it's beneficial to have a timeout to kill a connection if it's not making any
> progress for an extended period of time so the RPC will fail and get unstuck.
> The timeout may need to be conservatively large to avoid aggressive closing
> of connections due to transient network issues. One can consider augmenting
> the existing maintenance thread logic which checks for idle connections to
> check for this kind of timeout. Please feel free to propose other
> alternatives (e.g. TCP keepalive timeout) in this JIRA.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
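
For illustration only, the progress-timeout check proposed in the quoted description could be sketched roughly as below. This is not Kudu's actual code: `ConnState`, `IsStuck`, and `CloseStuckConnections` are hypothetical names, and the sketch assumes each connection records the time it last sent or received any bytes. A connection counts as stuck only when it still has outbound bytes queued and has made no progress for at least the timeout, which keeps this check separate from the existing idle-connection check.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical per-connection bookkeeping (not Kudu's Connection class).
struct ConnState {
  int id = 0;
  std::size_t pending_outbound_bytes = 0;  // bytes queued but not yet written
  Clock::time_point last_progress;         // last time any bytes moved
  bool closed = false;
};

// True if the connection has outbound work queued but has made no
// progress for at least `timeout`. Idle connections (nothing queued)
// are deliberately not treated as stuck.
bool IsStuck(const ConnState& c, Clock::time_point now,
             Clock::duration timeout) {
  assert(timeout > Clock::duration::zero());
  return !c.closed && c.pending_outbound_bytes > 0 &&
         (now - c.last_progress) >= timeout;
}

// One pass of a maintenance scan: mark stuck connections closed and
// return their ids. A real implementation would also shut down the
// socket and fail the pending RPCs; that part is omitted here.
std::vector<int> CloseStuckConnections(std::vector<ConnState>& conns,
                                       Clock::time_point now,
                                       Clock::duration timeout) {
  std::vector<int> closed_ids;
  for (ConnState& c : conns) {
    if (IsStuck(c, now, timeout)) {
      c.closed = true;
      closed_ids.push_back(c.id);
    }
  }
  return closed_ids;
}
```

Passing `now` in explicitly, rather than reading the clock inside the scan, lets the check be exercised with synthetic timestamps. The timeout itself would need to be conservatively large, as the description notes.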