Andrew Purtell created HBASE-10121:
--------------------------------------

             Summary: Abort wedged Calls after a timeout
                 Key: HBASE-10121
                 URL: https://issues.apache.org/jira/browse/HBASE-10121
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.94.11
            Reporter: Andrew Purtell
         Attachments: screenshot.jpg

Saw this in a mail to user@:

"REPL IPC Server handler $N on $PORT WAITING Waiting for a call (since 22 hrs, 
57mins, 38sec ago)"

I don't think this is a TCP level issue. We enable keepalives on connections 
by default, so a dead peer should eventually be detected. Either we failed to 
remove the call upon exception, or the remote is alive but not sending.

Looking at the IPC server code, I don't see where we abort and clean up wedged 
Calls after some timeout. Regardless of the other issues here, should we do 
that?
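
To make the question concrete, here is a minimal sketch of what such a timeout 
sweep might look like. It is not taken from the HBase IPC server code: the 
class WedgedCallReaper, the TrackedCall type, and the begin/finish/reap methods 
are all hypothetical names invented for illustration, and the timeout value is 
an assumption left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch only; none of these names exist in the HBase IPC code. */
class WedgedCallReaper {
    /** Minimal stand-in for an in-flight RPC call (assumed shape). */
    static final class TrackedCall {
        final int id;
        final long startMillis;
        volatile boolean aborted;

        TrackedCall(int id, long startMillis) {
            this.id = id;
            this.startMillis = startMillis;
        }
    }

    private final Map<Integer, TrackedCall> inFlight = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    WedgedCallReaper(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Register a call when a handler picks it up. */
    TrackedCall begin(int id, long nowMillis) {
        TrackedCall call = new TrackedCall(id, nowMillis);
        inFlight.put(id, call);
        return call;
    }

    /** Remove a call on normal completion or on exception. */
    void finish(int id) {
        inFlight.remove(id);
    }

    /** Abort and remove every call older than the timeout; returns their ids. */
    List<Integer> reap(long nowMillis) {
        List<Integer> reaped = new ArrayList<>();
        for (TrackedCall call : inFlight.values()) {
            // remove(key, value) only succeeds if the call was not already
            // finished concurrently, so a call is never both finished and aborted.
            if (nowMillis - call.startMillis > timeoutMillis
                    && inFlight.remove(call.id, call)) {
                call.aborted = true;
                reaped.add(call.id);
            }
        }
        return reaped;
    }

    int inFlightCount() {
        return inFlight.size();
    }
}
```

In a real server, reap would presumably run from a periodic background task 
using the current wall-clock time. The sketch takes the time as a parameter 
only so the behavior is deterministic and easy to test.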



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
