ZanderXu created HADOOP-18389:
---------------------------------

             Summary: Limit stacked call of one connection in client to avoid 
possible oom in server
                 Key: HADOOP-18389
                 URL: https://issues.apache.org/jira/browse/HADOOP-18389
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: ZanderXu
            Assignee: ZanderXu
         Attachments: image-2022-08-04-00-22-28-865.png, 
image-2022-08-04-00-23-18-427.png

In our prod environment, a JN ran out of memory (OOM) because 
Server#Connection#responseQueue used 97% of its memory.

After analyzing the JN's memory, we found more than 20,000 calls stacked in 
a single Server#Connection#responseQueue, because the network between the NN 
and the JN was jittery, with some TCP packet loss.

!image-2022-08-04-00-22-28-865.png|width=561,height=254!

!image-2022-08-04-00-23-18-427.png|width=559,height=356!

 

In this case, I think Client.java should support limiting the number of 
stacked calls on one connection, to avoid a possible OOM in the server. When 
the number of stacked calls exceeds the limit, we can simply throw an 
IOException to the caller.
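
To make the proposal concrete, here is a minimal sketch of a per-connection cap on outstanding calls. It is not the actual patch and does not use Hadoop's real Client internals: the class name PendingCallLimiter, its methods, and the way the limit is passed in are all hypothetical, chosen only to illustrate rejecting a new call with an IOException once the limit is reached.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not part of Hadoop): caps the number of calls that
 * are in flight on a single client connection.
 */
final class PendingCallLimiter {
    private final int maxPendingCalls;
    private final AtomicInteger pendingCalls = new AtomicInteger();

    PendingCallLimiter(int maxPendingCalls) {
        if (maxPendingCalls <= 0) {
            throw new IllegalArgumentException("maxPendingCalls must be positive");
        }
        this.maxPendingCalls = maxPendingCalls;
    }

    /** Invoke before sending a call; fails fast when the limit is reached. */
    void acquire() throws IOException {
        while (true) {
            int current = pendingCalls.get();
            if (current >= maxPendingCalls) {
                throw new IOException("Too many pending calls on this connection: "
                        + current + " (limit " + maxPendingCalls + ")");
            }
            // CAS loop so that concurrent callers never overshoot the limit.
            if (pendingCalls.compareAndSet(current, current + 1)) {
                return;
            }
        }
    }

    /** Invoke once the response has arrived or the call has failed. */
    void release() {
        pendingCalls.decrementAndGet();
    }

    /** Number of calls currently counted as in flight. */
    int pending() {
        return pendingCalls.get();
    }
}
```

In this sketch the caller would invoke acquire() before queuing a request on the connection and release() in a finally block once the call completes, so a slow or lossy link surfaces as an IOException on the client instead of an ever-growing queue on the server.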



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
