Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/18388
  
    ok sorry, I forgot you had the screenshot there. So as you mention in that 
post, if we are just creating too many outbound buffers before they can actually 
be sent over the network, then we should try to add some flow control.  Did you 
check what the buffers were for?  How many connections did you have, and how 
many blocks was each fetching?  A million is a lot either way, but I'm assuming 
it's something like 500 connections each fetching 2000 blocks.  If that is the 
case, it seems like it would be good to add flow control here rather than just 
disconnecting based on memory. Really, having both would be good, with this as 
a fallback, but the flow control part should allow everyone to start fetching 
without rejecting a bunch, especially if the network can't push it out that 
fast anyway.
    
    For instance, only create a handful of those outgoing buffers and wait 
until they have been reported as successfully sent before creating more.  This 
might be a bit more complex; a rough sketch of the idea is below.
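    
    Just to illustrate (this is only a sketch, not Spark's actual API, and the 
class and field names like `BoundedOutboundWriter` and `maxInFlight` are made 
up): cap the number of in-flight outbound buffers with a semaphore, release a 
permit when Netty reports the write as complete, and block before creating the 
next buffer. It assumes the caller is not the channel's event loop thread, 
since the blocking acquire would otherwise stall the loop that has to release 
the permit.
    
    ```java
    import java.util.concurrent.Semaphore;
    
    import io.netty.channel.Channel;
    import io.netty.channel.ChannelFutureListener;
    
    // Hypothetical helper, not an existing Spark class: limits how many
    // outbound buffers can be in flight on a channel at once.
    public class BoundedOutboundWriter {
      private final Semaphore inFlight;
    
      public BoundedOutboundWriter(int maxInFlight) {
        this.inFlight = new Semaphore(maxInFlight);
      }
    
      // Blocks until one of the earlier writes has completed, then writes
      // the next buffer and frees the slot once Netty has flushed it.
      public void write(Channel channel, Object outboundBuffer) throws InterruptedException {
        inFlight.acquire();  // wait instead of materializing yet another buffer
        channel.writeAndFlush(outboundBuffer)
            .addListener((ChannelFutureListener) future -> inFlight.release());
      }
    }
    ```
    
    Combined with the memory-based disconnect as a fallback, something like 
that would let everyone keep fetching instead of rejecting a bunch of 
connections up front.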
    