SaurabhChawla100 edited a comment on pull request #29413:
URL: https://github.com/apache/spark/pull/29413#issuecomment-674664896


   > Yeah, so you have described the effects of it dropping the events, which 
I know. The thing I want to know is why it dropped the events in your cases.
   
   There are scenarios where a stage has a very large number of tasks and all 
of them complete in a very short time, so the queues fill up very fast and 
start dropping events after a while (the executorManagement queue, the 
appStatus queue, and even the eventLog queue). For us the executorManagement 
and appStatus queues matter most, since we have dynamic allocation enabled and 
most of the jobs are run from notebooks (Jupyter/Zeppelin).
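
   For reference, a minimal sketch of the capacity knobs involved (assuming 
Spark 3.x, where the documented 
`spark.scheduler.listenerbus.eventqueue.<name>.capacity` pattern is available; 
the sizes below are placeholders, not recommendations):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Raise the listener-bus queue capacities that tend to drop events first.
// Keys follow spark.scheduler.listenerbus.eventqueue.<name>.capacity;
// the shared default key applies to any queue without its own override.
val conf = new SparkConf()
  .set("spark.scheduler.listenerbus.eventqueue.capacity", "30000")
  .set("spark.scheduler.listenerbus.eventqueue.appStatus.capacity", "60000")
  .set("spark.scheduler.listenerbus.eventqueue.executorManagement.capacity", "60000")
  .set("spark.scheduler.listenerbus.eventqueue.eventLog.capacity", "30000")

val spark = SparkSession.builder().config(conf).getOrCreate()
```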
   
   > I'm not sure the circumstances you are running or seeing these issues, is 
this just individuals running their own clusters. If it's a centralized 
cluster, why not have some cluster defaults and just raise the default values 
there? This is essentially just doing that.
   
   We are already setting some values in the cluster defaults. But, as I said 
earlier, "There is no fixed size of the queue which can be used for all Spark 
jobs, or even for the same Spark job run on a different input set on a daily 
basis."
   That is why we are looking for some dynamism in sizing these queues and for 
some way to reduce the human effort of changing the queue sizes. IMO, changing 
the driver memory (increasing or decreasing it) is easier than setting the 
queue sizes (there are multiple queues), and an application that fails with an 
OOM is easier to debug than the impact of an event drop, since an event drop 
can surface in many ways (application hangs, wasted resources, etc.).
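
   As a rough sketch of that comparison (standard Spark config keys with 
placeholder values; in practice the driver-memory setting would go in 
spark-defaults.conf or on spark-submit rather than in application code):

```scala
import org.apache.spark.SparkConf

// One knob vs. many: driver memory is a single, coarse setting whose failure
// mode (an OOM) is loud and easy to spot, while each listener-bus queue needs
// its own size and fails silently by dropping events.
val singleKnob = new SparkConf()
  .set("spark.driver.memory", "16g")

val perQueueKnobs = new SparkConf()
  .set("spark.scheduler.listenerbus.eventqueue.appStatus.capacity", "50000")
  .set("spark.scheduler.listenerbus.eventqueue.executorManagement.capacity", "50000")
  .set("spark.scheduler.listenerbus.eventqueue.eventLog.capacity", "50000")
```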
   
   @tgravescs @Ngone51 @itskals 

