SaurabhChawla100 edited a comment on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-673398423
> sorry I'm still not seeing any difference here then increasing the size of the current queue? If both are not really allocating memory for the entire amount until runtime then either way you have to set memory of driver to be the maximum amount used. why not set the queue size to size + size * spark.set.optmized.event.queue.threshold?
>
> If you look at the driver memory used, I don't think that is very reliable. They could change very quickly.

**If you look at the driver memory used, I don't think that is very reliable. They could change very quickly.**

Yes, I agree that used driver memory can change quickly. That is why I added the validation threshold of 90 percent of used driver memory. We could make it 95 percent, or make it configurable through a conf instead of hard-coding it. With the `VariableLinkedBlockingQueue` we can change the size of the queue at run time.

If we leave the `VariableLinkedBlockingQueue` and the driver-memory validation out of consideration, I can think of the following two approaches:

1) Set the queue size to `size + size * spark.set.optmized.event.queue.threshold`. The problem with this approach is that events can still be dropped after increasing the size if incoming events arrive faster than the consumer can drain them; in that case critical events (executorManagement queue, appStatus queue, etc.) can still be dropped. It does, however, help in the scenario where there is a burst of events at one point in time: increasing the capacity by `spark.set.optmized.event.queue.threshold` can prevent the queue from overflowing.

2) Make the queue unbounded. The only problem with this approach is the driver running out of memory (OOM), but it guarantees no drops when the events are really important and we cannot afford to lose them while running a Spark job.
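To make the run-time resizing idea concrete, here is a minimal sketch of a blocking queue whose capacity can be raised while the application runs, in the spirit of the `VariableLinkedBlockingQueue` discussed above. All names here are illustrative; this is not Spark's actual implementation.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of an event queue with a capacity that can change at run time.
// offer() is non-blocking: it returns false (the event is "dropped") when
// the queue is at capacity, mirroring how a bounded listener queue behaves.
class ResizableEventQueue<E> {
    private final ArrayDeque<E> buf = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();
    private volatile int capacity;

    ResizableEventQueue(int capacity) {
        this.capacity = capacity;
    }

    // Try to enqueue; false means the event would have been dropped.
    boolean offer(E e) {
        lock.lock();
        try {
            if (buf.size() >= capacity) {
                return false;
            }
            buf.add(e);
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Dequeue the oldest event, or null if empty.
    E poll() {
        lock.lock();
        try {
            return buf.poll();
        } finally {
            lock.unlock();
        }
    }

    // Grow (or shrink) the queue at run time, e.g. when occupancy crosses a
    // threshold and driver memory usage is still below a safety limit
    // such as the 90-95 percent check discussed above.
    void setCapacity(int newCapacity) {
        this.capacity = newCapacity;
    }

    int size() {
        lock.lock();
        try {
            return buf.size();
        } finally {
            lock.unlock();
        }
    }
}
```

With this shape, a monitor thread can call `setCapacity` when the queue fills up instead of dropping events, which is the difference from simply starting with a larger fixed-size queue: memory grows only if and when the burst actually happens.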
For example:

a) Some users cannot afford to drop events from the appStatus queue, e.g. when running Spark jobs from notebooks (Jupyter/Zeppelin).

b) Other users need dynamic allocation of resources (upscaling/downscaling) to behave consistently.

c) Other users want all the events in the event log file, which they later feed into ML models to analyse their Spark jobs.

An unbounded queue requires a larger driver memory, but some Spark jobs do not need that much. We cannot raise the default driver memory either, since many applications run fine with the default of 1 GB. We could instead make any such approach configurable on a per-queue basis according to user requirements, as I have already done in this PR with the conf (`spark.set.optmized.event.queue`) that enables the `VariableLinkedBlockingQueue` at the start of the application.

Happy to discuss any other idea for tuning these queues @Ngone51 @tgravescs
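The per-queue idea can be sketched as follows: each queue gets its own threshold, and the effective capacity is computed as `size + size * threshold`. The conf keys and the base size of 10000 below are assumptions for illustration only, not actual Spark configuration.

```java
import java.util.Map;

// Hypothetical per-queue sizing: capacity = base + base * threshold.
// Queue names match the listener queues mentioned above; thresholds and
// the base size are made-up example values.
class QueueSizing {
    static int sizedCapacity(int baseSize, double threshold) {
        return baseSize + (int) (baseSize * threshold);
    }

    public static void main(String[] args) {
        int base = 10000; // assumed default event-queue capacity
        Map<String, Double> perQueueThreshold = Map.of(
            "appStatus", 0.5,          // notebook users: keep UI events
            "executorManagement", 0.2  // dynamic-allocation events
        );
        perQueueThreshold.forEach((queue, t) ->
            System.out.println(queue + " -> " + sizedCapacity(base, t)));
    }
}
```

This keeps the default footprint for jobs that do not need it while letting users who depend on a particular queue (notebooks, dynamic allocation, event-log consumers) pay for extra headroom only where it matters.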