LantaoJin opened a new pull request #29869: URL: https://github.com/apache/spark/pull/29869
### What changes were proposed in this pull request?
Add a configuration `LISTENER_BUS_ALLOW_EXTERNAL_ACCUMULATORS_ENTER_EVENT` and update the external accumulators (those whose names do not start with `InternalAccumulator.METRICS_PREFIX`) before they enter `eventProcessLoop`. This prevents the driver from holding heavy accumulators in the event loop for a long time, so their memory can be released in a young GC.

### Why are the changes needed?
We use Spark + Delta Lake, and recently we found that our Spark driver hit a very heavy full-GC problem when users submitted a MERGE INTO query. The driver held over 100GB of memory (depending on how large the max heap size is set) that could never be collected. A heap dump revealed the root cause.

<img width="839" alt="Screen Shot 2020-09-25 at 11 35 01 AM" src="https://user-images.githubusercontent.com/1853780/94225514-c3159380-ff27-11ea-9b29-22bb4fa77533.png">

As the heap dump above shows, Delta uses a `SetAccumulator` to record the names of touched files:

```scala
// Accumulator to collect all the distinct touched files
val touchedFilesAccum = new SetAccumulator[String]()
spark.sparkContext.register(touchedFilesAccum, TOUCHED_FILES_ACCUM_NAME)

// UDF to record the names of touched files and add them to the accumulator
val recordTouchedFileName = udf { (fileName: String) => {
  touchedFilesAccum.add(fileName)
  1
}}.asNondeterministic()
```

In a big query, each task may hold thousands of file names, and if a stage contains tens of thousands of tasks, the DAGScheduler may hold millions of `CompletionEvent`s. Each `CompletionEvent` holds thousands of file names in its `accumUpdates`. All accumulator objects are delivered to the event loop via Spark listener events, and even a full GC cannot release the memory.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added a UT
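The filtering idea described above can be sketched as follows. This is only an illustration of the approach, not the PR's actual patch: the names `AccumUpdate`, `stripExternalAccums`, and `MetricsPrefix` are hypothetical stand-ins (in Spark, `InternalAccumulator.METRICS_PREFIX` is `"internal.metrics."`).

```scala
// Sketch: before a completion event enters the scheduler event loop, drop the
// heavy payloads of external accumulators, i.e. those whose names do not start
// with the internal-metrics prefix. All names here are illustrative.
object ExternalAccumFilter {
  // Stand-in for InternalAccumulator.METRICS_PREFIX ("internal.metrics.")
  val MetricsPrefix = "internal.metrics."

  // Minimal stand-in for an accumulator update carried by a CompletionEvent.
  final case class AccumUpdate(name: Option[String], payload: Seq[String])

  // Keep internal metric updates as-is; replace external accumulator payloads
  // with an empty one so the queued event no longer pins large objects.
  def stripExternalAccums(updates: Seq[AccumUpdate]): Seq[AccumUpdate] =
    updates.map { u =>
      if (u.name.exists(_.startsWith(MetricsPrefix))) u
      else u.copy(payload = Seq.empty)
    }
}
```

With this shape, an update such as `AccumUpdate(Some("touchedFiles"), Seq("part-0001", ...))` would have its payload emptied before queueing, while `AccumUpdate(Some("internal.metrics.executorRunTime"), ...)` would pass through unchanged; the external accumulator's values would need to be merged into the driver-side accumulator first, which is the part this sketch elides.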