[ 
https://issues.apache.org/jira/browse/SPARK-37640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muhong updated SPARK-37640:
---------------------------
    Description: 
When {{spark.eventLog.rolling.enabled}} is set to true, the event log is
rolled and compacted (and compressed when "spark.eventLog.compression.codec"
is set). The resulting directory tree looks like this:

root dir: /spark2xJobHistory2x/eventlog_v2_application_xxxxxxxxxxx_xxx_1

files in dir:

/spark2xJobHistory2x/eventlog_v2_application_xxxx_xxx_1/events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd
/spark2xJobHistory2x/eventlog_v2_application_xxxx_xxx_1/events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd
/spark2xJobHistory2x/eventlog_v2_application_xxxx_xxx_1/events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd
......
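For reference, the rolling and compression behaviour above comes from settings along these lines (a minimal sketch; the exact values shown here are illustrative, not taken from the ticket):

```properties
# spark-defaults.conf (illustrative values)
spark.eventLog.enabled               true
spark.eventLog.rolling.enabled       true
spark.eventLog.rolling.maxFileSize   128m
spark.eventLog.compression.codec     zstd
```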

 

For a long-running Spark application, the history server never cleans up the
'events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd' files in
/spark2xJobHistory2x/eventlog_v2_application_xxxxxxxxxxx_xxx_1, so the
directory keeps growing for the whole lifetime of the app.

So I think we should provide a mechanism for users to clean up the
"events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd" files in the
/spark2xJobHistory2x/eventlog_v2_application_xxxxxxxxxxx_xxx_1 directory.

 

Our solution: add a cleanup function to
"[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala#checkForLogs]".
This function lists the files in
"/spark2xJobHistory2x/eventlog_v2_application_xxxxxxxxxxx_xxx_1" and deletes
the "events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd" files according to the
"spark.history.fs.cleaner.maxAge" config. This stops the unbounded space
growth, but it loses some events, especially the start event, so the history
server can no longer render the event log correctly.
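As a sketch of what such an age-based cleanup could look like (the {{RolledLogCleaner}} object and its {{selectExpired}} helper are hypothetical illustrations, not Spark's actual API; only the file-name pattern and the "spark.history.fs.cleaner.maxAge" semantics come from this ticket):

```scala
// Hypothetical helper sketching the proposed cleanup: given the listing of
// an eventlog_v2_* directory, pick the rolled "events_*" parts whose age
// exceeds the configured maximum (spark.history.fs.cleaner.maxAge).
object RolledLogCleaner {
  // files:    (fileName, lastModifiedMs) pairs from listing the directory
  // nowMs:    current time in milliseconds
  // maxAgeMs: parsed value of spark.history.fs.cleaner.maxAge
  def selectExpired(files: Seq[(String, Long)],
                    nowMs: Long,
                    maxAgeMs: Long): Seq[String] =
    files.collect {
      // Only target rolled "events_*" parts; leave appstatus_* markers and
      // any ".compact" result alone so the application can still be replayed.
      case (name, mtime)
          if name.startsWith("events_") &&
             !name.endsWith(".compact") &&
             nowMs - mtime > maxAgeMs =>
        name
    }
}
```

Note that blindly deleting every expired part, including the earliest one, drops the application-start event, which is exactly the replay problem described above; a proper fix would have to preserve whatever the compacted file needs.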

 

So we will still need a more proper way to solve this.


> rolled event log still need be clean after compact
> --------------------------------------------------
>
>                 Key: SPARK-37640
>                 URL: https://issues.apache.org/jira/browse/SPARK-37640
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.1.1
>            Reporter: muhong
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
