[ 
https://issues.apache.org/jira/browse/SPARK-37640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muhong updated SPARK-37640:
---------------------------
    Description: 
When "spark.eventLog.rolling.enabled=true" is set, the event log is rolled and 
compacted (when "spark.eventLog.compression.codec" is set), and the directory 
tree looks like this:

root dir: /spark2xJobHistory2x/eventlog_v2_application_xxxxxxxxxxx_xxx_1

files in dir:

/spark2xJobHistory2x/eventlog_v2_application_xxxx_xxx_1/events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd
/spark2xJobHistory2x/eventlog_v2_application_xxxx_xxx_1/events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd
/spark2xJobHistory2x/eventlog_v2_application_xxxx_xxx_1/events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd
......

For a long-running Spark application, the history server does not clean up the 
'events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd' files in 
/spark2xJobHistory2x/eventlog_v2_application_xxxxxxxxxxx_xxx_1, so the 
directory keeps growing for the whole lifetime of the application. 

I think we should provide a mechanism for users to clean up the 
"events_xxxx_application_xxxxxxxxxxx_xxxx_1.zstd" files in the 
/spark2xJobHistory2x/eventlog_v2_application_xxxxxxxxxxx_xxx_1 directory.
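
To illustrate the kind of mechanism being asked for, here is a minimal sketch of a retention rule over the rolled files. It is not Spark code: the function name select_stale_event_files and the files_to_retain parameter are hypothetical, and the "events_<index>_..." naming is only assumed from the listing above. Deleting rolled files outright would also discard events that a full replay might need, so treat this as a picture of the selection logic only.

```python
import re

# Assumed file naming, inferred from the listing above:
#   events_<index>_<application id and attempt>[.<codec>]
_EVENT_FILE = re.compile(r"^events_(\d+)_")


def select_stale_event_files(file_names, files_to_retain):
    """Return rolled event log files outside the newest `files_to_retain`.

    Files are ordered by the numeric index in their name; names that do not
    match the assumed pattern are ignored and never selected.
    """
    if files_to_retain < 1:
        raise ValueError("files_to_retain must be at least 1")

    indexed = []
    for name in file_names:
        match = _EVENT_FILE.match(name)
        if match:
            indexed.append((int(match.group(1)), name))

    indexed.sort()  # oldest (lowest index) first
    # Everything except the last `files_to_retain` entries is a candidate.
    return [name for _, name in indexed[:-files_to_retain]]
```

A caller would list the application's eventlog_v2_* directory, pass the base file names to this function, and then decide what to do with the returned candidates.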

 

  was:when


> rolled event log still need be clean after compact
> --------------------------------------------------
>
>                 Key: SPARK-37640
>                 URL: https://issues.apache.org/jira/browse/SPARK-37640
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.1.1
>            Reporter: muhong
>            Priority: Major


