Hai Tao created SPARK-39225:
-------------------------------

Summary: History Server initial scan may block new eventlog update
Key: SPARK-39225
URL: https://issues.apache.org/jira/browse/SPARK-39225
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.2.1, 3.1.2, 3.0.3
Reporter: Hai Tao
Fix For: 3.2.1, 3.1.2, 3.0.3

The Spark History Server (SHS) currently performs poorly when there are a large number of eventlog files under eventLog.dir. When an SHS starts, the initial scan may take a long time, and new eventlog files are not scanned or parsed until the initial scan completes.

For example, if the initial scan takes 1-2 days (which is not uncommon in large environments), newly finished Spark jobs do not show up in the SHS, because their eventlog files are not scanned or parsed until the initial scan finishes. The SHS is therefore effectively unusable for 1-2 days, since newly finished Spark jobs are the ones users are most likely to query.
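
To make the ordering problem concrete, here is a minimal toy model in Python. It is not Spark's History Server code: `EventLog`, `blocking_scan_order`, `newest_first_order`, and `position` are hypothetical names introduced only for illustration, and the backlog size and modification times are made up. It compares the reported behavior (the startup backlog is parsed before anything that arrives later) with one possible alternative, assumed here, that always picks the most recently modified pending log first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventLog:
    """Hypothetical stand-in for an eventlog file."""
    name: str
    mtime: int  # last-modified time, in arbitrary units


def blocking_scan_order(backlog, arrivals):
    """Model of the reported behavior: everything present at startup is
    parsed first; logs that arrive during the scan wait until it ends."""
    return list(backlog) + list(arrivals)


def newest_first_order(backlog, arrivals):
    """Assumed alternative (not Spark's actual fix): parse pending logs
    in descending order of modification time."""
    pending = list(backlog) + list(arrivals)
    return sorted(pending, key=lambda log: log.mtime, reverse=True)


def position(order, name):
    """Number of logs parsed before the log called `name`."""
    return next(i for i, log in enumerate(order) if log.name == name)


# 1000 old logs are already on disk when the server starts ...
backlog = [EventLog(f"old-app-{i}", mtime=i) for i in range(1000)]
# ... and one job finishes while the initial scan is still running.
arrivals = [EventLog("new-app", mtime=5000)]

blocking = blocking_scan_order(backlog, arrivals)
prioritized = newest_first_order(backlog, arrivals)

print(position(blocking, "new-app"))     # logs parsed before the new one
print(position(prioritized, "new-app"))
```

In this model the new job's log waits behind the entire backlog in the blocking case, while the newest-first ordering parses it immediately. A real implementation would also have to handle logs that are still being written and re-list the directory during the scan, which this sketch ignores.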