[ https://issues.apache.org/jira/browse/SPARK-28867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949767#comment-16949767 ]
Imran Rashid commented on SPARK-28867:
--------------------------------------

This is closely related to SPARK-20656. It's not *quite* a duplicate, because that issue was about reparsing the logs for the same application within the same SHS instance, so the SHS still had whatever state it needed stored in memory. Here you're also talking about speeding up parsing of those files even when the SHS is restarted, which additionally requires some way to restore that state across SHS restarts.

> InMemoryStore checkpoint to speed up replay log file in HistoryServer
> ---------------------------------------------------------------------
>
>                 Key: SPARK-28867
>                 URL: https://issues.apache.org/jira/browse/SPARK-28867
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: wuyi
>            Priority: Major
>
> HistoryServer now could be very slow to replay a large log file at the first
> time and it always re-replay an inprogress log file after it changes. we
> could periodically checkpoint InMemoryStore to speed up replay log file.

--
This message was sent by Atlassian Jira (v8.3.4#803005)