Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19770#discussion_r154814962
  
    --- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
    @@ -643,6 +633,44 @@ private[history] class FsHistoryProvider(conf: 
SparkConf, clock: Clock)
         } finally {
           iterator.foreach(_.close())
         }
    +
    +    // Clean corrupt or empty files that may have accumulated.
    +    if (AGGRESSIVE_CLEANUP) {
    +      var untracked: Option[KVStoreIterator[LogInfo]] = None
    +      try {
    +        untracked = Some(listing.view(classOf[LogInfo])
    --- End diff --
    
So I spent some time reading my own patch, and it covers a slightly different 
case: my patch handles deleting SHS (Spark History Server) state when the 
underlying files are deleted, while this one handles deleting files that the 
SHS decides are broken. I still think some code / state can be saved by 
handling both similarly - still playing with my code, though.
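
For illustration only, here is a minimal sketch of what "handling both similarly" could look like: a single cleanup pass that drops the listing entry in both cases and additionally deletes the file when the SHS itself has judged it broken. `ListingStore`, `LogEntry`, `fileExists`, `isBroken`, and `deleteFile` are hypothetical stand-ins for the KVStore-backed listing and FileSystem calls in `FsHistoryProvider`, not the actual Spark API.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the KVStore-backed listing in FsHistoryProvider.
case class LogEntry(path: String, isComplete: Boolean)

class ListingStore {
  private val entries = mutable.Map[String, LogEntry]()
  def put(e: LogEntry): Unit = entries(e.path) = e
  def all(): Seq[LogEntry] = entries.values.toSeq
  def remove(path: String): Unit = entries.remove(path)
}

object CleanupSketch {
  // fileExists / isBroken / deleteFile are placeholders for the
  // FileSystem checks and deletes the SHS would perform.
  def cleanup(
      listing: ListingStore,
      fileExists: String => Boolean,
      isBroken: String => Boolean,
      deleteFile: String => Unit): Unit = {
    listing.all().foreach { entry =>
      if (!fileExists(entry.path)) {
        // Case 1: the file was deleted out from under the SHS;
        // only the stale listing state needs to go.
        listing.remove(entry.path)
      } else if (isBroken(entry.path)) {
        // Case 2: the SHS decided the file is corrupt or empty;
        // drop both the file and its listing state.
        deleteFile(entry.path)
        listing.remove(entry.path)
      }
    }
  }
}
```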


