HeartSaVioR commented on issue #23850: [SPARK-26949][SS] Prevent 'purge' to remove needed batch files in CompactibleFileStreamLog
URL: https://github.com/apache/spark/pull/23850#issuecomment-501176314

@dongjoon-hyun Thanks for taking a look at the patch.

> If CompactibleFileStreamLog calls purge only when isCompactionBatch returns true, does purge fail in that case?

Let me clarify the issue: the internal state breaks when the batches to purge include the latest compaction batch, because subsequent batches refer to that compaction batch. I've described alternatives as well, so please take a look at my previous comment: https://github.com/apache/spark/pull/23850#issuecomment-465861957

Btw, even if we could purge batches earlier than the latest compaction batch, CompactibleFileStreamLog already does that cleanup in `deleteExpiredLog`, so it is not actually needed. (I'd like to let CompactibleFileStreamLog be responsible for taking care of its logs by itself.)
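
To make the failure condition concrete, here is a minimal in-memory sketch. It is a toy model, not Spark's actual `CompactibleFileStreamLog`: the object `CompactLogSketch`, the helpers `writeBatches` and `allEntries`, the `file-N` entries, and the interval of 3 are all hypothetical, and the rule that `purge` drops every batch with an id below the threshold is an assumption made for illustration.

```scala
// Toy in-memory model of a compactible log; NOT Spark's implementation.
object CompactLogSketch {
  // Hypothetical compaction interval, chosen only for illustration.
  val compactInterval = 3

  // Assumed rule: every compactInterval-th batch is a compaction batch.
  def isCompactionBatch(batchId: Long): Boolean =
    (batchId + 1) % compactInterval == 0

  // Write batches 0..lastBatch. A compaction batch stores every entry so far;
  // any other batch stores only its own entry.
  def writeBatches(lastBatch: Long): Map[Long, Seq[String]] =
    (0L to lastBatch).map { id =>
      val content: Seq[String] =
        if (isCompactionBatch(id)) (0L to id).map(i => s"file-$i")
        else Seq(s"file-$id")
      id -> content
    }.toMap

  // Generic purge (assumed semantics): drop every batch with id < threshold.
  def purge(log: Map[Long, Seq[String]], threshold: Long): Map[Long, Seq[String]] =
    log.filter { case (id, _) => id >= threshold }

  // Rebuild the full entry list: read the latest compaction batch at or before
  // `latest`, then every batch after it. Fails if any needed batch is gone.
  def allEntries(log: Map[Long, Seq[String]], latest: Long): Either[String, Seq[String]] = {
    val startId = (0L to latest).reverse.find(isCompactionBatch).getOrElse(0L)
    val needed = startId to latest
    needed.find(id => !log.contains(id)) match {
      case Some(missing) => Left(s"batch $missing is missing")
      case None          => Right(needed.flatMap(id => log(id)))
    }
  }
}
```

With batches 0 to 7 written, batches 2 and 5 are compaction batches in this model. `purge(log, 5L)` removes only batches older than the latest compaction batch, so `allEntries` still succeeds; `purge(log, 6L)` removes batch 5 itself, and `allEntries` returns `Left("batch 5 is missing")` because batches 6 and 7 depend on it.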