Marcelo Vanzin created SPARK-29105:
--------------------------------------

             Summary: SHS may delete driver log file of in progress application
                 Key: SPARK-29105
                 URL: https://issues.apache.org/jira/browse/SPARK-29105
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.0.0
            Reporter: Marcelo Vanzin


There's an issue with how the Spark History Server (SHS) cleans driver logs
that is similar to the problem with event logs: because the reported file size
is not updated while the file is being written, the SHS fails to detect
activity and may delete the file while it is still being written to.
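
To make the failure mode concrete, here is a minimal sketch of a cleaner that treats a change in reported length as the only sign of activity. It is a toy model, not the SHS code: `LogInfo`, `scan`, and the `max_age_ms` parameter are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogInfo:
    """What the cleaner remembers about a log between scans (hypothetical)."""
    last_seen_length: int
    last_activity_ms: int


def scan(tracked, listing, now_ms, max_age_ms):
    """Return the paths a length-based cleaner would delete on this scan.

    tracked: dict of path -> LogInfo, updated in place
    listing: dict of path -> file length as reported by the file system
    """
    to_delete = []
    for path, length in listing.items():
        info = tracked.get(path)
        if info is None or info.last_seen_length != length:
            # New file, or the reported length changed: count it as activity.
            tracked[path] = LogInfo(length, now_ms)
        elif now_ms - info.last_activity_ms > max_age_ms:
            # No visible change for longer than max_age_ms: looks abandoned.
            to_delete.append(path)
    for path in to_delete:
        del tracked[path]
    return to_delete
```

If the driver keeps writing but the listing keeps reporting the same length, two scans spaced further apart than `max_age_ms` are enough for the file to be selected for deletion, even though the application is still running.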

SPARK-24787 added a workaround in the SHS so that it can detect that situation
for in-progress apps, replacing the previous solution, which was too slow for
event logs.

But that workaround doesn't work for driver logs, because they do not follow
the same naming pattern (event logs use different file names for in-progress
files). Applying it would require the SHS to open the driver log files on
every scan, which is expensive.

The old approach (using the {{hsync}} API) seems to be a good match for the
driver logs, though, since writing them does not slow down the listener bus
the way writing event logs does.
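
The idea behind the {{hsync}} approach can be sketched with a toy model in which writes do not change the length a listing reports until the writer syncs. Nothing below is Hadoop's or Spark's API: `ToyLogStream`, `PeriodicSyncWriter`, and `sync_interval_ms` are hypothetical names, and the `hsync` method here only imitates a sync that also publishes the current file length.

```python
class ToyLogStream:
    """Toy output stream whose visible length lags behind writes."""

    def __init__(self):
        self._written = 0        # bytes the writer has produced so far
        self.visible_length = 0  # length a directory listing would report

    def write(self, data: bytes) -> None:
        # Writing alone does not change what a listing reports.
        self._written += len(data)

    def hsync(self) -> None:
        # Model of a sync that also publishes the current length.
        self.visible_length = self._written


class PeriodicSyncWriter:
    """Calls hsync() at most once per interval (hypothetical helper)."""

    def __init__(self, stream, sync_interval_ms):
        self.stream = stream
        self.sync_interval_ms = sync_interval_ms
        self._last_sync_ms = None  # no sync has happened yet

    def append(self, line: str, now_ms: int) -> None:
        self.stream.write(line.encode("utf-8"))
        due = (self._last_sync_ms is None
               or now_ms - self._last_sync_ms >= self.sync_interval_ms)
        if due:
            # Publish the length so a scanner can see recent activity.
            self.stream.hsync()
            self._last_sync_ms = now_ms
```

In this model the cost of syncing is paid by the writer rather than by the scanner, which avoids opening every file on each scan. As long as the sync interval is shorter than the cleaner's age limit, a length-based check sees activity for any log that is still being written.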



