[ https://issues.apache.org/jira/browse/SPARK-17475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shixiong Zhu updated SPARK-17475:
---------------------------------
    Component/s:     (was: DStreams)
                 Structured Streaming

> HDFSMetadataLog should not leak CRC files
> -----------------------------------------
>
>                 Key: SPARK-17475
>                 URL: https://issues.apache.org/jira/browse/SPARK-17475
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Structured Streaming
>    Affects Versions: 2.0.1
>            Reporter: Frederick Reiss
>            Assignee: Frederick Reiss
>             Fix For: 2.1.0
>
>
> When HDFSMetadataLog uses a log directory on a filesystem other than HDFS
> (e.g. NFS or the driver node's local filesystem), the class leaves orphan
> checksum (CRC) files in the log directory. The files have names that follow
> the pattern "..[long UUID hex string].tmp.crc". These files exist because
> HDFSMetadataLog renames its temporary files without renaming the
> corresponding checksum files. There is one CRC file per batch, so the
> directory fills up quite quickly.
> I'm not certain, but this problem might also occur on certain versions of the
> HDFS APIs.
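
The leak described above can be reproduced outside Spark with a small sketch. It is not the actual patch and does not use the Hadoop API; it only assumes the ".<name>.crc" sidecar naming that Hadoop's checksummed local filesystem uses. All function names and the "<batch>.tmp" temp-file name are hypothetical, chosen for illustration.

```python
import tempfile
from pathlib import Path
from typing import List


def checksum_path(data_file: Path) -> Path:
    """Sidecar path under the assumed '.<name>.crc' naming convention."""
    return data_file.with_name(f".{data_file.name}.crc")


def rename_leaking(src: Path, dst: Path) -> None:
    """Rename only the data file; the sidecar written for src stays behind."""
    src.rename(dst)


def rename_clean(src: Path, dst: Path) -> None:
    """Rename the data file, then delete the now-stale sidecar for src."""
    src.rename(dst)
    checksum_path(src).unlink(missing_ok=True)  # Python 3.8+


def find_orphan_crcs(log_dir: Path) -> List[Path]:
    """Return '.<name>.crc' files whose data file no longer exists."""
    orphans = []
    for entry in log_dir.iterdir():
        name = entry.name
        if name.startswith(".") and name.endswith(".crc"):
            data_name = name[1:-len(".crc")]
            if not (log_dir / data_name).exists():
                orphans.append(entry)
    return sorted(orphans)


def demo() -> List[str]:
    """Write two batches, one per rename strategy, and list the orphans."""
    with tempfile.TemporaryDirectory() as d:
        log_dir = Path(d)
        for batch, rename in ((0, rename_leaking), (1, rename_clean)):
            tmp = log_dir / f"{batch}.tmp"  # hypothetical temp-file name
            tmp.write_text("metadata")
            checksum_path(tmp).write_text("crc")
            rename(tmp, log_dir / str(batch))
        return [p.name for p in find_orphan_crcs(log_dir)]


if __name__ == "__main__":
    print(demo())
```

Only the batch renamed without sidecar cleanup leaves a CRC file behind, which matches the one-orphan-per-batch growth reported in the issue. A helper like find_orphan_crcs could also be pointed at an existing log directory to gauge how many stale files have accumulated, though it should be checked against the real file names first.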