[ https://issues.apache.org/jira/browse/SPARK-43991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
shuyouZZ updated SPARK-43991: ----------------------------- Summary: Use the compression codec set by the spark config file when write compact log (was: Use the compression codec set by the spark config file) > Use the compression codec set by the spark config file when write compact log > ----------------------------------------------------------------------------- > > Key: SPARK-43991 > URL: https://issues.apache.org/jira/browse/SPARK-43991 > Project: Spark > Issue Type: Bug > Components: Spark Core, Web UI > Affects Versions: 3.4.0 > Reporter: shuyouZZ > Priority: Major > > Currently, if enable rolling log in SHS, only {{originalFilePath}} is used to > determine the path of compact file. > {code:java} > override val logPath: String = originalFilePath.toUri.toString + > EventLogFileWriter.COMPACTED > {code} > If the user set {{spark.eventLog.compression.codec}} in spark conf, when the > log compact logic is triggered, the old event log file will be compacted and > use the compression codec set by the spark default config file. > {code:java} > protected val compressionCodec = > if (shouldCompress) { > Some(CompressionCodec.createCodec(sparkConf, > sparkConf.get(EVENT_LOG_COMPRESSION_CODEC))) > } else { > None > } > private[history] val compressionCodecName = compressionCodec.map { c => > CompressionCodec.getShortName(c.getClass.getName) > } > {code} > However, The compression codec used by EventLogFileReader to read log is > split from the log path, this will lead to EventLogFileReader can not read > the compacted log file normally. > {code:java} > def codecName(log: Path): Option[String] = { > // Compression codec is encoded as an extension, e.g. app_123.lzf > // Since we sanitize the app ID to not include periods, it is safe to > split on it > val logName = log.getName.stripSuffix(COMPACTED).stripSuffix(IN_PROGRESS) > logName.split("\\.").tail.lastOption > } > {code} > So we should improve the {{logPath}} method in class > CompactedEventLogFileWriter, use compression codec set by the spark default > config. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org