Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23241#discussion_r239609592

    --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala ---
    @@ -197,4 +201,8 @@ class ZStdCompressionCodec(conf: SparkConf) extends CompressionCodec {
         // avoid overhead excessive of JNI call while trying to uncompress small amount of data.
         new BufferedInputStream(new ZstdInputStream(s), bufferSize)
       }
    +
    +  override def zstdEventLogCompressedInputStream(s: InputStream): InputStream = {
    +    new BufferedInputStream(new ZstdInputStream(s).setContinuous(true), bufferSize)
    --- End diff ---

    > Is it actually desirable to not fail on a partial frame?

    If you're reading a shuffle file compressed with zstd, and the shuffle file is corrupted somehow, this change may be allowing Spark to read incomplete shuffle data...
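
For readers following along: the change relies on zstd-jni's ZstdInputStream.setContinuous(true), which makes the decompressor tolerate a frame whose end has not been written yet (the in-progress event-log case) instead of treating it as an error. The snippet below is a minimal, self-contained sketch of that behavior difference, not code from this PR; the exact exception raised by the strict default on a truncated frame is an assumption about zstd-jni's behavior, not something stated in this thread.

    import java.io.{ByteArrayInputStream, ByteArrayOutputStream, IOException}
    import java.nio.charset.StandardCharsets

    import com.github.luben.zstd.{ZstdInputStream, ZstdOutputStream}

    object PartialFrameSketch {
      def main(args: Array[String]): Unit = {
        // Compress some sample "event log" lines into a single zstd frame.
        val raw = ("{\"Event\":\"SparkListenerLogStart\"}\n" * 1000)
          .getBytes(StandardCharsets.UTF_8)
        val compressedBuf = new ByteArrayOutputStream()
        val zout = new ZstdOutputStream(compressedBuf)
        zout.write(raw)
        zout.close()
        val compressed = compressedBuf.toByteArray

        // Simulate an application that is still running: keep only half of the
        // frame, as if the writer had not flushed/closed the file yet.
        val truncated = compressed.take(compressed.length / 2)
        val buf = new Array[Byte](8192)

        // Continuous mode (what the PR uses for event logs): read whatever has
        // been decompressed so far and stop, instead of erroring out on the
        // unfinished frame.
        val relaxed =
          new ZstdInputStream(new ByteArrayInputStream(truncated)).setContinuous(true)
        val recovered = new ByteArrayOutputStream()
        var n = relaxed.read(buf)
        while (n > 0) {
          recovered.write(buf, 0, n)
          n = relaxed.read(buf)
        }
        println(s"continuous mode recovered ${recovered.size()} of ${raw.length} bytes")

        // Strict default mode (what the shuffle path keeps): the same truncated
        // bytes are expected to raise an IOException once input runs out mid-frame.
        // (Assumption about zstd-jni's behavior; the exact message may vary by version.)
        try {
          val strict = new ZstdInputStream(new ByteArrayInputStream(truncated))
          while (strict.read(buf) > 0) {}
          println("strict mode read completed (unexpected for a truncated frame)")
        } catch {
          case e: IOException => println(s"strict mode failed as expected: ${e.getMessage}")
        }
      }
    }

That contrast is the point of the comment above: tolerating an unfinished frame is useful for an .inprogress event log, but if the same relaxed stream were ever used on the shuffle read path, a corrupted shuffle block could be silently read as a shorter-than-expected stream instead of failing loudly.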