Aaditya Ramesh created SPARK-19525:
--------------------------------------

             Summary: Enable Compression of Spark Streaming Checkpoints
                 Key: SPARK-19525
                 URL: https://issues.apache.org/jira/browse/SPARK-19525
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 2.1.0
            Reporter: Aaditya Ramesh

In our testing, compressing partitions with snappy while writing them to checkpoints on HDFS significantly improved performance and also reduced the variability of the checkpointing operation. For data sets with a compressed size of approximately 1 GB, checkpointing time was reduced by 3x and variability by 2x.
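
To illustrate the idea, here is a minimal sketch of compressing a partition as it is written to a checkpoint file. It is not Spark's checkpointing code: it writes to a local temporary directory instead of HDFS, uses zlib at a low compression level as a stand-in for snappy (which is not in the Python standard library), and the names write_partition, read_partition and demo are hypothetical. It only shows where compression sits in the write path and that the data round-trips.

```python
import os
import pickle
import tempfile
import zlib


def write_partition(path, records, compress=True):
    """Serialize one partition and write it to `path`; return bytes written."""
    payload = pickle.dumps(records, protocol=pickle.HIGHEST_PROTOCOL)
    if compress:
        # Assumption: zlib level 1 stands in for a fast codec such as snappy.
        payload = zlib.compress(payload, 1)
    with open(path, "wb") as f:
        f.write(payload)
    return len(payload)


def read_partition(path, compressed=True):
    """Read a partition back, decompressing it if it was written compressed."""
    with open(path, "rb") as f:
        payload = f.read()
    if compressed:
        payload = zlib.decompress(payload)
    return pickle.loads(payload)


def demo():
    """Write the same partition with and without compression and compare sizes."""
    # Synthetic, repetitive records; real checkpoint data will compress differently.
    records = [("key-%d" % (i % 100), "value" * 10) for i in range(10000)]
    with tempfile.TemporaryDirectory() as checkpoint_dir:
        raw_path = os.path.join(checkpoint_dir, "part-00000")
        compressed_path = os.path.join(checkpoint_dir, "part-00000.z")
        raw_size = write_partition(raw_path, records, compress=False)
        compressed_size = write_partition(compressed_path, records, compress=True)
        restored = read_partition(compressed_path, compressed=True)
    return raw_size, compressed_size, restored == records
```

Calling demo() returns the uncompressed size, the compressed size, and whether the compressed partition was restored intact. Fewer bytes written per partition is the likely source of the checkpointing speedup described above, though the actual gain depends on the data and on the codec.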