Aaditya Ramesh created SPARK-19525:
--------------------------------------

             Summary: Enable Compression of Spark Streaming Checkpoints
                 Key: SPARK-19525
                 URL: https://issues.apache.org/jira/browse/SPARK-19525
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 2.1.0
            Reporter: Aaditya Ramesh


In our testing, compressing partitions with Snappy while writing them to
checkpoints on HDFS improved performance significantly and also reduced the
variability of the checkpointing operation. Checkpointing time dropped by a
factor of 3 and its variability by a factor of 2 for data sets with a
compressed size of approximately 1 GB.
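
To illustrate the idea only, here is a minimal sketch of compressing a
serialized partition before it is written out. It is not Spark's checkpoint
code: the helper names (write_partition, read_partition, demo) are
hypothetical, a local temporary directory stands in for HDFS, and zlib at its
fastest level stands in for Snappy, which would need a third-party package.

```python
import os
import pickle
import tempfile
import zlib


def write_partition(path, records, compress=True):
    """Serialize one partition, optionally compress it, and write it to path.

    Returns the number of bytes written. Hypothetical helper, not a Spark API.
    """
    payload = pickle.dumps(records, protocol=pickle.HIGHEST_PROTOCOL)
    if compress:
        # Assumption: zlib level 1 (fastest) is used here as a stand-in for Snappy.
        payload = zlib.compress(payload, 1)
    with open(path, "wb") as f:
        f.write(payload)
    return len(payload)


def read_partition(path, compressed=True):
    """Read a partition written by write_partition and return its records."""
    with open(path, "rb") as f:
        payload = f.read()
    if compressed:
        payload = zlib.decompress(payload)
    return pickle.loads(payload)


def demo():
    """Write the same partition with and without compression and compare sizes."""
    records = [("key-%d" % (i % 100), "value " * 20) for i in range(10_000)]
    with tempfile.TemporaryDirectory() as d:
        raw_path = os.path.join(d, "part-00000")
        packed_path = os.path.join(d, "part-00000.z")
        raw_size = write_partition(raw_path, records, compress=False)
        packed_size = write_partition(packed_path, records, compress=True)
        restored = read_partition(packed_path, compressed=True)
    return raw_size, packed_size, restored == records


if __name__ == "__main__":
    raw_size, packed_size, round_trip_ok = demo()
    print("uncompressed bytes:", raw_size)
    print("compressed bytes:  ", packed_size)
    print("round trip intact: ", round_trip_ok)
```

The sketch only shows that fewer bytes reach storage and that the data
round-trips intact. It does not reproduce the timing or variability numbers
above, which came from our own tests on HDFS.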



