[SPARK-12091] [PYSPARK] Deprecate the JAVA-specific deserialized storage levels
The current default storage level of Python persist API is MEMORY_ONLY_SER. This is different from the default level MEMORY_ONLY in the official document and RDD APIs.

davies Is this inconsistency intentional? Thanks!

Updates: Since the data is always serialized on the Python side, the storage levels of JAVA-specific deserialization are not removed, such as MEMORY_ONLY.

Updates: Based on the reviewers' feedback. In Python, stored objects will always be serialized with the [Pickle](https://docs.python.org/2/library/pickle.html) library, so it does not matter whether you choose a serialized level. The available storage levels in Python include `MEMORY_ONLY`, `MEMORY_ONLY_2`, `MEMORY_AND_DISK`, `MEMORY_AND_DISK_2`, `DISK_ONLY`, `DISK_ONLY_2` and `OFF_HEAP`.

Author: gatorsmile <gatorsm...@gmail.com>

Closes #10092 from gatorsmile/persistStorageLevel.

Project: http://git-wip-us.apache.org/repos/asf/bahir/repo
Commit: http://git-wip-us.apache.org/repos/asf/bahir/commit/6b49590b
Tree: http://git-wip-us.apache.org/repos/asf/bahir/tree/6b49590b
Diff: http://git-wip-us.apache.org/repos/asf/bahir/diff/6b49590b

Branch: refs/heads/master
Commit: 6b49590b631651df6c6f9cb6cf44b206b3067411
Parents: c615575
Author: gatorsmile <gatorsm...@gmail.com>
Authored: Fri Dec 18 20:06:05 2015 -0800
Committer: Davies Liu <davies....@gmail.com>
Committed: Fri Dec 18 20:06:05 2015 -0800

----------------------------------------------------------------------
 streaming-mqtt/python/dstream.py | 4 ++--
 streaming-mqtt/python/mqtt.py    | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/bahir/blob/6b49590b/streaming-mqtt/python/dstream.py
----------------------------------------------------------------------
diff --git a/streaming-mqtt/python/dstream.py b/streaming-mqtt/python/dstream.py
index b994a53..adc2651 100644
--- a/streaming-mqtt/python/dstream.py
+++ b/streaming-mqtt/python/dstream.py
@@ -208,10 +208,10 @@ class DStream(object):
     def cache(self):
         """
         Persist the RDDs of this DStream with the default storage level
-        (C{MEMORY_ONLY_SER}).
+        (C{MEMORY_ONLY}).
         """
         self.is_cached = True
-        self.persist(StorageLevel.MEMORY_ONLY_SER)
+        self.persist(StorageLevel.MEMORY_ONLY)
         return self

     def persist(self, storageLevel):

http://git-wip-us.apache.org/repos/asf/bahir/blob/6b49590b/streaming-mqtt/python/mqtt.py
----------------------------------------------------------------------
diff --git a/streaming-mqtt/python/mqtt.py b/streaming-mqtt/python/mqtt.py
index 1ce4093..3a515ea 100644
--- a/streaming-mqtt/python/mqtt.py
+++ b/streaming-mqtt/python/mqtt.py
@@ -28,7 +28,7 @@ class MQTTUtils(object):
     @staticmethod
     def createStream(ssc, brokerUrl, topic,
-                     storageLevel=StorageLevel.MEMORY_AND_DISK_SER_2):
+                     storageLevel=StorageLevel.MEMORY_AND_DISK_2):
         """
         Create an input stream that pulls messages from a Mqtt Broker.
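
Note on the rationale: the commit message says that Python-side objects are always pickled, so asking for a "serialized" level changes nothing. The sketch below illustrates that idea only. It does not use PySpark: `Level`, `store`, and `load` are hypothetical names invented for this example, and the field layout of `Level` is an assumption, not the signature of PySpark's `StorageLevel`.

```python
import pickle
from collections import namedtuple

# Hypothetical stand-in for a storage level (NOT PySpark's StorageLevel).
Level = namedtuple("Level", ["use_disk", "use_memory", "replication"])

MEMORY_ONLY = Level(use_disk=False, use_memory=True, replication=1)
MEMORY_AND_DISK_2 = Level(use_disk=True, use_memory=True, replication=2)


def store(records, level):
    """Pickle every record, whatever level was requested.

    The level would decide where the bytes live and how many copies
    exist; it never decides whether the data is serialized.
    """
    return [pickle.dumps(record) for record in records]


def load(blobs):
    """Unpickle the stored bytes back into Python objects."""
    return [pickle.loads(blob) for blob in blobs]


records = [{"topic": "sensors", "value": 21.5}, ("a", 1), "plain text"]
blobs = store(records, MEMORY_ONLY)
restored = load(blobs)
```

Under these assumptions, `store(records, MEMORY_ONLY)` and `store(records, MEMORY_AND_DISK_2)` produce the same bytes, which is why a separate `_SER` variant carries no extra meaning on the Python side.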