GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19226#discussion_r138797273

    --- Diff: python/pyspark/serializers.py ---
    @@ -343,6 +343,8 @@ def _load_stream_without_unbatching(self, stream):
             key_batch_stream = self.key_ser._load_stream_without_unbatching(stream)
             val_batch_stream = self.val_ser._load_stream_without_unbatching(stream)
             for (key_batch, val_batch) in zip(key_batch_stream, val_batch_stream):
    +            key_batch = list(key_batch)
    +            val_batch = list(val_batch)
    --- End diff --

    Should we also fix the doc in `Serializer._load_stream_without_unbatching` to say that it returns an iterator of iterables?
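To illustrate what "iterator of iterables" implies for callers, here is a minimal standalone sketch. It is not the PySpark implementation: `load_batches` and `pair_batches` are hypothetical names, and the length check is an assumption about why the batches would need to be materialized. The point is only that a batch that is a generic iterable may not support `len()`, so it has to be converted with `list()` first.

```python
def load_batches(items, batch_size):
    """Hypothetical stand-in: yield each batch as a lazy iterator, not a list."""
    for start in range(0, len(items), batch_size):
        yield iter(items[start:start + batch_size])


def pair_batches(key_batches, val_batches):
    """Zip two iterators of iterables, materializing each batch first."""
    for key_batch, val_batch in zip(key_batches, val_batches):
        # A generic iterable may not support len(), so convert to lists
        # before comparing sizes (assumed check, for illustration only).
        key_batch = list(key_batch)
        val_batch = list(val_batch)
        if len(key_batch) != len(val_batch):
            raise ValueError("key and value batches have different sizes")
        yield zip(key_batch, val_batch)


pairs = [list(batch) for batch in pair_batches(
    load_batches([1, 2, 3], 2), load_batches(["a", "b", "c"], 2))]
```

If the docstring promised lists, a caller could reasonably skip the `list()` calls; documenting the return value as an iterator of iterables makes the conversion an explicit part of the contract.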