Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/20295#discussion_r162981806 --- Diff: python/pyspark/serializers.py --- @@ -267,13 +267,13 @@ def load_stream(self, stream): """ Deserialize ArrowRecordBatches to an Arrow table and return as a list of pandas.Series. """ - from pyspark.sql.types import _check_dataframe_localize_timestamps + from pyspark.sql.types import _check_series_localize_timestamps import pyarrow as pa reader = pa.open_stream(stream) for batch in reader: # NOTE: changed from pa.Columns.to_pandas, timezone issue in conversion fixed in 0.7.1 - pdf = _check_dataframe_localize_timestamps(batch.to_pandas(), self._timezone) - yield [c for _, c in pdf.iteritems()] + yield [_check_series_localize_timestamps(c.to_pandas(), self._timezone) + for c in pa.Table.from_batches([batch]).itercolumns()] --- End diff -- I actually don't know what the comment above means. @BryanCutler do you remember?
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org