[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-12-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22275 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-11-09 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r232420076 --- Diff: python/pyspark/sql/tests.py --- @@ -4923,6 +4923,28 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-11-09 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r232420015 --- Diff: python/pyspark/sql/tests.py --- @@ -4923,6 +4923,34 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-11-08 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r232145973 --- Diff: python/pyspark/sql/tests.py --- @@ -4923,6 +4923,28 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-11-06 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r231311398 --- Diff: python/pyspark/sql/tests.py --- @@ -4923,6 +4923,28 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-11-02 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r230423471 --- Diff: python/pyspark/sql/tests.py --- @@ -4923,6 +4923,28 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-10-30 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r229522939 --- Diff: python/pyspark/sql/tests.py --- @@ -4923,6 +4923,28 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-10-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r223197940 --- Diff: python/pyspark/sql/tests.py --- @@ -4434,6 +4434,12 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-10-05 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r223116201 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3279,34 +3280,33 @@ class Dataset[T] private[sql]( val timeZoneId

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-10-05 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r223116082 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3279,34 +3280,33 @@ class Dataset[T] private[sql]( val timeZoneId

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-09-21 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r219556033 --- Diff: python/pyspark/serializers.py --- @@ -208,8 +214,26 @@ def load_stream(self, stream): for batch in reader: yield

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-09-21 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r219558311 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3279,34 +3280,33 @@ class Dataset[T] private[sql]( val timeZoneId =

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-09-21 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r219556534 --- Diff: python/pyspark/serializers.py --- @@ -208,8 +214,26 @@ def load_stream(self, stream): for batch in reader: yield

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-09-21 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r219561178 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3279,34 +3280,33 @@ class Dataset[T] private[sql]( val timeZoneId =

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-09-21 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r219557215 --- Diff: python/pyspark/sql/tests.py --- @@ -4434,6 +4434,12 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-09-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r219404072 --- Diff: python/pyspark/sql/tests.py --- @@ -4434,6 +4434,12 @@ def test_timestamp_dst(self): self.assertPandasEqual(pdf,

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-08-30 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r214131313 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3279,34 +3280,33 @@ class Dataset[T] private[sql]( val timeZoneId

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-08-30 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r214129436 --- Diff: python/pyspark/serializers.py --- @@ -187,9 +187,15 @@ def loads(self, obj): class ArrowStreamSerializer(Serializer): """

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r213862165 --- Diff: python/pyspark/serializers.py --- @@ -187,9 +187,15 @@ def loads(self, obj): class ArrowStreamSerializer(Serializer): """ -

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r213860328 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3279,34 +3280,33 @@ class Dataset[T] private[sql]( val timeZoneId =

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-08-29 Thread BryanCutler
GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/22275 [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send out-of-order record batches to improve performance ## What changes were proposed in this pull request? When executing `toPandas`