[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-10 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1019820867 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -114,10 +120,93 @@ class SparkConnectStreamHandler(r

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-10 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018837877 ## connector/connect/src/main/protobuf/spark/connect/base.proto: ## @@ -83,7 +83,6 @@ message Response { int64 uncompressed_bytes = 2; Review Comment: @zhe

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018689309 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -114,10 +123,97 @@ class SparkConnectStreamHandler(r

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018688539 ## python/pyspark/sql/connect/client.py: ## @@ -400,6 +400,14 @@ def _execute_and_fetch(self, req: pb2.Request) -> typing.Optional[pandas.DataFra if len

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018688345 ## python/pyspark/sql/tests/connect/test_connect_basic.py: ## @@ -197,6 +197,17 @@ def test_range(self): .equals(self.spark.range(start=0, end=10, step

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018687996 ## sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala: ## @@ -128,6 +128,97 @@ private[sql] object ArrowConverters extends Logging {

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018685534 ## connector/connect/src/main/protobuf/spark/connect/base.proto: ## @@ -83,7 +83,6 @@ message Response { int64 uncompressed_bytes = 2; Review Comment: @gru

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018682109 ## python/pyspark/sql/connect/client.py: ## @@ -400,6 +400,14 @@ def _execute_and_fetch(self, req: pb2.Request) -> typing.Optional[pandas.DataFra if len

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018681767 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -114,10 +123,97 @@ class SparkConnectStreamHandler(r

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018681018 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -114,10 +123,97 @@ class SparkConnectStreamHandler(r

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018680299 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -114,10 +123,97 @@ class SparkConnectStreamHandler(r

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018679792 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -114,10 +123,97 @@ class SparkConnectStreamHandler(r

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-09 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1018678853 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -48,19 +51,25 @@ class SparkConnectStreamHandler(res

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-03 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1013556642 ## sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala: ## @@ -128,6 +128,65 @@ private[sql] object ArrowConverters extends Logging {

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-03 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1013555606 ## python/pyspark/sql/connect/client.py: ## @@ -182,6 +191,10 @@ def _to_pandas(self, plan: pb2.Plan) -> Optional[pandas.DataFrame]: req = pb2.Request()

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-03 GitBox
HyukjinKwon commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1013528536 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -117,7 +131,36 @@ class SparkConnectStreamHandler(re