Yuheng Chang created SPARK-54452:
------------------------------------
Summary: Fix empty response from SparkConnect server for
spark.sql(...) inside FlowFunction
Key: SPARK-54452
URL: https://issues.apache.org/jira/browse/SPARK-54452
Project: Spark
Issue Type: Sub-task
Components: Declarative Pipelines
Affects Versions: 4.1.0
Reporter: Yuheng Chang
In PR SPARK-54020, we added support for {{spark.sql(...)}} inside a FlowFunction
for SDP. For these calls, instead of eagerly executing the SQL, the Spark
Connect server should return the raw logical plan to the client and defer
execution to the flow function.
However, in that PR we constructed the response object but forgot to actually
return it to the Spark Connect client, so the client received an empty response.
This went unnoticed in tests because, when the client sees an empty
{{spark.sql(...)}} response, [it falls back to creating an empty DataFrame
holding the raw logical
plan|https://github.com/apache/spark/blob/master/python/pyspark/sql/connect/session.py#L829-L835],
which happens to match the desired behavior. We should fix the bug by
returning the proper response instead of relying on that implicit fallback.
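
The failure mode described above can be illustrated with a small toy model. This is only a sketch and not the actual Spark Connect code: {{SqlResponse}}, {{plan_for}}, {{handle_sql_buggy}}, {{handle_sql_fixed}} and {{client_sql}} are all hypothetical names, and a plain string stands in for the raw logical plan. It shows why the client ends up with the same plan whether or not the server returns its response.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SqlResponse:
    """Hypothetical stand-in for the server's spark.sql(...) response."""
    plan: str  # stands in for the raw, unexecuted logical plan


def plan_for(query: str) -> str:
    # Hypothetical helper: pretend this string is the raw logical plan.
    return f"Plan[{query}]"


def handle_sql_buggy(query: str) -> List[SqlResponse]:
    # The response object is constructed ...
    response = SqlResponse(plan=plan_for(query))
    # ... but never returned, so the client sees an empty response.
    return []


def handle_sql_fixed(query: str) -> List[SqlResponse]:
    # The response object is constructed and actually returned.
    response = SqlResponse(plan=plan_for(query))
    return [response]


def client_sql(
    query: str, server: Callable[[str], List[SqlResponse]]
) -> Tuple[str, str]:
    responses = server(query)
    if responses:
        # Normal path: use the plan the server sent back.
        return ("server response", responses[0].plan)
    # Implicit fallback: on an empty response, wrap the plan the
    # client already holds (assumed behavior for this sketch).
    return ("client fallback", plan_for(query))


buggy = client_sql("SELECT 1", handle_sql_buggy)
fixed = client_sql("SELECT 1", handle_sql_fixed)
print(buggy)
print(fixed)
```

In this model both calls yield the same plan and differ only in which path produced it, which is why a test that checks only the resulting DataFrame would not detect the missing response.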