Juliusz Sompolski created SPARK-54637:
-----------------------------------------
Summary: Testing connect SQL APIs with client and server in the same JVM
Key: SPARK-54637
URL: https://issues.apache.org/jira/browse/SPARK-54637
Project: Spark
Issue Type: Improvement
Components: Connect
Affects Versions: 4.1.0
Reporter: Juliusz Sompolski

In Spark 3.5, a testing trait SparkConnectServerTest was introduced that helped
test the Spark Connect Service with a SparkConnectClient in the same JVM process.
This exercised real Spark Connect code paths (SparkConnectClient communicating
with the server over an actual connection to the localhost server). Before that,
with RemoteSparkSession, the server was started in a separate process.

Running both in one JVM helped with:
* testability: a test can trigger actions from the client and then run
verification code against server-side state. It can also do more internal
server-side setup to test specific scenarios.
* debugging: a debugger can easily attach to both the client and the server.

At that time, it was impossible to test the Spark Connect client SQL APIs
(SparkSession, Dataset) this way, because they were in the same namespace as
the server's classes and hence could not be class-loaded together.

Since Spark 4.0, there is a new API layer that makes it possible for the
Connect and classic implementations of the interfaces to coexist. With that,
testing can be extended to use the actual SparkSession and other APIs, instead
of having to construct tests from lower-level APIs.