Juliusz Sompolski created SPARK-54637:
-----------------------------------------

             Summary: Testing connect SQL APIs with client and server in the 
same JVM
                 Key: SPARK-54637
                 URL: https://issues.apache.org/jira/browse/SPARK-54637
             Project: Spark
          Issue Type: Improvement
          Components: Connect
    Affects Versions: 4.1.0
            Reporter: Juliusz Sompolski


In Spark 3.5, a testing trait, SparkConnectServerTest, was introduced to help 
test the Spark Connect service with a SparkConnectClient in the same JVM 
process. Such tests exercise real Spark Connect code paths: the 
SparkConnectClient communicates with the server over an actual connection to 
localhost. Before that, with RemoteSparkSession, the server was started in a 
separate process.

This helped with
 * testability: a test can trigger actions from the client and then run 
verification code on the server side. It can also do more internal server-side 
setup to test specific things.
 * debugging: a debugger can easily attach to both the client and the server.

At that time, it was impossible to test the Spark Connect client SQL APIs 
(SparkSession, Dataset) this way, because they were in the same namespace as 
the server classes and hence could not be classloaded together.

Since Spark 4.0, there is a new API layer that makes it possible for the 
connect and classic implementations of the interfaces to coexist. With that, 
testing can be extended to use the actual SparkSession and other APIs, instead 
of having to construct tests using lower-level APIs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
