Hi all,

I would like to discuss moving the Spark Connect server into the built-in packages.
Right now, users have to specify --jars or --packages when they run the Spark Connect
server script, for example:

./sbin/start-connect-server.sh --jars `ls connector/connect/server/target/**/spark-connect*SNAPSHOT.jar`

or

./sbin/start-connect-server.sh --packages org.apache.spark:spark-connect_2.12:3.5.1

which is a little bit odd, since an sbin script should not require the user to supply extra jars just to start.
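
For comparison, after the move the server could presumably be started with no extra flags at all (a sketch, assuming the prototype linked below lands as proposed):

./sbin/start-connect-server.sh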

Moving it to a built-in package is pretty straightforward because most of the jars
are shaded, and the impact would be minimal. I have a prototype here:
apache/spark#47157 <https://github.com/apache/spark/pull/47157>. This also
simplifies the Python local-running logic a lot.
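
To illustrate the Python side: with the server built in, starting a local Spark Connect session (for example via the existing --remote option) should no longer need to resolve and attach the spark-connect package first:

./bin/pyspark --remote "local[*]"

(The exact simplification is whatever the prototype changes; the command above is just the user-facing path that benefits.)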

The user-facing API layer, the Spark Connect client, stays external, but I would
like the internal/admin server layer, the Spark Connect server implementation,
to be built into Spark.

Please let me know if you have thoughts on this!
