Is this the stock downloaded Docker image? The stack trace shows Ivy failing to create its resolution cache under /home/spark/.ivy2, so try it with the added configuration options below, which redirect Ivy's cache and home directories to the writable /tmp:
/opt/spark/sbin/start-connect-server.sh \
  --conf spark.driver.extraJavaOptions="-Divy.cache.dir=/tmp -Divy.home=/tmp" \
  --packages org.apache.spark:spark-connect_2.12:3.4.1

And you will get:

starting org.apache.spark.sql.connect.service.SparkConnectServer, logging to /opt/spark/logs/spark--org.apache.spark.sql.connect.service.SparkConnectServer-1-9440b0e46cee.out

cat /opt/spark/logs/spark--org.apache.spark.sql.connect.service.SparkConnectServer-1-9440b0e46cee.out

Spark Command: /opt/java/openjdk/bin/java -cp /opt/spark/conf:/opt/spark/jars/* -Xmx1g -Divy.cache.dir=/tmp -Divy.home=/tmp -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false org.apache.spark.deploy.SparkSubmit --conf spark.driver.extraJavaOptions=-Divy.cache.dir=/tmp -Divy.home=/tmp --class org.apache.spark.sql.connect.service.SparkConnectServer --name Spark Connect server --packages org.apache.spark:spark-connect_2.12:3.4.1 spark-internal
========================================
:: loading settings :: url = jar:file:/opt/spark/jars/ivy-2.5.1.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: /tmp
The jars for the packages stored in: /tmp/jars
org.apache.spark#spark-connect_2.12 added as a dependency
:: resolving dependencies ::
org.apache.spark#spark-submit-parent-98559fa5-862f-4135-b22c-1fe371e6e8b8;1.0
	confs: [default]
	found org.apache.spark#spark-connect_2.12;3.4.1 in central
	found org.spark-project.spark#unused;1.0.0 in central
:: resolution report :: resolve 118ms :: artifacts dl 4ms
	:: modules in use:
	org.apache.spark#spark-connect_2.12;3.4.1 from central in [default]
	org.spark-project.spark#unused;1.0.0 from central in [default]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   2   |   0   |   0   |   0   ||   2   |   0   |
	---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-98559fa5-862f-4135-b22c-1fe371e6e8b8
	confs: [default]
	0 artifacts copied, 2 already retrieved (0kB/3ms)
23/07/22 11:03:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/07/22 11:03:19 INFO SparkConnectServer: Starting Spark session.
23/07/22 11:03:19 INFO SparkContext: Running Spark version 3.4.1
23/07/22 11:03:19 INFO ResourceUtils: ==============================================================
23/07/22 11:03:19 INFO ResourceUtils: No custom resources configured for spark.driver.
23/07/22 11:03:19 INFO ResourceUtils: ==============================================================
23/07/22 11:03:19 INFO SparkContext: Submitted application: Spark Connect server
23/07/22 11:03:19 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/07/22 11:03:19 INFO ResourceProfile: Limiting resource is cpu
23/07/22 11:03:19 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/07/22 11:03:19 INFO SecurityManager: Changing view acls to: spark
23/07/22 11:03:19 INFO SecurityManager: Changing modify acls to: spark
23/07/22 11:03:19 INFO SecurityManager: Changing view acls groups to:
23/07/22 11:03:19 INFO SecurityManager: Changing modify acls groups to:
23/07/22 11:03:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: spark; groups with view permissions: EMPTY; users with modify permissions: spark; groups with modify permissions: EMPTY
23/07/22 11:03:19 INFO Utils: Successfully started service 'sparkDriver' on port 34958.
23/07/22 11:03:19 INFO SparkEnv: Registering MapOutputTracker
23/07/22 11:03:19 INFO SparkEnv: Registering BlockManagerMaster
23/07/22 11:03:19 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/07/22 11:03:19 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/07/22 11:03:19 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/07/22 11:03:19 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-1f805008-84fc-4c2a-8ea7-b31edcf99cf2
23/07/22 11:03:19 INFO MemoryStore: MemoryStore started with capacity 434.4 MiB
23/07/22 11:03:19 INFO SparkEnv: Registering OutputCommitCoordinator
23/07/22 11:03:19 INFO JettyUtils: Start Jetty 0.0.0.0:4040 for SparkUI
23/07/22 11:03:19 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/07/22 11:03:19 INFO SparkContext: Added JAR file:///tmp/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar at spark://9440b0e46cee:34958/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar with timestamp 1690023799229
23/07/22 11:03:19 INFO SparkContext: Added JAR file:///tmp/jars/org.spark-project.spark_unused-1.0.0.jar at spark://9440b0e46cee:34958/jars/org.spark-project.spark_unused-1.0.0.jar with timestamp 1690023799229
23/07/22 11:03:19 INFO SparkContext: Added file file:///tmp/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar at file:///tmp/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar with timestamp 1690023799229
23/07/22 11:03:19 INFO Utils: Copying /tmp/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar to /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/org.apache.spark_spark-connect_2.12-3.4.1.jar
23/07/22 11:03:19 INFO SparkContext: Added file file:///tmp/jars/org.spark-project.spark_unused-1.0.0.jar at file:///tmp/jars/org.spark-project.spark_unused-1.0.0.jar with timestamp 1690023799229
23/07/22 11:03:19 INFO Utils: Copying /tmp/jars/org.spark-project.spark_unused-1.0.0.jar to /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/org.spark-project.spark_unused-1.0.0.jar
23/07/22 11:03:19 INFO Executor: Starting executor ID driver on host 9440b0e46cee
23/07/22 11:03:19 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
23/07/22 11:03:19 INFO Executor: Fetching file:///tmp/jars/org.spark-project.spark_unused-1.0.0.jar with timestamp 1690023799229
23/07/22 11:03:19 INFO Utils: /tmp/jars/org.spark-project.spark_unused-1.0.0.jar has been previously copied to /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/org.spark-project.spark_unused-1.0.0.jar
23/07/22 11:03:19 INFO Executor: Fetching file:///tmp/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar with timestamp 1690023799229
23/07/22 11:03:19 INFO Utils: /tmp/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar has been previously copied to /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/org.apache.spark_spark-connect_2.12-3.4.1.jar
23/07/22 11:03:19 INFO Executor: Fetching spark://9440b0e46cee:34958/jars/org.spark-project.spark_unused-1.0.0.jar with timestamp 1690023799229
23/07/22 11:03:19 INFO TransportClientFactory: Successfully created connection to 9440b0e46cee/172.17.0.2:34958 after 21 ms (0 ms spent in bootstraps)
23/07/22 11:03:19 INFO Utils: Fetching spark://9440b0e46cee:34958/jars/org.spark-project.spark_unused-1.0.0.jar to /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/fetchFileTemp13736159706473790223.tmp
23/07/22 11:03:19 INFO Utils: /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/fetchFileTemp13736159706473790223.tmp has been previously copied to /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/org.spark-project.spark_unused-1.0.0.jar
23/07/22 11:03:19 INFO Executor: Adding file:/tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/org.spark-project.spark_unused-1.0.0.jar to class loader
23/07/22 11:03:19 INFO Executor: Fetching spark://9440b0e46cee:34958/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar with timestamp 1690023799229
23/07/22 11:03:19 INFO Utils: Fetching spark://9440b0e46cee:34958/jars/org.apache.spark_spark-connect_2.12-3.4.1.jar to /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/fetchFileTemp5670012489806930415.tmp
23/07/22 11:03:19 INFO Utils: /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/fetchFileTemp5670012489806930415.tmp has been previously copied to /tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/org.apache.spark_spark-connect_2.12-3.4.1.jar
23/07/22 11:03:19 INFO Executor: Adding file:/tmp/spark-0455ebe3-e382-4148-8851-ed33b3afe59a/userFiles-a4415f94-8456-4d53-a2f2-5b0cc5a61c0f/org.apache.spark_spark-connect_2.12-3.4.1.jar to class loader
23/07/22 11:03:19 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39548.
23/07/22 11:03:19 INFO NettyBlockTransferService: Server created on 9440b0e46cee:39548
23/07/22 11:03:19 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
23/07/22 11:03:19 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 9440b0e46cee, 39548, None)
23/07/22 11:03:19 INFO BlockManagerMasterEndpoint: Registering block manager 9440b0e46cee:39548 with 434.4 MiB RAM, BlockManagerId(driver, 9440b0e46cee, 39548, None)
23/07/22 11:03:19 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 9440b0e46cee, 39548, None)
23/07/22 11:03:19 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 9440b0e46cee, 39548, None)
*23/07/22 11:03:20 INFO SparkConnectServer: Spark Connect server started.*

HTH

Mich Talebzadeh,
Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom

view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.


On Sat, 22 Jul 2023 at 03:58, Edmondo Porcu <edmondo.po...@gmail.com> wrote:

> Hello,
>
> I am trying to launch Spark Connect on the Docker image:
>
> ❯ docker run -it apache/spark:3.4.1-scala2.12-java11-r-ubuntu /bin/bash
> spark@aa0a670f7433:/opt/spark/work-dir$ /opt/spark/sbin/start-connect-server.sh --packages org.apache.spark:spark-connect_2.12:3.4.1
> starting org.apache.spark.sql.connect.service.SparkConnectServer, logging to /opt/spark/logs/spark--org.apache.spark.sql.connect.service.SparkConnectServer-1-aa0a670f7433.out
>
> but the application crashes immediately with a FileNotFound for a specific xml.
>
> spark@aa0a670f7433:/opt/spark/work-dir$ cat /opt/spark/logs/spark--org.apache.spark.sql.connect.service.SparkConnectServer-1-aa0a670f7433.out
> Spark Command: /opt/java/openjdk/bin/java -cp /opt/spark/conf:/opt/spark/jars/* -Xmx1g -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.connect.service.SparkConnectServer --name Spark Connect server --packages org.apache.spark:spark-connect_2.12:3.4.1 spark-internal
> ========================================
> :: loading settings :: url = jar:file:/opt/spark/jars/ivy-2.5.1.jar!/org/apache/ivy/core/settings/ivysettings.xml
> Ivy Default Cache set to: /home/spark/.ivy2/cache
> The jars for the packages stored in: /home/spark/.ivy2/jars
> org.apache.spark#spark-connect_2.12 added as a dependency
> :: resolving dependencies :: org.apache.spark#spark-submit-parent-28c9c405-4607-4625-bacd-23626115e886;1.0
> 	confs: [default]
> Exception in thread "main" java.io.FileNotFoundException: /home/spark/.ivy2/cache/resolved-org.apache.spark-spark-submit-parent-28c9c405-4607-4625-bacd-23626115e886-1.0.xml (No such file or directory)
> 	at java.base/java.io.FileOutputStream.open0(Native Method)
> 	at java.base/java.io.FileOutputStream.open(Unknown Source)
> 	at java.base/java.io.FileOutputStream.<init>(Unknown Source)
> 	at java.base/java.io.FileOutputStream.<init>(Unknown Source)
> 	at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter.write(XmlModuleDescriptorWriter.java:71)
> 	at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter.write(XmlModuleDescriptorWriter.java:63)
> 	at org.apache.ivy.core.module.descriptor.DefaultModuleDescriptor.toIvyFile(DefaultModuleDescriptor.java:553)
> 	at org.apache.ivy.core.cache.DefaultResolutionCacheManager.saveResolvedModuleDescriptor(DefaultResolutionCacheManager.java:184)
> 	at org.apache.ivy.core.resolve.ResolveEngine.resolve(ResolveEngine.java:259)
> 	at org.apache.ivy.Ivy.resolve(Ivy.java:522)
> 	at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1526)
> 	at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
> 	at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:332)
> 	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
> 	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
> 	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
> 	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
> 	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> Is there anything required before launching Spark Connect on the docker image?
> Thanks
> Ed
>
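P.S. For anyone finding this thread in the archives: an alternative to passing the two JVM system properties through spark.driver.extraJavaOptions is Spark's own spark.jars.ivy setting, which relocates the Ivy directory (resolution cache and downloaded package jars) in a single flag. This is a sketch I have not verified on this exact image; the /tmp/.ivy2 path is an arbitrary writable choice:

```shell
# Inside the container started with:
#   docker run -it apache/spark:3.4.1-scala2.12-java11-r-ubuntu /bin/bash
# point Spark's Ivy directory at a path the 'spark' user can write to,
# instead of the default /home/spark/.ivy2 that triggers the
# FileNotFoundException above.
/opt/spark/sbin/start-connect-server.sh \
  --conf spark.jars.ivy=/tmp/.ivy2 \
  --packages org.apache.spark:spark-connect_2.12:3.4.1
```

Either approach achieves the same thing: Ivy resolves the spark-connect package into a writable location rather than the unwritable home directory.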