Glad it worked!

On Tue, 30 Jul 2024 at 11:12, Ilango <elango...@gmail.com> wrote:
> Thanks Prabodh. I copied the Spark Connect jar to the $SPARK_HOME/jars
> folder and passed its location via the --jars option. It is working now: I
> can submit Spark jobs via Spark Connect.
>
> Really appreciate the help.
>
> Thanks,
> Elango
>
> On Tue, 30 Jul 2024 at 11:05 AM, Prabodh Agarwal <prabodh1...@gmail.com> wrote:
>
>> Yeah, I understand the problem. One way is to place the Spark Connect jar
>> in the $SPARK_HOME/jars folder; that is how we run Spark Connect. Using
>> the `--packages` or `--jars` option is flaky with Spark Connect.
>>
>> You can instead manually place the relevant Spark Connect jar file in the
>> `$SPARK_HOME/jars` directory and remove the `--packages` or `--jars`
>> option from your start command.
>>
>> On Mon, Jul 29, 2024 at 7:01 PM Ilango <elango...@gmail.com> wrote:
>>
>>> Thanks Prabodh. Yes, I can see the Spark Connect logs under the
>>> $SPARK_HOME/logs path. It looks like a Spark Connect dependency issue.
>>> My Spark node is air-gapped, so no internet access is allowed. Can I
>>> download the Spark Connect jar and pom files locally and point to the
>>> local paths? How can I supply the local jars?
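The workaround described in the thread (pre-staging the jar in `$SPARK_HOME/jars` instead of resolving it over the network with `--packages`) can be sketched roughly as below. This is a self-contained illustration, not the exact commands from the thread: `SPARK_HOME` is a throwaway directory and the jar is an empty stand-in; on a real cluster you would use your actual installation and a jar fetched on a machine with internet access.

```shell
#!/bin/sh
# Sketch of the air-gapped workaround: pre-stage the Spark Connect jar in
# $SPARK_HOME/jars instead of resolving it with --packages (which needs
# network access to Maven Central / spark-packages).
set -eu

SPARK_HOME=$(mktemp -d)            # stand-in for the real $SPARK_HOME
mkdir -p "$SPARK_HOME/jars"

# Step 1 (on a networked machine): download spark-connect_2.12-3.5.1.jar.
# Simulated here with an empty placeholder file.
staged=$(mktemp -d)/spark-connect_2.12-3.5.1.jar
: > "$staged"

# Step 2 (on the air-gapped node): copy the jar into $SPARK_HOME/jars, which
# Spark scans at startup, so no dependency resolution happens at submit time.
cp "$staged" "$SPARK_HOME/jars/"

# Step 3: start the server WITHOUT --packages or --jars (real cluster only):
#   "$SPARK_HOME/sbin/start-connect-server.sh"
echo "staged: $(basename "$SPARK_HOME"/jars/spark-connect_*.jar)"
```

With the jar already on Spark's own classpath directory, the start command needs no dependency flags at all, which sidesteps the Ivy resolution step that fails on an air-gapped node.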
>>> Error message:
>>>
>>> :: problems summary ::
>>> :::: WARNINGS
>>>     module not found: org.apache.spark#spark-connect_2.12;3.5.1
>>>
>>>     ==== local-m2-cache: tried
>>>       file:/root/.m2/repository/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>>>       -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>>>       file:/root/.m2/repository/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>>>
>>>     ==== local-ivy-cache: tried
>>>       /root/.ivy2/local/org.apache.spark/spark-connect_2.12/3.5.1/ivys/ivy.xml
>>>       -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>>>       /root/.ivy2/local/org.apache.spark/spark-connect_2.12/3.5.1/jars/spark-connect_2.12.jar
>>>
>>>     ==== central: tried
>>>       https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>>>       -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>>>       https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>>>
>>>     ==== spark-packages: tried
>>>       https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>>>       -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>>>       https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>>>
>>>     ::::::::::::::::::::::::::::::::::::::::::::::
>>>     ::          UNRESOLVED DEPENDENCIES         ::
>>>     ::::::::::::::::::::::::::::::::::::::::::::::
>>>     :: org.apache.spark#spark-connect_2.12;3.5.1: not found
>>>     ::::::::::::::::::::::::::::::::::::::::::::::
>>>
>>> :::: ERRORS
>>>     Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom (java.net.ConnectException: Connection timed out (Connection timed out))
>>>     Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar (java.net.ConnectException: Connection timed out (Connection timed out))
>>>     Server access error at url https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom (java.net.ConnectException: Connection timed out (Connection timed out))
>>>     Server access error at url https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar (java.net.ConnectException: Connection timed out (Connection timed out))
>>>
>>> :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
>>>
>>> Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.spark#spark-connect_2.12;3.5.1: not found]
>>>     at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1608)
>>>     at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
>>>     at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:334)
>>>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:964)
>>>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
>>>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
>>>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
>>>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>> Thanks,
>>> Elango
>>>
>>> On Mon, 29 Jul 2024 at 6:45 PM, Prabodh Agarwal <prabodh1...@gmail.com> wrote:
>>>
>>>> The Spark Connect startup prints the log location. Is that not feasible
>>>> for you? For me, the log goes to $SPARK_HOME/logs.
>>>>
>>>> On Mon, 29 Jul 2024, 15:30 Ilango <elango...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I am facing issues with a Spark Connect application running on a Spark
>>>>> standalone cluster (without YARN and HDFS). After executing the
>>>>> start-connect-server.sh script with the specified packages, I observe a
>>>>> process ID for a short period but am unable to see the corresponding
>>>>> port (default 15002) associated with that PID. The process stops
>>>>> automatically after around 10 minutes.
>>>>>
>>>>> Since the Spark History Server is not enabled, I am unable to locate
>>>>> the relevant logs or error messages. The logs for currently running
>>>>> Spark applications are accessible from the Spark UI, but I am unsure
>>>>> where to find the logs for the Spark Connect application and service.
>>>>>
>>>>> Could you please advise on where to find the logs or error messages
>>>>> related to Spark Connect?
>>>>>
>>>>> Thanks,
>>>>> Elango
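For the log-location question in the thread: stand-alone Spark daemons, including the server launched by start-connect-server.sh, write their logs under $SPARK_HOME/logs, as Prabodh notes. A small helper to surface the newest file there might look like the sketch below; the demo directory and file names are invented for illustration, and the real log file names on your node will differ.

```shell
#!/bin/sh
# Print the newest file in a log directory -- typically the log of the most
# recent daemon startup. Demo runs against a throwaway directory standing in
# for $SPARK_HOME/logs.
set -eu

latest_log() {
  # Newest-first listing; prints nothing if the directory is empty or missing.
  ls -t "$1" 2>/dev/null | head -n 1
}

demo=$(mktemp -d)                  # stand-in for $SPARK_HOME/logs
touch "$demo/spark-master-daemon.out"
sleep 1                            # ensure distinct modification times
touch "$demo/spark-connect-server.out"

echo "newest: $(latest_log "$demo")"
# On a real node:
#   less "$SPARK_HOME/logs/$(latest_log "$SPARK_HOME/logs")"
#   ss -ltn | grep 15002    # is anything listening on the default Connect port?
```

Checking the newest log right after the server exits is usually enough to surface the startup failure (here, the unresolved Ivy dependency) without needing the History Server.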