Glad it worked!

On Tue, 30 Jul, 2024, 11:12 Ilango, <elango...@gmail.com> wrote:

>
> Thanks Prabodh. I copied the Spark Connect jar to the $SPARK_HOME/jars
> folder and passed its location via the --jars option. It's working now: I
> can submit Spark jobs via Spark Connect.
>
> Really appreciate the help.
>
>
>
> Thanks,
> Elango
>
>
> On Tue, 30 Jul 2024 at 11:05 AM, Prabodh Agarwal <prabodh1...@gmail.com>
> wrote:
>
>> Yeah, I understand the problem. One way is to place the Spark Connect jar
>> directly in the $SPARK_HOME/jars folder; that is how we run Spark
>> Connect. Using the `--packages` or `--jars` option can be flaky with
>> Spark Connect.
>>
>> So instead, manually place the relevant Spark Connect jar file in the
>> `$SPARK_HOME/jars` directory and remove the `--packages` or `--jars`
>> option from your start command.
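A rough sketch of that jar-placement approach is below. SPARK_HOME and the Spark/Scala versions are assumptions to adapt to your install, and the download step has to run on a machine with internet access:

```shell
# Rough sketch of the jar-placement approach. SPARK_HOME and the
# Spark/Scala versions below are assumptions; adjust them to your install.
SPARK_HOME=/opt/spark
SCALA_VERSION=2.12
SPARK_VERSION=3.5.1
JAR="spark-connect_${SCALA_VERSION}-${SPARK_VERSION}.jar"

# On a machine WITH internet access, fetch the jar from Maven Central:
#   curl -LO "https://repo1.maven.org/maven2/org/apache/spark/spark-connect_${SCALA_VERSION}/${SPARK_VERSION}/${JAR}"
# Transfer it to the air-gapped node, drop it next to the other Spark jars,
# and start the server without any --packages/--jars flag:
#   cp "${JAR}" "${SPARK_HOME}/jars/"
#   "${SPARK_HOME}/sbin/start-connect-server.sh"

echo "${JAR}"
```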
>>
>> On Mon, Jul 29, 2024 at 7:01 PM Ilango <elango...@gmail.com> wrote:
>>
>>>
>>> Thanks Prabodh. Yes, I can see the Spark Connect logs under the
>>> $SPARK_HOME/logs path. It looks like a Spark Connect dependency issue. My
>>> Spark node is air-gapped, so no internet access is allowed. Can I download
>>> the Spark Connect jar and pom files locally and point Spark at the local
>>> paths? How can I supply the local jars?
>>>
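One offline workaround would be to pre-seed the local Maven cache that the resolver already consults (the error log below shows `local-m2-cache` being tried first). The sketch below only builds the standard Maven-layout destination path; the actual copy steps are shown as comments and assume the jar and pom were fetched on an internet-connected machine:

```shell
# Illustrative: pre-seed the local Maven cache so that --packages can
# resolve spark-connect offline. The cache root below matches the
# local-m2-cache location the resolver reports trying.
M2_REPO="${HOME}/.m2/repository"
GROUP_PATH="org/apache/spark"
ARTIFACT="spark-connect_2.12"
VERSION="3.5.1"
DEST="${M2_REPO}/${GROUP_PATH}/${ARTIFACT}/${VERSION}"

# After transferring the jar and pom from an internet-connected machine:
#   mkdir -p "${DEST}"
#   cp "${ARTIFACT}-${VERSION}.jar" "${ARTIFACT}-${VERSION}.pom" "${DEST}/"

echo "${DEST}"
```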
>>> Error message:
>>>
>>> :: problems summary ::
>>>
>>> :::: WARNINGS
>>>         module not found: org.apache.spark#spark-connect_2.12;3.5.1
>>>
>>>     ==== local-m2-cache: tried
>>>       file:/root/.m2/repository/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>>>       -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>>>       file:/root/.m2/repository/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>>>
>>>     ==== local-ivy-cache: tried
>>>       /root/.ivy2/local/org.apache.spark/spark-connect_2.12/3.5.1/ivys/ivy.xml
>>>       -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>>>       /root/.ivy2/local/org.apache.spark/spark-connect_2.12/3.5.1/jars/spark-connect_2.12.jar
>>>
>>>     ==== central: tried
>>>       https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>>>       -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>>>       https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>>>
>>>     ==== spark-packages: tried
>>>       https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>>>       -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>>>       https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>>>
>>>         ::::::::::::::::::::::::::::::::::::::::::::::
>>>         ::          UNRESOLVED DEPENDENCIES         ::
>>>         ::::::::::::::::::::::::::::::::::::::::::::::
>>>         :: org.apache.spark#spark-connect_2.12;3.5.1: not found
>>>         ::::::::::::::::::::::::::::::::::::::::::::::
>>>
>>> :::: ERRORS
>>>     Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom (java.net.ConnectException: Connection timed out (Connection timed out))
>>>     Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar (java.net.ConnectException: Connection timed out (Connection timed out))
>>>     Server access error at url https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom (java.net.ConnectException: Connection timed out (Connection timed out))
>>>     Server access error at url https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar (java.net.ConnectException: Connection timed out (Connection timed out))
>>>
>>> :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
>>>
>>> Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.spark#spark-connect_2.12;3.5.1: not found]
>>>         at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1608)
>>>         at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
>>>         at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:334)
>>>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:964)
>>>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
>>>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
>>>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
>>>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
>>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>> Elango
>>>
>>>
>>> On Mon, 29 Jul 2024 at 6:45 PM, Prabodh Agarwal <prabodh1...@gmail.com>
>>> wrote:
>>>
>>>> The Spark Connect startup prints the log location. Is that not visible
>>>> for you?
>>>> For me, the logs go to $SPARK_HOME/logs.
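For reference, the standalone daemon scripts write per-process log files under $SPARK_HOME/logs. The sketch below shows the general shape of the file name to look for; the exact user, instance number, and host parts are illustrative and will differ on your machine:

```shell
# Illustrative file name only: standalone daemon logs embed the main class,
# user, instance number, and host, so the Spark Connect server log typically
# looks something like:
LOG_NAME="spark-root-org.apache.spark.sql.connect.service.SparkConnectServer-1-myhost.out"
echo "${LOG_NAME}"

# To follow it, you could run something like:
#   tail -f "${SPARK_HOME}"/logs/*SparkConnectServer*.out
```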
>>>>
>>>> On Mon, 29 Jul, 2024, 15:30 Ilango, <elango...@gmail.com> wrote:
>>>>
>>>>>
>>>>> Hi all,
>>>>>
>>>>>
>>>>> I am facing issues with a Spark Connect application running on a Spark
>>>>> standalone cluster (without YARN and HDFS). After executing the
>>>>> start-connect-server.sh script with the specified packages, I see a
>>>>> process ID for a short period, but the corresponding port (default
>>>>> 15002) is never bound to that PID, and the process stops on its own
>>>>> after around 10 minutes.
>>>>>
>>>>> Since the Spark History Server is not enabled, I cannot locate the
>>>>> relevant logs or error messages. The logs for currently running Spark
>>>>> applications are accessible from the Spark UI, but I am unsure where to
>>>>> find the logs for the Spark Connect application and service.
>>>>>
>>>>> Could you please advise on where to find the logs or error messages
>>>>> related to Spark Connect?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Elango
>>>>>
>>>>
