[ 
https://issues.apache.org/jira/browse/SPARK-45571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liran Y updated SPARK-45571:
----------------------------
    Description: 
I am starting a Spark Connect server in k8s.

While trying to access S3, I get the following error from the executor:
{code:java}
Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most 
recent failure: Lost task 0.3 in stage 0.0 (TID 3) (10.1.0.174 executor 1): 
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class 
org.apache.hadoop.fs.s3a.S3AFileSystem not found
        at 
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2688)
        at 
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3431)
... {code}
The driver is able to access S3 without trouble.

I tried adding the hadoop-aws jar in multiple ways: --packages, --jars, the 
SPARK_EXTRA_CLASSPATH env variable, and adding it to the Spark jars folder in my Dockerfile.

When looking in my executor pod, I can see that the classpath is set up properly 
and should include the jar from multiple locations.

 

Only by using the addArtifact API can I add the missing jar.
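For reference, the workaround looks roughly like this (a minimal sketch using the PySpark Spark Connect client, where addArtifacts is the Python counterpart of the Scala addArtifact API; the server URL and jar path below are placeholders, not values from this report):
{code:python}
from pyspark.sql import SparkSession

# Connect to the Spark Connect server (URL is a placeholder)
spark = SparkSession.builder.remote("sc://my-spark-connect-server:15002").getOrCreate()

# Ship the hadoop-aws jar to the server so the executors can resolve
# org.apache.hadoop.fs.s3a.S3AFileSystem (jar path is a placeholder)
spark.addArtifacts("/path/to/hadoop-aws.jar")

# After this, S3A reads work, even though the same jar on the
# executor classpath was being ignored
df = spark.read.parquet("s3a://my-bucket/data/")
{code}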

  was:
I am starting a Spark Connect server in k8s.

While trying to access S3, I get the following error from the executor:
{code:java}
Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most 
recent failure: Lost task 0.3 in stage 0.0 (TID 3) (10.1.0.174 executor 1): 
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class 
org.apache.hadoop.fs.s3a.S3AFileSystem not found
        at 
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2688)
        at 
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3431)
... {code}
The driver is able to access S3 without trouble.

I tried adding the hadoop-aws jar in multiple ways: --packages, --jars, the 
SPARK_EXTRA_CLASSPATH env variable, and adding it to the Spark jars folder in my Dockerfile.

When looking in my executor pod, I can see that the classpath is set up properly 
and should include the jar from multiple locations.


> Spark connect executor ignore jars in classpath
> -----------------------------------------------
>
>                 Key: SPARK-45571
>                 URL: https://issues.apache.org/jira/browse/SPARK-45571
>             Project: Spark
>          Issue Type: Bug
>          Components: Connect
>    Affects Versions: 3.5.0
>            Reporter: Liran Y
>            Priority: Major
>
> I am starting a Spark Connect server in k8s.
> While trying to access S3, I get the following error from the executor:
> {code:java}
> Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most 
> recent failure: Lost task 0.3 in stage 0.0 (TID 3) (10.1.0.174 executor 1): 
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class 
> org.apache.hadoop.fs.s3a.S3AFileSystem not found
>       at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2688)
>       at 
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3431)
> ... {code}
> The driver is able to access S3 without trouble.
> I tried adding the hadoop-aws jar in multiple ways: --packages, --jars, the 
> SPARK_EXTRA_CLASSPATH env variable, and adding it to the Spark jars folder in 
> my Dockerfile.
>  
> When looking in my executor pod, I can see that the classpath is set up properly 
> and should include the jar from multiple locations.
>  
> Only by using the addArtifact API can I add the missing jar.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
