[jira] [Reopened] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
[ https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

t oo reopened SPARK-35974:
--------------------------

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -------------------------------------------------------------------------
>
>                 Key: SPARK-35974
>                 URL: https://issues.apache.org/jira/browse/SPARK-35974
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.8
>            Reporter: t oo
>            Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master spark://myhost:6066 \
>   --conf spark.hadoop.fs.s3a.access.key='redact1' \
>   --conf spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' \
>   --conf spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' \
>   --conf spark.hadoop.fs.s3a.secret.key='redact2' \
>   --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' \
>   --conf spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' \
>   --conf spark.hadoop.fs.s3a.session.token='redact3' \
>   --conf spark.executorEnv.AWS_SESSION_TOKEN='redact3' \
>   --conf spark.driverEnv.AWS_SESSION_TOKEN='redact3' \
>   --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
>   --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' \
>   --conf spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' \
>   --total-executor-cores 4 --executor-cores 2 --executor-memory 2g --driver-memory 1g \
>   --name lin1 --deploy-mode cluster --conf spark.eventLog.enabled=false \
>   --class com.yotpo.metorikku.Metorikku \
>   s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
> {code}
>
> Running the above command gives the stack trace below:
>
> {code:java}
> Exception from the cluster:
> java.nio.file.AccessDeniedException: s3a://mybuc/metorikku_2.11.jar: getFileStatus on s3a://mybuc/metorikku_2.11.jar:
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=
>   org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
>   org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
>   org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
>   org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
>   org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
>   org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
>   org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
>   org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
>   org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
>   org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
>   org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
>   org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92)
> {code}
>
> All the EC2s in the Spark cluster have access to S3 only via STS tokens. The jar itself reads CSVs from S3 using the tokens, and everything works if either 1. I change the command line to point to local jars on the EC2, or 2. I use port 7077/client mode instead of cluster mode. But it seems the jar itself can't be launched off S3, as if the tokens are not being picked up properly.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
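Note that the stack trace shows the 403 coming from the worker's DriverRunner while downloading the user jar, before any user code runs, so the `--conf spark.hadoop.*` values submitted over the REST gateway may not be in the Hadoop Configuration the worker uses for that fetch. One possible workaround (an untested sketch, assuming the standalone worker builds its Hadoop Configuration from the local config directory) is to place the same STS credentials in a `core-site.xml` on every worker; the keys mirror the `--conf spark.hadoop.fs.s3a.*` values from the report:

```xml
<!-- Hypothetical worker-side core-site.xml fragment (sketch, not verified).
     TemporaryAWSCredentialsProvider reads these fs.s3a.* keys, so the
     worker-side jar fetch would see the same session credentials as the job. -->
<configuration>
  <property>
    <name>fs.s3a.aws.credentials.provider</name>
    <value>org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>redact1</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>redact2</value>
  </property>
  <property>
    <name>fs.s3a.session.token</name>
    <value>redact3</value>
  </property>
</configuration>
```

A caveat with this approach: STS session tokens expire, so a static file on each worker would need to be refreshed whenever new tokens are issued.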
[jira] [Reopened] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
[ https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

t oo reopened SPARK-35974:
--------------------------

v2.4.8 is less than 2 months old