Re: automatically/dynamically renew AWS temporary token

2023-10-23 Thread Pol Santamaria
Hi Carlos!

Take a look at this project, it's 6 years old but the approach is still
valid:

https://github.com/zillow/aws-custom-credential-provider

The credential provider gets called each time an S3 or Glue Catalog is
accessed, and then you can decide whether to use a cached token or renew.
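
To make that concrete, here is a minimal sketch of such a provider in Scala,
assuming the AWS Java SDK v1 is on the classpath; the class name, role ARN,
and session name are placeholders:

```
import java.util.Date

import com.amazonaws.auth.{AWSCredentials, AWSCredentialsProvider, BasicSessionCredentials}
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder
import com.amazonaws.services.securitytoken.model.AssumeRoleRequest

// Hypothetical provider: caches the assumed-role token and renews it
// a few minutes before the 1h expiry, so tasks never see a stale token.
class RenewingRoleProvider extends AWSCredentialsProvider {
  private val sts = AWSSecurityTokenServiceClientBuilder.defaultClient()
  private var cached: BasicSessionCredentials = _
  private var expiry: Date = new Date(0)

  override def getCredentials(): AWSCredentials = synchronized {
    // Renew 5 minutes early to avoid mid-task expiry.
    if (new Date(System.currentTimeMillis() + 5 * 60 * 1000).after(expiry)) refresh()
    cached
  }

  override def refresh(): Unit = synchronized {
    val result = sts.assumeRole(new AssumeRoleRequest()
      .withRoleArn("arn:aws:iam::123456789012:role/CrossAccountRole") // placeholder
      .withRoleSessionName("spark-s3-access"))                        // placeholder
    val c = result.getCredentials
    cached = new BasicSessionCredentials(c.getAccessKeyId, c.getSecretAccessKey, c.getSessionToken)
    expiry = c.getExpiration
  }
}
```

Point s3a at it by setting spark.hadoop.fs.s3a.aws.credentials.provider to the
fully qualified name of the provider class, and make sure the jar containing it
is on the classpath of every node.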

Best,

*Pol Santamaria*


On Mon, Oct 23, 2023 at 8:08 AM Jörn Franke  wrote:

> Can’t you attach the cross-account permission to the Glue job role? Why
> the detour via AssumeRole?
>
> AssumeRole can make sense if you use an AWS IAM user with STS
> authentication, but for cross-account access within AWS it makes little
> sense: attaching the permissions to the Glue job role is more secure (no
> static credentials, and permissions are renewed automatically on a short
> interval without any Spark-specific configuration).
>
> Have you checked with AWS support?
>
> On 22.10.2023 at 21:14, Carlos Aguni  wrote:
>
>
> Hi all,
>
> I've a scenario where I need to assume a cross-account role to have S3
> bucket access.
>
> The problem is that this role only allows a 1h time span (no
> negotiation).
>
> That said, does anyone know a way to tell Spark to automatically renew
> the token, or to dynamically renew the token on each node?
> I'm currently using Spark on AWS Glue.
>
> I wonder what options I have.
>
> Regards, c.
>
>
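
One built-in alternative: Hadoop 3.1+'s s3a connector ships
org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider, which calls STS
itself and re-issues session credentials as they expire. A minimal sketch of
the wiring, assuming a Hadoop 3.1+ build; the role ARN is a placeholder, and
using the instance-profile provider for the base credentials is an assumption
about the Glue environment:

```
import org.apache.spark.sql.SparkSession

// Sketch: let s3a assume the cross-account role and renew tokens itself.
val spark = SparkSession.builder()
  .appName("cross-account-s3")
  .config("spark.hadoop.fs.s3a.aws.credentials.provider",
    "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider")
  .config("spark.hadoop.fs.s3a.assumed.role.arn",
    "arn:aws:iam::123456789012:role/CrossAccountRole") // placeholder ARN
  // Assumption: the base credentials that call STS come from the
  // instance/container role rather than static keys.
  .config("spark.hadoop.fs.s3a.assumed.role.credentials.provider",
    "com.amazonaws.auth.InstanceProfileCredentialsProvider")
  .getOrCreate()
```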


Maximum executors in EC2 Machine

2023-10-23 Thread KhajaAsmath Mohammed
Hi,

I am running a Spark job on an EC2 machine which has 40 cores. Driver
and executor memory is 16 GB. I am using local[*] but I still get only one
executor (the driver). Is there a way to get more executors with this config?

I am not using YARN or Mesos in this case. One machine is enough for our
workload, but the data has grown.

Thanks,
Asmath
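
For context: local[*] runs the driver and the single executor in one JVM, so
one executor with 40 task threads is expected, and all 40 cores are still used
as task slots. To get several executor JVMs on one machine, one option is a
single-node standalone cluster (sbin/start-master.sh plus sbin/start-worker.sh)
with explicit executor sizing. A minimal sketch with example values:

```
import org.apache.spark.sql.SparkSession

// Sketch: point the job at a local standalone master instead of local[*].
// With 40 cores, 8 cores per executor yields up to 5 executor JVMs.
val spark = SparkSession.builder()
  .appName("multi-executor-single-node")
  .master("spark://localhost:7077")      // from sbin/start-master.sh
  .config("spark.executor.cores", "8")
  .config("spark.executor.memory", "6g") // size to fit the machine's RAM
  .getOrCreate()
```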


Contribution Recommendations

2023-10-23 Thread Phil Dakin
Per the "Contributing to Spark "
guide, I am requesting guidance on selecting a good ticket to take on. I've
opened documentation/test PRs:

https://github.com/apache/spark/pull/43369
https://github.com/apache/spark/pull/43405

If you have recommendations for the next ticket, please let me know. The
"starter" tag in Jira is sparsely populated.


submitting tasks failed in Spark standalone mode due to missing failureaccess jar file

2023-10-23 Thread eab...@163.com
Hi team,
I use Spark 3.5.0 to start a Spark cluster with start-master.sh and 
start-worker.sh. When I run ./bin/spark-shell --master 
spark://LAPTOP-TC4A0SCV.:7077, I get these error logs: 
```
23/10/24 12:00:46 ERROR TaskSchedulerImpl: Lost an executor 1 (already 
removed): Command exited with code 50
```
  The worker's logs for the finished executor:
```
Spark Executor Command: 
"/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.372.b07-1.el7_9.x86_64/jre/bin/java" 
"-cp" 
"/root/spark-3.5.0-bin-hadoop3/conf/:/root/spark-3.5.0-bin-hadoop3/jars/*" 
"-Xmx1024M" "-Dspark.driver.port=43765" "-Djava.net.preferIPv6Addresses=false" 
"-XX:+IgnoreUnrecognizedVMOptions" 
"--add-opens=java.base/java.lang=ALL-UNNAMED" 
"--add-opens=java.base/java.lang.invoke=ALL-UNNAMED" 
"--add-opens=java.base/java.lang.reflect=ALL-UNNAMED" 
"--add-opens=java.base/java.io=ALL-UNNAMED" 
"--add-opens=java.base/java.net=ALL-UNNAMED" 
"--add-opens=java.base/java.nio=ALL-UNNAMED" 
"--add-opens=java.base/java.util=ALL-UNNAMED" 
"--add-opens=java.base/java.util.concurrent=ALL-UNNAMED" 
"--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED" 
"--add-opens=java.base/sun.nio.ch=ALL-UNNAMED" 
"--add-opens=java.base/sun.nio.cs=ALL-UNNAMED" 
"--add-opens=java.base/sun.security.action=ALL-UNNAMED" 
"--add-opens=java.base/sun.util.calendar=ALL-UNNAMED" 
"--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED" 
"-Djdk.reflect.useDirectMethodHandle=false" 
"org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" 
"spark://CoarseGrainedScheduler@172.29.190.147:43765" "--executor-id" "0" 
"--hostname" "172.29.190.147" "--cores" "6" "--app-id" 
"app-20231024120037-0001" "--worker-url" "spark://Worker@172.29.190.147:34707" 
"--resourceProfileId" "0"

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
23/10/24 12:00:39 INFO CoarseGrainedExecutorBackend: Started daemon with 
process name: 19535@LAPTOP-TC4A0SCV
23/10/24 12:00:39 INFO SignalUtils: Registering signal handler for TERM
23/10/24 12:00:39 INFO SignalUtils: Registering signal handler for HUP
23/10/24 12:00:39 INFO SignalUtils: Registering signal handler for INT
23/10/24 12:00:39 WARN Utils: Your hostname, LAPTOP-TC4A0SCV resolves to a 
loopback address: 127.0.1.1; using 172.29.190.147 instead (on interface eth0)
23/10/24 12:00:39 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another 
address
23/10/24 12:00:42 INFO CoarseGrainedExecutorBackend: Successfully registered 
with driver
23/10/24 12:00:42 INFO Executor: Starting executor ID 0 on host 172.29.190.147
23/10/24 12:00:42 INFO Executor: OS info Linux, 
5.15.123.1-microsoft-standard-WSL2, amd64
23/10/24 12:00:42 INFO Executor: Java version 1.8.0_372
23/10/24 12:00:42 INFO Utils: Successfully started service 
'org.apache.spark.network.netty.NettyBlockTransferService' on port 35227.
23/10/24 12:00:42 INFO NettyBlockTransferService: Server created on 
172.29.190.147:35227
23/10/24 12:00:42 INFO BlockManager: Using 
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication 
policy
23/10/24 12:00:42 ERROR Inbox: An error happened while processing message in 
the inbox for Executor
java.lang.NoClassDefFoundError: 
org/sparkproject/guava/util/concurrent/internal/InternalFutureFailureAccess
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.ClassLoader.d