spark schema conflict behavior records being silently dropped

2023-10-24 Thread Carlos Aguni
hi all, i noticed a weird behavior when spark parses nested json with a schema conflict. i also just noticed that spark "fixed" this in the most recent release, 3.5.0, but since i'm working with AWS services, namely: * EMR 6: spark 3.3.*, spark 3.4.* * Glue 3: spark 3.1.1 * Glue 4: spark 3.3.0

Re: automatically/dynamically renew aws temporary token

2023-10-24 Thread Carlos Aguni
hi all, thank you for your reply. > Can’t you attach the cross account permission to the glue job role? Why the detour via AssumeRole? yes Jorn, i also believe this is the best approach, but here we're dealing with company policies and all the bureaucracy that comes along with them. in parallel i'm

Re: Maximum executors in EC2 Machine

2023-10-24 Thread Riccardo Ferrari
Hi, I would refer to their documentation to better understand the concepts behind cluster overview and submitting applications: - https://spark.apache.org/docs/latest/cluster-overview.html#cluster-manager-types - https://spark.apache.org/docs/latest/submitting-applications.html When

submitting tasks failed in Spark standalone mode due to missing failureaccess jar file

2023-10-24 Thread eab...@163.com
Hi Team. I started a Spark 3.5.0 cluster with start-master.sh and start-worker.sh. When I run ./bin/spark-shell --master spark://LAPTOP-TC4A0SCV.:7077, I get the following error logs: ``` 23/10/24 12:00:46 ERROR TaskSchedulerImpl: Lost an executor 1 (already removed): Command exited with code