Re: automatically/dynamically renew AWS temporary token

2023-10-23 Thread Pol Santamaria
renew. Best, *Pol Santamaria* On Mon, Oct 23, 2023 at 8:08 AM Jörn Franke wrote: > Can’t you attach the cross-account permission to the Glue job role? Why > the detour via AssumeRole? > > AssumeRole can make sense if you use an AWS IAM user and STS > authentication, but this would
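
A minimal sketch of the renewal pattern under discussion, assuming boto3 and a hypothetical role ARN; as Jörn notes above, attaching the cross-account permission directly to the Glue job role avoids the detour entirely:

    import boto3
    from datetime import datetime, timedelta, timezone

    # Hypothetical role ARN, for illustration only.
    ROLE_ARN = "arn:aws:iam::123456789012:role/cross-account-role"

    sts = boto3.client("sts")
    _cache = {"creds": None}

    def get_credentials():
        """Return temporary credentials, renewing them shortly before expiry."""
        creds = _cache["creds"]
        if creds is None or datetime.now(timezone.utc) >= creds["Expiration"] - timedelta(minutes=5):
            resp = sts.assume_role(
                RoleArn=ROLE_ARN,
                RoleSessionName="spark-job",
                DurationSeconds=3600,
            )
            _cache["creds"] = creds = resp["Credentials"]
        return creds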

Re: Spark-SQL - Concurrent Inserts Into Same Table Throws Exception

2023-07-30 Thread Pol Santamaria
that will work better in different use cases according to the writing pattern, type of queries, data characteristics, etc. *Pol Santamaria* On Sat, Jul 29, 2023 at 4:28 PM Mich Talebzadeh wrote: > It is not Spark SQL that throws the error. It is the underlying database > or layer that
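
As the thread suggests, the concurrency guarantees come from the underlying storage layer, not Spark SQL itself. One common way to sidestep conflicting inserts, sketched below with a hypothetical table "events" assumed to already exist and be partitioned by dt, is to have concurrent writers target disjoint partitions with dynamic partition overwrite:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Overwrite only the partitions present in the incoming data, so two
    # jobs writing different date partitions do not clobber each other.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    # Hypothetical dataframe; insertInto matches columns by position,
    # so the partition column dt goes last.
    df = spark.createDataFrame([("click", "2023-07-29")], ["event", "dt"])
    df.write.mode("overwrite").insertInto("events")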

Re: class instance variable in PySpark used in lambda function

2021-12-15 Thread Pol Santamaria
It can be solved in different ways, for instance by avoiding the use of "self" in the map, as you did in the last snippet, or by saving the Spark context/session in a different class than "numRows". Bests, Pol Santamaria On Wed, Dec 15, 2021 at 12:24 PM Mich Talebzadeh wrote
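
A self-contained sketch of the first workaround (copying the attribute to a local variable so the closure does not capture "self"); the class shape is hypothetical, loosely named after the "numRows" mentioned above:

    from pyspark.sql import SparkSession

    class NumRows:
        def __init__(self, spark, multiplier):
            self.spark = spark          # not serializable: holds the SparkSession
            self.multiplier = multiplier

        def scaled(self, rdd):
            # Copy the attribute to a local variable so the lambda captures
            # only an int, not `self` (which would drag the SparkSession
            # into the serialized task closure and fail).
            m = self.multiplier
            return rdd.map(lambda x: x * m)

    spark = SparkSession.builder.getOrCreate()
    rdd = spark.sparkContext.parallelize([1, 2, 3])
    print(NumRows(spark, 10).scaled(rdd).collect())  # [10, 20, 30]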

Re: Loading Hadoop-Azure in Kubernetes

2021-04-20 Thread Pol Santamaria
Okay! That is a good approach too; I'm happy you got it working. Cheers, Pol Santamaria On Tue, Apr 20, 2021 at 10:23 AM Nisd wrote: > I ended up with this solution: > https://stackoverflow.com/a/67173899/1020941 > > > > -- > Sent from: http://apache-spark-user-

Re: Loading Hadoop-Azure in Kubernetes

2021-04-16 Thread Pol Santamaria
ativeAzureFileSystem spark.hadoop.fs.AbstractFileSystem.wasb.impl org.apache.hadoop.fs.azure.Wasb Good luck, Pol Santamaria On Fri, Apr 16, 2021 at 3:40 PM Nick Stenroos-Dam wrote: > Hello > > > > I am trying to load the Hadoop-Azure driver in Apache Spark, but so far I > have failed. > > The
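
The pair of settings the snippet shows, as they might look when set programmatically; the account, container, and key are placeholders, and the hadoop-azure jar (plus its azure-storage dependency) is assumed to be on the classpath, e.g. via spark.jars.packages:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("azure-wasb")
        # Filesystem implementations provided by hadoop-azure.
        .config("spark.hadoop.fs.wasb.impl",
                "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
        .config("spark.hadoop.fs.AbstractFileSystem.wasb.impl",
                "org.apache.hadoop.fs.azure.Wasb")
        # Placeholder storage account name and key, for illustration only.
        .config("spark.hadoop.fs.azure.account.key.MYACCOUNT.blob.core.windows.net",
                "<storage-account-key>")
        .getOrCreate()
    )

    df = spark.read.text("wasb://mycontainer@MYACCOUNT.blob.core.windows.net/path")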

Re: NoClassDefFoundError: scala/Product$class

2020-06-03 Thread Pol Santamaria
Hi Charles, I believe Spark 3.0 removed support for Scala 2.11, and that error is a version compatibility issue. You should try Spark 2.4.5 with your current setup (it works with Scala 2.11 by default). Pol Santamaria On Wed, Jun 3, 2020 at 7:44 AM charles_cai <1620075...@qq.com> wrote:
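
One quick way to confirm which Scala version a running Spark build targets, sketched from PySpark through the (internal) Py4J gateway; Spark 3.x should report 2.12, while Spark 2.4.x typically reports 2.11:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    # Ask the JVM which Scala version Spark was compiled against,
    # e.g. "version 2.12.10" on Spark 3.0.
    print(spark.sparkContext._jvm.scala.util.Properties.versionString())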

Re: Spark driver thread

2020-03-06 Thread Pol Santamaria
I totally agree with Russell. In my opinion, the best way is to experiment and take measurements. There are different chips, some with multithreading and some without, as well as different system setups... so I'd recommend playing with the 'spark.driver.cores' option. Bests, Pol San
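
A minimal sketch of that experiment, with an illustrative core count; note that spark.driver.cores is only honored in cluster mode, where the cluster manager sizes the driver container:

    from pyspark.sql import SparkSession

    # Illustrative value: rerun the same workload with 1, 2, 4, ... driver
    # cores and compare wall-clock times. Only honored in cluster mode.
    spark = (
        SparkSession.builder
        .appName("driver-cores-experiment")
        .config("spark.driver.cores", "2")
        .getOrCreate()
    )
    print(spark.conf.get("spark.driver.cores"))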

Re: Spark driver thread

2020-03-06 Thread Pol Santamaria
you do computation or I/O on the driver side, you should explore using multiple threads and more than 1 vCPU. Bests, Pol Santamaria On Fri, Mar 6, 2020 at 1:28 AM James Yu wrote: > Hi, > > Does a Spark driver always work as single threaded? > > If yes, does it mean asking for more th
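
A sketch of that kind of driver-side multithreading: several Spark jobs submitted concurrently from driver threads (the Spark scheduler accepts concurrent submissions), which is where an extra driver vCPU can pay off. The workload is hypothetical:

    from concurrent.futures import ThreadPoolExecutor
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    def count_range(n):
        # Each call triggers an independent Spark job; job submission
        # from multiple driver threads is safe.
        return spark.sparkContext.parallelize(range(n)).count()

    # Run several jobs in parallel from the driver, overlapping the
    # per-job submission and result-handling work.
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(count_range, [10_000, 20_000, 30_000])))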