Hello, I can usually run spark-submit jobs in local mode on a 32-core node with plenty of RAM. Performance is significantly better in local mode than on a cluster of Spark workers.
How can this be explained, and what can be done to improve performance on the cluster? Typically, a job that takes 35 seconds in local mode takes around 48 seconds on a small cluster. Thanks, Saif