I use Scala, since it was the language I was most familiar with out of the
three when I started using Spark. I would say that learning Scala with no
functional programming background is somewhat challenging, but well worth
it if you have the time. As others have pointed out, using the REPL and
Hi,
I'm looking for suggestions on the ideal number of executors per machine. I
run my jobs on 64 GB, 32-core machines, and at the moment I have one
executor running per machine on the Spark standalone cluster.
I could not find many guidelines for figuring out the ideal number of
executors; the
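For what it's worth, one common way to get more than one executor per machine on a standalone cluster is to run multiple worker instances per host via conf/spark-env.sh. This is only a sketch; the specific counts and sizes below are made-up numbers for a 64 GB / 32-core box, not a recommendation:

```shell
# conf/spark-env.sh on each worker host (illustrative values only)

# Run 4 worker JVMs per machine instead of 1
export SPARK_WORKER_INSTANCES=4

# Give each worker a slice of the machine: 8 cores and 14g,
# leaving headroom for the OS and other daemons
export SPARK_WORKER_CORES=8
export SPARK_WORKER_MEMORY=14g
```

Each worker then launches its own executor per application, so an app using the whole cluster gets 4 smaller executors per machine rather than one large one.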
Hi,
I keep getting some variation of the following error:
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output
location for shuffle 2
Does anyone know what this might indicate? Is it a memory issue? Any
general guidance appreciated.
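Not a definitive answer, but in my experience this error often shows up when the executor that wrote the shuffle output has died (frequently from memory pressure), so the fetching side can no longer find it. A couple of settings people commonly experiment with, shown here purely as a sketch with assumed values:

```shell
# Illustrative spark-submit flags only; values are assumptions, not tuned advice
spark-submit \
  --executor-memory 8g \
  --conf spark.network.timeout=300s \
  --class com.example.MyJob myjob.jar
```

Checking the executor logs on the workers for OutOfMemoryError or lost-executor messages is usually the quickest way to confirm whether memory is the cause.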