Re: [Spark Core] makeRDD() preferredLocations do not appear to be considered

2020-09-12 Thread Tom Scott
It turned out the issue was with my environment, not Spark. In case anyone else runs into this: the Spark workers were not using the machine hostname by default. Setting the following environment variable on each worker fixed it: SPARK_LOCAL_HOSTNAME: "worker1" (and the corresponding name on each of the other workers).
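
For anyone who sets worker environment through a shell script rather than a key/value environment block, the same setting might look roughly like the sketch below. It assumes the variable is exported from conf/spark-env.sh (or otherwise set in the worker's environment before the worker process starts) and that "worker1" is the host name passed as a preferred location to makeRDD(); both are assumptions about the setup, not details confirmed in this thread.

```shell
# Sketch only: place in conf/spark-env.sh on the first worker (assumed location).
# The value should be the same host name string used in the preferred locations,
# so use "worker2", "worker3", and so on for the other workers.
export SPARK_LOCAL_HOSTNAME="worker1"
```

The worker has to be restarted after the change so that it registers under the new name.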

[Spark Core] makeRDD() preferredLocations do not appear to be considered

2020-09-08 Thread Tom Scott
Hi Guys, I asked this on Stack Overflow here: https://stackoverflow.com/questions/63535720/why-would-preferredlocations-not-be-enforced-on-an-empty-spark-cluster but am hoping to get further help here. I have a 4-node standalone cluster with workers named worker1, worker2 and worker3 and a