Hello there,

I'm having some trouble with my Spark cluster, which consists of master.censored.dev and spark-worker-0. Judging by the output of pyspark, the master, and the worker node, the cluster forms correctly and pyspark connects to it. But for some reason, nothing happens after "TaskSchedulerImpl: Adding task set". Why is this, and how can I investigate it further? I haven't found any clues in the web UI.

The program output is as follows:

pyspark: https://gist.githubusercontent.com/PureW/ebe1b95b9b4814fc2533/raw/e2d08b7b6288afad3cb03238acc3d172291166d8/pyspark+log
master: https://gist.githubusercontent.com/PureW/9889bc9b57a8406599df/raw/4b1faeda8bacff06b5c3a32d75e74ef114933504/Spark-master
worker: https://gist.githubusercontent.com/PureW/7451cd5ed6780f4d1e33/raw/f45971bd1e6cba620db566998a9afd035ea8d529/spark-worker

The code I am running through pyspark can be seen at https://gist.github.com/PureW/2c9603bdf1ef2ae772f3

Earlier, when the worker node couldn't access the data, it raised an exception, but now there's no output at all. I've also run the same code locally, where it finishes in only about 15 seconds.

Thanks for any help!

/Anders