On 18 Dec 2016, at 19:50, joa...@verona.se wrote:
Since each Spark worker node needs to access the same files, we have tried using HDFS. This worked, but there were some oddities making me a bit uneasy. For dependency hell reasons I compiled a modified Spark, and this version exhibited the odd behaviour with HDFS. The problem might have nothing to do with HDFS, but the situation made me curious about the alternatives.

What were the oddities?