Hello,
I have a large data calculation in Spark, distributed across several
nodes. At the end, I want to write the result to a single output file.
To do this I call:
output.coalesce(1, false).saveAsTextFile(filename)
What happens is all the data from the workers flows to a single worker, and
that one