It is the same file on every node; the Hadoop library that we use for
splitting takes care of assigning the right split to each node.
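
As a rough illustration of the idea (a simplified sketch, not Hadoop's
actual code): the one file is divided into byte ranges, and each node reads
only the ranges assigned to it from its own local copy. The function names
compute_splits and assign_splits, the sizes, and the round-robin assignment
are all hypothetical; the real library also aligns splits to record
boundaries and takes data locality into account.

```python
def compute_splits(file_size, split_size):
    """Return (offset, length) byte ranges that together cover the file."""
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


def assign_splits(splits, nodes):
    """Hand out splits to nodes round-robin (an assumed, simplified policy)."""
    assignment = {node: [] for node in nodes}
    for i, split in enumerate(splits):
        assignment[nodes[i % len(nodes)]].append(split)
    return assignment


# Hypothetical numbers: a 1000-byte file, 300-byte splits, three nodes.
splits = compute_splits(file_size=1000, split_size=300)
plan = assign_splits(splits, ["node1", "node2", "node3"])

for node, ranges in plan.items():
    print(node, ranges)
```

Every node holds the whole file, but no byte range is processed twice, which
is why you do not need to cut the file into parts yourself.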

Prashant Sharma


On Thu, Apr 24, 2014 at 1:36 PM, Carter <gyz...@hotmail.com> wrote:

> Thank you very much for your help Prashant.
>
> Sorry I still have another question about your answer: "however if the
> file("/home/scalatest.txt") is present on the same path on all systems it
> will be processed on all nodes."
>
> When placing the file at the same path on all nodes, do we simply copy
> the same file to all nodes, or do we need to split the original file
> into different parts (each part keeping the same file name
> "scalatest.txt") and copy each part to a different node for
> parallelization?
>
> Thank you very much.
