Hi,

Is there a way to read a large file in a parallel/distributed way? I have a
single large binary file which I currently read in the driver program and
then distribute to the executors (using groupBy(), etc.). I want to know if
there's a way to make each executor read a specific, unique portion of the
file, or to create RDDs from multiple portions of the file and finally union
them.
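
To make the idea concrete, here is a rough sketch of what I have in mind. It
is not something I have running on a cluster. The helper names split_ranges
and read_range are made up for illustration, and the Spark part (shown only
in comments) assumes the file is readable at the same path on every executor,
for example via a shared mount.

```python
def split_ranges(total_size, num_parts):
    """Return non-overlapping (offset, length) pairs covering the whole file."""
    base, extra = divmod(total_size, num_parts)
    ranges = []
    offset = 0
    for i in range(num_parts):
        # Spread the remainder over the first `extra` parts.
        length = base + (1 if i < extra else 0)
        if length > 0:
            ranges.append((offset, length))
        offset += length
    return ranges


def read_range(path, offset, length):
    """Read one portion of the file; this is what each executor would run."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Hypothetical Spark usage, assuming `sc` is an existing SparkContext and
# `path` is visible to all executors:
#
#   import os
#   ranges = split_ranges(os.path.getsize(path), 8)
#   chunks = sc.parallelize(ranges, len(ranges)) \
#              .map(lambda r: read_range(path, r[0], r[1]))
```

Each element of the resulting RDD would then hold the bytes of one portion,
so the driver never has to load the whole file. This ignores record
boundaries: if a record can straddle two portions, the ranges would need to
be aligned to the record size.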

Thanks.
