If the file is not splittable (though can I assume a log file is
splittable?), can you advise on how Spark handles such a case? If Spark
can't, what is the widely used practice?
On 3 Sep 2016 7:29 pm, "Raghavendra Pandey"
wrote:
If your file format is splittable, say TSV, CSV, etc., it will be distributed
across all executors.
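
To make "splittable" concrete, here is a rough pure-Python sketch (not Spark code) of how a line-oriented text file can be cut into byte ranges and still yield every line exactly once. It mimics, as far as I understand it, the convention used by Hadoop-style line readers: a reader that does not start at byte 0 skips its first line, and every reader finishes the line that crosses its end boundary. The names make_splits and read_split and the sample data are made up for illustration.

```python
def make_splits(total_size, split_size):
    """Cut [0, total_size) into fixed-size byte ranges (hypothetical helper)."""
    return [(s, min(s + split_size, total_size))
            for s in range(0, total_size, split_size)]


def read_split(data, start, end):
    """Return the lines owned by the byte range [start, end)."""
    pos = start
    if start != 0:
        # Skip the first line: the reader of the previous range owns it.
        nl = data.find(b"\n", start)
        pos = len(data) if nl == -1 else nl + 1
    lines = []
    # A line belongs to this range if it starts at or before `end`.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        nxt = len(data) if nl == -1 else nl + 1
        lines.append(data[pos:nxt].rstrip(b"\n"))
        pos = nxt
    return lines


if __name__ == "__main__":
    data = b"a,1\nbb,2\nccc,3\ndddd,4\n"  # made-up CSV content
    for start, end in make_splits(len(data), 7):
        print((start, end), read_split(data, start, end))
```

Each byte range can be handed to a different task, which is why plain TSV/CSV parallelizes well. A format where you cannot find a record boundary from an arbitrary byte offset (a gzip-compressed file, for example) cannot be cut this way and ends up being read by a single task.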
On Sat, Sep 3, 2016 at 3:38 PM, Somasundaram Sekar <
somasundar.se...@tigeranalytics.com> wrote:
Hi All,
Would like to gain some understanding on the questions listed below,
1. When processing a large file with Apache Spark using, say,
sc.textFile("somefile.xml"), does Spark split it for parallel processing
across executors, or will it be processed as a single chunk in a single