Hi,

I am running Spark in standalone mode.

1) I have a 286MB file in HDFS (block size is 64MB), so it is split into
5 blocks. When the file is in HDFS, 5 tasks are generated and so there are 5
files in the output. My understanding is that there is a separate
partition for each block and a separate task for each partition, which
explains why I see 5 files in the output.
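For reference, here is the minimal spark-shell check I am using (the
paths are placeholders for my actual ones):

  // Hypothetical paths; run in spark-shell against the standalone master.
  val hdfsRdd = sc.textFile("hdfs://namenode:8020/data/input.txt")
  println(hdfsRdd.partitions.length)  // expect 5: ceil(286MB / 64MB HDFS block size)
  hdfsRdd.saveAsTextFile("hdfs://namenode:8020/data/out")  // one part-NNNNN file per partition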

When I put the same file on the local file system (not HDFS), I see 9 files
in the output. I am curious why it is 9.
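The same check against the local copy also reports 9 partitions. My guess
is that splits for local files come from Hadoop's FileInputFormat using the
32MB default local block size (fs.local.block.size), which would give
ceil(286MB / 32MB) = 9, but I would like to confirm:

  // Hypothetical path; same spark-shell session.
  val localRdd = sc.textFile("file:///data/input.txt")
  println(localRdd.partitions.length)  // 9 here; likely ceil(286MB / 32MB local block size)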

2) With the file in HDFS and on the local file system, I see a single
CoarseGrainedExecutorBackend when I run the jps command. Why is there only
one executor process, and how do we configure the number of executor
processes?
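For context, these are the knobs I have been experimenting with (values
are hypothetical, and I understand spark.executor.cores only affects
standalone mode in newer releases, while SPARK_WORKER_INSTANCES in
spark-env.sh can also raise the number of workers, and hence executors,
per node):

  import org.apache.spark.{SparkConf, SparkContext}

  // Hypothetical master URL and values; standalone mode launches executors
  // per worker, bounded by the total and per-executor core settings below.
  val conf = new SparkConf()
    .setAppName("partition-count-test")
    .setMaster("spark://master:7077")
    .set("spark.cores.max", "8")        // total cores the app may claim
    .set("spark.executor.cores", "2")   // cores per executor process
  val sc = new SparkContext(conf)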

Thanks,
Praveen