Hi All,

I have a directory containing 12 files. I want to read each file in its
entirety, so I am reading them with wholeTextFiles(dirpath, numPartitions).

I run spark-submit with <all other stuff> --num-executors 12 --executor-cores 1
and numPartitions = 12.
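
For reference, here is a minimal sketch of what I am doing (the HDFS path and
SparkContext setup are hypothetical placeholders, not my actual job):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical setup; my real job passes the usual spark-submit options.
val conf = new SparkConf().setAppName("WholeFilesRead")
val sc = new SparkContext(conf)

// The second argument is minPartitions, which is only a hint:
// wholeTextFiles packs files into combined splits by total size,
// so Spark may create fewer partitions than requested.
val files = sc.wholeTextFiles("hdfs:///data/mydir", 12)
println(files.getNumPartitions)  // I see 8 here, not 12
```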

However, when I run the job, I see that the stage which reads the directory has
only 8 tasks. So some tasks read more than one file and take twice as long.

What can I do so that the files are read by 12 tasks, i.e. one file per task?

Thanks,
Pradeep

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org