Hi,

What's "<all other stuff>"? What master URL do you use?

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Tue, Jul 26, 2016 at 2:18 AM, Mail.com <pradeep.mi...@mail.com> wrote:
> Hi All,
>
> I have a directory that contains 12 files. I want to read each file in its
> entirety, so I am reading the directory with wholeTextFiles(dirpath, numPartitions).
>
> I run spark-submit as <all other stuff> --num-executors 12 --executor-cores 1,
> with numPartitions set to 12.
>
> However, when I run the job I see that the stage which reads the directory
> has only 8 tasks. So some tasks read more than one file and take twice as
> long.
>
> What can I do so that the files are read by 12 tasks, i.e. one file per task?
>
> Thanks,
> Pradeep
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>

