Also, if not already done, you may want to try repartitioning your data to 50
partitions.
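For example, something along these lines (a rough sketch in Scala; sc is your
SparkContext and the input path is just a placeholder):

    // Rough sketch: force more parallelism by repartitioning the input.
    // Assumes sc is an existing SparkContext; the path is a placeholder.
    val lines = sc.textFile("hdfs://.../input.txt")
    println(lines.partitions.length)       // often small for a small input

    val repartitioned = lines.repartition(50)   // shuffle into 50 partitions
    // Later stages can now run up to 50 tasks in parallel.
    val totalChars = repartitioned.map(_.length).reduce(_ + _)

The number of tasks per stage is bounded by the number of partitions, so with
only a few input partitions the extra cores will sit idle regardless of
cluster size.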
On 6 May 2015 05:56, "Manu Kaul" <manohar.k...@gmail.com> wrote:

> Hi All,
> For a job I am running on Spark with a dataset of say 350,000 lines (not
> big), I am finding that even though my cluster has a large number of cores
> available (like 100 cores), Spark seems to stop scaling after using just
> 4 cores, and beyond that the runtime stays pretty much flat no matter
> how many more cores are thrown at it. I am wondering if Spark tries to
> figure out the maximum no. of cores to use based on the size of the
> dataset? If yes, is there a way to disable this feature and force it to use
> all the cores available?
>
> Thanks,
> Manu
>
> --
>
> The greater danger for most of us lies not in setting our aim too high and
> falling short; but in setting our aim too low, and achieving our mark.
> - Michelangelo
>
