Perhaps your RDD does not have enough partitions to utilize all the cores
in your system.

Could you post a simple code snippet and describe what degree of
parallelism you are seeing for it? And can you report how many partitions
your RDDs have?
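
For reference, here is a minimal sketch of how to check and increase an
RDD's partitioning (this assumes a live SparkContext named `sc`; the input
path is just a placeholder):

  // The number of partitions caps how many tasks a stage can run at once.
  val rdd = sc.textFile("hdfs:///path/to/data")  // placeholder path
  println(s"partitions: ${rdd.partitions.length}")

  // If that number is below the total core count, repartition so every
  // core gets a task (note: repartition triggers a shuffle).
  val repartitioned = rdd.repartition(sc.defaultParallelism * 2)
  println(s"after repartition: ${repartitioned.partitions.length}")

A common rule of thumb is 2-3 tasks per core; sc.defaultParallelism is
used above only as a convenient baseline.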

On Mon, Oct 20, 2014 at 3:53 PM, Daniel Mahler <dmah...@gmail.com> wrote:

>
> I am launching EC2 clusters using the spark-ec2 scripts.
> My understanding is that this configures Spark to use the available
> resources.
> I can see that Spark will use the available memory on larger instance
> types.
> However, I have never seen Spark running at more than 400% CPU (100% on
> each of 4 cores) on machines with many more cores.
> Am I misunderstanding the docs? Is it just that high-end EC2 instances get
> I/O starved when running Spark? It would be strange if that consistently
> produced a hard 400% limit, though.
>
> thanks
> Daniel
>
