I have a happy, healthy Mesos cluster (0.24) running in my lab.  I've
compiled spark-1.5.0 and it seems to be working fine, except for one small
issue: all of my tasks seem to run on a single node (I have 6 in the
cluster).

Basically, I have a directory of compressed text files.  Compressed, these
25 files add up to 1.2 GB of data.  In bin/pyspark I do:

txtfiles = sc.textFile("/path/to/my/data/*")
txtfiles.count()

This goes through and gives me the correct count, but all 25 tasks run on
one node, let's call it node4.

Interesting.
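
In case it's relevant, here's the kind of sanity check I mean, continuing
the pyspark session above (getNumPartitions() is stock PySpark; the
repartition() at the end is purely illustrative, not something I actually
need):

# How many partitions did textFile() create? 25 gzipped files should
# give 25 partitions, since gzip files aren't splittable.
print(txtfiles.getNumPartitions())

# Illustrative only: forcing a reshuffle should spread work across
# executors -- but with only one executor registered, it can't help.
print(txtfiles.repartition(24).count())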

Now, I was running spark from node4, but I would have thought it would
still hit up more nodes than that.

So I ran it from node5 instead.  In the Executors tab of the Spark UI there
is only one executor registered, and it's node4, and once again all tasks
ran on node4.

I am running in fine-grained mode... is there a setting somewhere to allow
for more executors? This seems weird. I've been away from Spark since
1.2.x, but I don't remember this behavior...
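
For reference, these are the knobs I'm aware of (a sketch of my
conf/spark-defaults.conf; the host and values are placeholders, and I'm not
sure which of these even apply in fine-grained mode):

# conf/spark-defaults.conf -- illustrative values only
spark.master            mesos://mesos-master:5050   # placeholder host
spark.mesos.coarse      false   # fine-grained, the default in 1.5
spark.cores.max         24      # cluster-wide cap, not per-node
spark.executor.memory   4g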
