Re: How do you force a Spark Application to run in multiple tasks

2014-11-17 Thread Daniel Siegmann
I've never used Mesos, sorry.

Re: How do you force a Spark Application to run in multiple tasks

2014-11-14 Thread Steve Lewis
The cluster runs Mesos and I can see the tasks in the Mesos UI, but most are not doing much. Any hints about that UI?

Re: How do you force a Spark Application to run in multiple tasks

2014-11-14 Thread Daniel Siegmann
Most of the information you're asking for can be found on the Spark web UI. You can see which tasks are being processed by which nodes. If you're using HDFS and your file size is smaller than the HDFS block size, you will only have one partition, and therefore only one task.
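
Daniel's point can be sketched in a few lines. This is a minimal illustration, not code from the thread: the input path and the partition count of 8 are hypothetical, and the exact partition count you observe depends on your HDFS block size and input format.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PartitionSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("partition-sketch"))

    // A file smaller than one HDFS block yields a single partition,
    // so the whole job runs as a single task on one executor.
    val lines = sc.textFile("hdfs:///data/input.txt")
    println(lines.partitions.length) // likely 1 for a small file

    // Ask for a minimum number of partitions at read time...
    val atRead = sc.textFile("hdfs:///data/input.txt", minPartitions = 8)

    // ...or split an existing RDD after the fact (this incurs a shuffle).
    val reparted = lines.repartition(8)
    println(reparted.partitions.length) // 8
  }
}
```

Requesting `minPartitions` at read time avoids the extra shuffle that `repartition` triggers, so it is usually the cheaper option when you control the load.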

How do you force a Spark Application to run in multiple tasks

2014-11-14 Thread Steve Lewis
I have instrumented word count to track how many machines the code runs on. I use an accumulator to maintain a Set of MacAddresses. I find that everything is done on a single machine. This is probably optimal for word count, but not for the larger problems I am working on. How do I force processing to run in multiple tasks across the cluster?
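
The kind of instrumentation Steve describes can be sketched with an accumulable collection, which Spark 1.x provided for exactly this sort of side-channel bookkeeping. This is an illustrative reconstruction, not his actual code: it records executor hostnames rather than MAC addresses, and the input path and partition count are hypothetical.

```scala
import java.net.InetAddress
import scala.collection.mutable
import org.apache.spark.{SparkConf, SparkContext}

object HostTracker {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("host-tracker"))

    // Accumulable set of the hostnames that actually executed tasks.
    val hosts = sc.accumulableCollection(mutable.HashSet[String]())

    val counts = sc.textFile("hdfs:///data/input.txt", minPartitions = 8)
      .flatMap(_.split("\\s+"))
      .map { word =>
        // Record which machine processed this record.
        hosts += InetAddress.getLocalHost.getHostName
        (word, 1)
      }
      .reduceByKey(_ + _)

    counts.count() // force evaluation so the accumulator is populated
    println(s"Work ran on ${hosts.value.size} distinct host(s)")
  }
}
```

Note that accumulators are only reliable when read on the driver after an action completes; if the set still contains one hostname after forcing more partitions, the data may simply be co-located on a single node.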