spark+mesos: configure mesos 'callback' port?

2014-05-15 Thread Scott Clasen
Is anyone aware of a way to configure the Mesos GroupProcess port on the Mesos slave/task, which the Mesos master calls back on? The log line that shows this port looks like the one below (Mesos 0.17.0): I0507 02:37:20.893334 11638 group.cpp:310] Group process ((2)@1.2.3.4:54321) connected to ZooKeeper.
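To see which port the group process actually picked on a given run, the endpoint can be pulled out of that log line. This is only a sketch: the regular expression is inferred from the single sample line quoted above, the function name is made up for illustration, and other Mesos versions may format the line differently.

```python
import re

# Sample line copied from the message above (Mesos 0.17.0).
LOG_LINE = (
    "I0507 02:37:20.893334 11638 group.cpp:310] "
    "Group process ((2)@1.2.3.4:54321) connected to ZooKeeper."
)

# Matches the "((id)@host:port)" part printed for the group process.
# Assumption: the host is printed as a dotted IPv4 address, as in the sample.
GROUP_PID = re.compile(r"Group process \(\((\d+)\)@([\d.]+):(\d+)\)")


def group_process_endpoint(line):
    """Return (host, port) from a 'Group process' log line, or None."""
    match = GROUP_PID.search(line)
    if match is None:
        return None
    return match.group(2), int(match.group(3))


print(group_process_endpoint(LOG_LINE))
```

Running this over the slave logs from several restarts would show whether the port changes from run to run, which is what makes it hard to open in a firewall or security group.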

Re: is Mesos falling out of favor?

2014-05-15 Thread Scott Clasen
Curious what the bug is and what it breaks? I have Spark 0.9.0 running on Mesos 0.17.0 and it seems to work correctly.

Comprehensive Port Configuration reference?

2014-05-05 Thread Scott Clasen
Is it documented anywhere how one would go about configuring every open port a Spark application needs? This seems like one of the main things that makes running Spark hard in places like EC2, where you aren't using the canned Spark scripts. Starting an app looks like you'll see ports open for

Re: Spark Streaming + Kafka + Mesos/Marathon strangeness

2014-03-27 Thread Scott Clasen
I now think this is because spark.local.dir is defaulting to /tmp, and since the tasks are not running on the same machine, the file is not found when the second task takes over. How do you set spark.local.dir appropriately when running on Mesos?

KafkaInputDStream mapping of partitions to tasks

2014-03-27 Thread Scott Clasen
I have a simple streaming job that creates a Kafka input stream on a topic with 8 partitions, and does a foreachRDD. The job and tasks are running on Mesos, and there are two tasks running, but only 1 task doing anything. I also set spark.streaming.concurrentJobs=8 but still there is only 1 task

Re: Spark Streaming + Kafka + Mesos/Marathon strangeness

2014-03-27 Thread Scott Clasen
Heh, sorry, that wasn't a clear question. I know 'how' to set it but don't know what value to use in a Mesos cluster, since the processes are running in LXC containers and won't be sharing a filesystem (or machine, for that matter). I can't use an s3n:// URL for local dir, can I?
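For reference, spark.local.dir is scratch space on each machine's own disk and may be a comma-separated list of directories, so the same path is resolved separately inside every container rather than shared between them. A rough sketch of a sanity check for candidate values follows; the helper name and the /mnt/spark-scratch path are made up for illustration and are not part of Spark.

```python
from urllib.parse import urlparse


def is_local_dir_value(value):
    """True if every comma-separated entry is an absolute local path."""
    entries = [entry.strip() for entry in value.split(",") if entry.strip()]
    if not entries:
        return False
    # A URL such as s3n://bucket/tmp parses with a non-empty scheme,
    # while a plain filesystem path has none.
    return all(
        urlparse(entry).scheme == "" and entry.startswith("/")
        for entry in entries
    )


print(is_local_dir_value("/tmp"))                     # default mentioned in the thread
print(is_local_dir_value("/mnt/spark-scratch,/tmp"))  # hypothetical multi-disk value
print(is_local_dir_value("s3n://bucket/tmp"))         # remote URL
```

This only checks the shape of the value. It does not explain the "block not found" failure in this thread, which may have a different cause.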

Re: KafkaInputDStream mapping of partitions to tasks

2014-03-27 Thread Scott Clasen
Actually, looking closer, it is stranger than I thought: in the Spark UI, one executor has executed 4 tasks, and one has executed 1928. Can anyone explain the workings of a KafkaInputDStream with respect to Kafka partitions and their mapping to Spark executors and tasks?

Re: KafkaInputDStream mapping of partitions to tasks

2014-03-27 Thread Scott Clasen
Evgeniy Shishkin wrote: "So, at the bottom, kafka input stream just does not work." That was the conclusion I was coming to as well. Are there open tickets around fixing this up?

Re: KafkaInputDStream mapping of partitions to tasks

2014-03-27 Thread Scott Clasen
Thanks everyone for the discussion. Just to note, I restarted the job yet again, and this time there are indeed tasks being executed by both worker nodes. So the behavior does seem inconsistent/broken at the moment. Then I added a third node to the cluster, and a third executor came up, and everything

Re: Spark Streaming + Kafka + Mesos/Marathon strangeness

2014-03-26 Thread Scott Clasen
The web UI shows 3 executors: the driver and one Spark task on each worker. I do see that there were 8 successful tasks and the ninth failed like so: java.lang.Exception (java.lang.Exception: Could not compute split, block input-0-1395860790200 not found)