It looks like I was running into
https://issues.apache.org/jira/browse/SPARK-2204
The issue went away when I switched to coarse-grained Mesos mode (spark.mesos.coarse).
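
In case it helps anyone else, this is roughly how I turned it on (a minimal
sketch; the app name and Mesos master URL are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

// Run Spark on Mesos in coarse-grained mode rather than fine-grained mode.
val conf = new SparkConf()
  .setAppName("multi-response-regression")       // placeholder app name
  .setMaster("mesos://master.example.com:5050")  // placeholder Mesos master
  .set("spark.mesos.coarse", "true")
val sc = new SparkContext(conf)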

Kyle


On Fri, Jun 20, 2014 at 10:36 AM, Kyle Ellrott <kellr...@soe.ucsc.edu>
wrote:

> I've tried to parallelize the separate regressions using
> allResponses.toParArray.map(x => do logistic regression against labels in x).
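>
> Roughly, the parallel attempt looks like this (a simplified sketch, not my
> exact code; the Long key, the (id, label)/(id, features) layout, and the
> 100 iterations are placeholder assumptions):
>
> import scala.collection.parallel._                 // for toParArray
> import org.apache.spark.SparkContext._             // pair-RDD implicits (join) on Spark 1.x
> import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithSGD}
> import org.apache.spark.mllib.linalg.Vector
> import org.apache.spark.mllib.regression.LabeledPoint
> import org.apache.spark.rdd.RDD
>
> def trainAllInParallel(
>     trainingData: RDD[(Long, Vector)],             // shared features, keyed by example id
>     allResponses: Seq[RDD[(Long, Double)]]         // one label set per response
> ): Seq[LogisticRegressionModel] = {
>   def trainOne(labels: RDD[(Long, Double)]): LogisticRegressionModel = {
>     val currentRDD: RDD[LabeledPoint] =
>       labels.join(trainingData).map { case (_, (label, features)) =>
>         LabeledPoint(label, features)
>       }
>     LogisticRegressionWithSGD.train(currentRDD, 100)   // 100 iterations
>   }
>   // Run the independent trainings concurrently from the driver.
>   allResponses.toParArray.map(trainOne).seq
> }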
> But I start to see messages like
> 14/06/20 10:10:26 WARN scheduler.TaskSetManager: Lost TID 4193 (task 363.0:4)
> 14/06/20 10:10:27 WARN scheduler.TaskSetManager: Loss was due to fetch failure from null
> and finally
> 14/06/20 10:10:26 ERROR scheduler.TaskSetManager: Task 363.0:4 failed 4 times; aborting job
>
> Then
> 14/06/20 10:10:26 ERROR scheduler.DAGSchedulerActorSupervisor: eventProcesserActor failed due to the error null; shutting down SparkContext
> 14/06/20 10:10:26 ERROR actor.OneForOneStrategy:
> java.lang.UnsupportedOperationException
>         at org.apache.spark.scheduler.SchedulerBackend$class.killTask(SchedulerBackend.scala:32)
>         at org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend.killTask(MesosSchedulerBackend.scala:41)
>         at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3$$anonfun$apply$1.apply$mcVJ$sp(TaskSchedulerImpl.scala:185)
>
>
> This doesn't happen when I don't use toParArray. I read that Spark is
> thread-safe, but I seem to be running into problems. Am I doing something
> wrong?
>
> Kyle
>
>
>
> On Thu, Jun 19, 2014 at 11:21 AM, Kyle Ellrott <kellr...@soe.ucsc.edu>
> wrote:
>
>>
>> I'm working on a problem learning several different sets of responses
>> against the same set of training features. Right now I've written the
>> program to cycle through all of the different label sets, attach each one
>> to the training data, and run LogisticRegressionWithSGD on each of them, i.e.
>>
>> foreach curResponseSet in allResponses:
>>      currentRDD : RDD[LabeledPoint] = curResponseSet joined with trainingData
>>      LogisticRegressionWithSGD.train(currentRDD)
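>>
>> In real (simplified) Scala that's roughly the following; the Long key, the
>> (id, label)/(id, features) layout, and the 100 iterations are placeholder
>> assumptions rather than my exact code:
>>
>> import org.apache.spark.SparkContext._             // pair-RDD implicits (join) on Spark 1.x
>> import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithSGD}
>> import org.apache.spark.mllib.linalg.Vector
>> import org.apache.spark.mllib.regression.LabeledPoint
>> import org.apache.spark.rdd.RDD
>>
>> def trainAll(
>>     trainingData: RDD[(Long, Vector)],             // shared features, keyed by example id
>>     allResponses: Seq[RDD[(Long, Double)]]         // one label set per response
>> ): Seq[LogisticRegressionModel] =
>>   allResponses.map { curResponseSet =>
>>     // Attach this response's labels to the shared features...
>>     val currentRDD: RDD[LabeledPoint] =
>>       curResponseSet.join(trainingData).map { case (_, (label, features)) =>
>>         LabeledPoint(label, features)
>>       }
>>     // ...and train one model per response set, one after another.
>>     LogisticRegressionWithSGD.train(currentRDD, 100)   // 100 iterations
>>   }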
>>
>>
>> Each of the training runs is independent. It seems like I should be able
>> to parallelize them as well.
>> Is there a better way to do this?
>>
>>
>> Kyle
>>
>
>
