It looks like I was running into https://issues.apache.org/jira/browse/SPARK-2204. The issue went away when I switched to spark.mesos.coarse.
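For anyone who hits the same trace, the change is just a configuration flag set before the SparkContext is created. A minimal sketch (the master URL and app name are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // Coarse-grained mode uses CoarseMesosSchedulerBackend instead of the
    // fine-grained MesosSchedulerBackend whose killTask shows up in the
    // trace below, and keeps long-lived executors on each Mesos slave.
    val conf = new SparkConf()
      .setMaster("mesos://host:5050")     // illustrative master URL
      .setAppName("parallel-regressions") // illustrative app name
      .set("spark.mesos.coarse", "true")
    val sc = new SparkContext(conf)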
Kyle

On Fri, Jun 20, 2014 at 10:36 AM, Kyle Ellrott <kellr...@soe.ucsc.edu> wrote:

> I've tried to parallelize the separate regressions using
>
>     allResponses.toParArray.map( x => do logistic regression against labels in x )
>
> But I start to see messages like
>
>     14/06/20 10:10:26 WARN scheduler.TaskSetManager: Lost TID 4193 (task 363.0:4)
>     14/06/20 10:10:27 WARN scheduler.TaskSetManager: Loss was due to fetch failure from null
>
> and finally
>
>     14/06/20 10:10:26 ERROR scheduler.TaskSetManager: Task 363.0:4 failed 4 times; aborting job
>
> Then
>
>     14/06/20 10:10:26 ERROR scheduler.DAGSchedulerActorSupervisor: eventProcesserActor failed due to the error null; shutting down SparkContext
>     14/06/20 10:10:26 ERROR actor.OneForOneStrategy: java.lang.UnsupportedOperationException
>         at org.apache.spark.scheduler.SchedulerBackend$class.killTask(SchedulerBackend.scala:32)
>         at org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend.killTask(MesosSchedulerBackend.scala:41)
>         at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3$$anonfun$apply$1.apply$mcVJ$sp(TaskSchedulerImpl.scala:185)
>
> This doesn't happen when I don't use toParArray. I read that Spark was
> thread-safe, but I seem to be running into problems. Am I doing something
> wrong?
>
> Kyle
>
> On Thu, Jun 19, 2014 at 11:21 AM, Kyle Ellrott <kellr...@soe.ucsc.edu> wrote:
>
>> I'm working on a problem learning several different sets of responses
>> against the same set of training features. Right now I've written the
>> program to cycle through all of the different label sets, attach each one
>> to the training data, and run LogisticRegressionWithSGD on it, i.e.
>>
>>     foreach curResponseSet in allResponses:
>>         currentRDD : RDD[LabeledPoint] = curResponseSet joined with trainingData
>>         LogisticRegressionWithSGD.train(currentRDD)
>>
>> Each of the different training runs is independent, so it seems like I
>> should be able to parallelize them as well.
>> Is there a better way to do this?
>>
>> Kyle
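For completeness, here is roughly what the parallel version from the thread looks like as real code. This is only a sketch: the shapes of trainingData and allResponses, the key and label types, and the iteration count are assumptions based on the description above, not anything from the actual program.

    import scala.collection.parallel._     // provides .toParArray

    import org.apache.spark.SparkContext._ // pair-RDD implicits (join) in Spark 1.x
    import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.rdd.RDD

    // Assumed shapes: features keyed by sample ID, one label set per response.
    def trainAll(trainingData: RDD[(Long, Vector)],
                 allResponses: Seq[(String, RDD[(Long, Double)])]) = {
      allResponses.toParArray.map { case (responseId, labels) =>
        // Attach each label set to the shared features to build LabeledPoints.
        val currentRDD: RDD[LabeledPoint] =
          labels.join(trainingData).map { case (_, (label, features)) =>
            LabeledPoint(label, features)
          }
        // Independent jobs submitted concurrently from several driver
        // threads against the single shared SparkContext.
        responseId -> LogisticRegressionWithSGD.train(currentRDD, 100)
      }.toList
    }

Per the thread, this pattern is what triggered the failures under the fine-grained Mesos backend; with spark.mesos.coarse set (see the snippet at the top), the concurrent submissions ran without the killTask errors.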