I'm working on a problem that involves learning several different sets of responses against the same set of training features. Right now I've written the program to cycle through all of the different label sets, attach each one to the training data, and run LogisticRegressionWithSGD on the result, i.e.:

    foreach curResponseSet in allResponses:
        currentRDD : RDD[LabeledPoint] = curResponseSet joined with trainingData
        LogisticRegressionWithSGD.train(currentRDD)

Each of the training runs is independent, so it seems like I should be able to parallelize them as well. Is there a better way to do this?

Kyle
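
P.S. For concreteness, the sequential loop described above might look roughly like the Scala sketch below. It is only an illustration under stated assumptions: the helper name trainAll, the Long example id used as the join key, the argument names, and the numIterations default are all hypothetical, and it assumes a Spark 1.x MLlib where LogisticRegressionWithSGD.train(input, numIterations) is available.

```scala
import org.apache.spark.SparkContext._ // pair-RDD implicits (required for join on older Spark 1.x)
import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithSGD}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical helper: assumes features and labels are both keyed by a Long example id.
def trainAll(
    trainingData: RDD[(Long, Vector)],      // (id, features), shared by every run
    allResponses: Seq[RDD[(Long, Double)]], // one (id, label) RDD per response set
    numIterations: Int = 100                // assumed value, not from the original post
): Seq[LogisticRegressionModel] = {
  // The same features are reused by every run, so mark them to be kept in memory.
  trainingData.cache()

  // One model per response set, trained one after another on the driver.
  allResponses.map { curResponseSet =>
    // Attach the current labels to the shared features by joining on the id.
    val currentRDD: RDD[LabeledPoint] =
      curResponseSet.join(trainingData).map {
        case (_, (label, features)) => LabeledPoint(label, features)
      }
    LogisticRegressionWithSGD.train(currentRDD, numIterations)
  }
}
```

Each call to train here blocks until that model is finished, which is why the runs currently execute one at a time even though they do not depend on each other.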