Hi Denys, I don't see any issue in your Python code, so maybe there is a bug in the Python wrapper. If it's in Scala, I think it should work. BTW, LogisticRegressionWithLBFGS does the standardization internally, so you don't need to do it yourself. It's worth giving it a try!
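For reference, "standardization" here just means shifting each feature column to zero mean and scaling it to unit standard deviation, which is what StandardScaler(withMean=True, withStd=True) computes. A minimal plain-Python sketch of that arithmetic (not Spark code; MLlib uses the sample standard deviation, i.e. dividing by n - 1):

```python
import math

def standardize(rows):
    """Standardize each column of a list of feature rows: (x - mean) / std."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    # Sample standard deviation (divide by n - 1), as MLlib's StandardScaler does.
    stds = [math.sqrt(sum((r[d] - means[d]) ** 2 for r in rows) / (n - 1))
            for d in range(dims)]
    return [[(r[d] - means[d]) / stds[d] for d in range(dims)] for r in rows]

scaled = standardize([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
# Each column of `scaled` now has mean 0 and unit sample standard deviation.
```

Because the LBFGS trainer applies this transformation to the training data itself, scaling the LabeledPoints beforehand is redundant.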
Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com

On Tue, Apr 21, 2015 at 1:00 AM, Denys Kozyr <dko...@gmail.com> wrote:
> Hi!
>
> I want to normalize features before training logistic regression. I set up the scaler:
>
> scaler2 = StandardScaler(withMean=True, withStd=True).fit(features)
>
> and apply it to a dataset:
>
> scaledData = dataset.map(lambda x: LabeledPoint(x.label,
>     scaler2.transform(Vectors.dense(x.features.toArray()))))
>
> but I can't work with scaledData (can't output it or train a regression
> on it); I get an error:
>
> Exception: It appears that you are attempting to reference SparkContext from
> a broadcast variable, action, or transformation. SparkContext can only be used
> on the driver, not in code that it run on workers. For more information, see
> SPARK-5063.
>
> Is this the correct code for normalization? Why doesn't it work?
> Any advice is welcome.
> Thanks.
>
> Full code:
> https://gist.github.com/dkozyr/d31551a3ebed0ee17772
>
> Console output:
> https://gist.github.com/dkozyr/199f0d4f44cf522f9453
>
> Denys
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
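A note on the SPARK-5063 error itself: the PySpark StandardScalerModel wraps a JVM object, so calling scaler2.transform inside a map lambda tries to ship a SparkContext reference to the workers. A common workaround (a sketch against the same RDD-based MLlib API used in the question; dataset and scaler2 are the variables from the code above, and this would need a running SparkContext to try) is to transform the whole features RDD on the driver and zip the labels back on:

    from pyspark.mllib.regression import LabeledPoint

    labels = dataset.map(lambda lp: lp.label)
    features = dataset.map(lambda lp: lp.features)

    # StandardScalerModel.transform accepts an RDD, so this call happens on
    # the driver and the JVM-backed model is never captured in a closure.
    scaledData = labels.zip(scaler2.transform(features)).map(
        lambda pair: LabeledPoint(pair[0], pair[1]))

This keeps the per-element work (the zip and the LabeledPoint construction) in plain Python on the workers, while the model call stays on the driver.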