Hi Denys,

I don't see any issue in your Python code, so maybe there is a bug in
the Python wrapper. If it's in Scala, I think it should work. BTW,
LogisticRegressionWithLBFGS does the standardization internally, so you
don't need to do it yourself. It's worth giving it a try!
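For reference, here is a minimal pure-Python sketch of what
StandardScaler(withMean=True, withStd=True) computes (no Spark needed; the
helper names are my own, and I'm assuming MLlib's convention of using the
sample standard deviation):

```python
import math

def fit_standard_scaler(features):
    """Per-column mean and sample standard deviation over a list of feature vectors."""
    n = len(features)
    dim = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(dim)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in features) / (n - 1))
            for j in range(dim)]
    return means, stds

def transform(row, means, stds):
    """Center and scale one feature vector; map a zero-variance column to 0."""
    return [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]

features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
means, stds = fit_standard_scaler(features)
scaled = [transform(row, means, stds) for row in features]
# means == [2.0, 20.0], stds == [1.0, 10.0]
# scaled == [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
```

As for the exception itself: if I remember right, the PySpark
StandardScalerModel also accepts a whole RDD of vectors, so transforming the
features RDD on the driver and then zipping the labels back may avoid
referencing the model wrapper inside the map closure.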

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com


On Tue, Apr 21, 2015 at 1:00 AM, Denys Kozyr <dko...@gmail.com> wrote:
> Hi!
>
> I want to normalize features before training logistic regression. I set up the scaler:
>
> scaler2 = StandardScaler(withMean=True, withStd=True).fit(features)
>
> and apply it to a dataset:
>
> scaledData = dataset.map(lambda x: LabeledPoint(x.label,
> scaler2.transform(Vectors.dense(x.features.toArray() ))))
>
> but I can't work with scaledData (can't output it or train regression
> on it), got an error:
>
> Exception: It appears that you are attempting to reference SparkContext from
> a broadcast variable, action, or transformation. SparkContext can only be used
> on the driver, not in code that it run on workers. For more information, see
> SPARK-5063.
>
> Is this the correct way to do normalization? Why doesn't it work?
> Any advice is welcome.
> Thanks.
>
> Full code:
> https://gist.github.com/dkozyr/d31551a3ebed0ee17772
>
> Console output:
> https://gist.github.com/dkozyr/199f0d4f44cf522f9453
>
> Denys
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
