OK, it's not class imbalance. Yes, 100 iterations.
My other guess is that the stepSize of 1 is way too big for your data.
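To see why, here's a minimal pure-Scala sketch of a single hinge-loss SGD step (this is an illustration of the idea, not MLlib's actual implementation): the size of each weight update scales directly with stepSize, so on un-normalized features a step size of 1 can overshoot badly.

```scala
object HingeSgdSketch {
  // One SGD step on the hinge loss for a linear SVM with labels +1 / -1.
  // Sketch only: when y * (w . x) < 1 the point is inside the margin,
  // so the subgradient is (regParam * w - y * x); otherwise only the
  // regularization term contributes.
  def step(w: Array[Double], x: Array[Double], y: Double,
           stepSize: Double, regParam: Double): Array[Double] = {
    val margin = y * w.zip(x).map { case (wi, xi) => wi * xi }.sum
    w.indices.map { i =>
      val grad = regParam * w(i) + (if (margin < 1) -y * x(i) else 0.0)
      w(i) - stepSize * grad
    }.toArray
  }
}
```

With stepSize = 1, every in-margin point moves the weights by the full feature vector; you can lower it on the real model via svm.optimizer.setStepSize(...).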

I'd suggest you look at the weights / intercept of the resulting model to
see if they make any sense.

You can call clearThreshold on the model, and then it will 'predict' the
SVM margin instead of a class. That could at least tell you whether it's
predicting the same value over and over or just lots of very big values.
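For a concrete picture of what clearThreshold changes, here's a small sketch (the same idea as the model's predict, not the MLlib source itself): with a threshold set, predict compares the raw margin w . x + intercept against it and returns 0 or 1; with the threshold cleared, it returns the raw margin directly.

```scala
object MarginSketch {
  // Raw SVM score for one point: w . x + intercept.
  def margin(w: Array[Double], intercept: Double, x: Array[Double]): Double =
    w.zip(x).map { case (wi, xi) => wi * xi }.sum + intercept

  // Some(t): classify by comparing the margin to the threshold t.
  // None (threshold cleared): return the raw margin itself.
  def predict(w: Array[Double], intercept: Double, x: Array[Double],
              threshold: Option[Double]): Double = {
    val m = margin(w, intercept, x)
    threshold match {
      case Some(t) => if (m > t) 1.0 else 0.0
      case None    => m
    }
  }
}
```

If the cleared-threshold margins are all large and positive, the weights have drifted far to one side, which points back at the step size or regularization rather than the data.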

On Wed, Nov 12, 2014 at 6:02 PM, Caron <caron.big...@gmail.com> wrote:

> Sean,
>
> Thanks a lot for your reply!
>
> A few follow up questions:
> 1. numIterations should be 100, not 100*trainingSetSize, right?
> 2. My training set has 90k positive data points (with label 1) and 60k
> negative data points (with label 0).
> I set my numIterations to 100 as default. I still got the same prediction
> result: everything was predicted as label 1.
> And I'm sure my dataset is linearly separable, because it has been trained
> successfully on other frameworks like scikit-learn.
>
> // code
> val numIterations = 100
> val regParam = 1
> val svm = new SVMWithSGD()
> svm.optimizer.setNumIterations(numIterations).setRegParam(regParam)
> svm.setIntercept(true)
> val model = svm.run(training)
>
> -----
> Thanks!
> -Caron
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/SVMWithSGD-default-threshold-tp18645p18741.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
