I'm trying to train a linear classifier on a dataset, using SVMWithSGD.
After training with the default options:

    val model = SVMWithSGD.train(training, numIterations)

I don't think the SVM has classified the data correctly.

My observations:
1. The intercept is always 0.0.
2. The predicted labels are all 1's; there are no 0's.
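
A rough sketch of how these two observations can be checked, assuming the Spark 1.x MLlib API and that `training` is an RDD[LabeledPoint]. The helper name `inspectDefaults` is made up for illustration and is not part of MLlib.

```scala
import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical helper: trains with the default options and prints the two
// quantities from the observations above.
def inspectDefaults(training: RDD[LabeledPoint], numIterations: Int): SVMModel = {
  val model = SVMWithSGD.train(training, numIterations)

  // Observation 1: the learned intercept.
  println(s"intercept = ${model.intercept}")

  // Observation 2: how many training points receive each predicted label.
  val labelCounts = model.predict(training.map(_.features)).countByValue()
  println(s"predicted label counts = $labelCounts")

  model
}
```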

My questions are:
1. What should numIterations be? I tried setting it to
10*trainingSetSize; is that sufficient?
2. Since MLlib only accepts data with labels "0" or "1", shouldn't the
default threshold for SVMWithSGD be 0.5 instead of 0.0?
3. It seems counter-intuitive to me that the default intercept is 0.0,
meaning the separating line has to pass through the origin.
4. Does Spark MLlib provide an API to do grid search like scikit-learn does?
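
To make questions 2 and 3 concrete, here is a minimal sketch in plain Scala (no Spark) of the decision rule as I understand it: predict 1 when the margin w.x + b exceeds the threshold, otherwise 0. The rule is my assumption about how prediction works, and the weights and points are made-up values, not taken from my data.

```scala
// Assumed decision rule: predict 1.0 when w.x + b > threshold, else 0.0.
def predict(w: Array[Double], b: Double, threshold: Double, x: Array[Double]): Double = {
  val margin = w.zip(x).map { case (wi, xi) => wi * xi }.sum + b
  if (margin > threshold) 1.0 else 0.0
}

// Made-up 2-D weights and points, for illustration only.
val w = Array(1.0, 1.0)
val points = Seq(Array(1.0, 2.0), Array(3.0, 0.5), Array(0.2, 0.1))

// Intercept 0.0, threshold 0.0: the boundary passes through the origin,
// so all three points (positive coordinates) fall on the 1 side.
val noIntercept = points.map(x => predict(w, 0.0, 0.0, x))

// A non-zero intercept shifts the boundary away from the origin,
// which puts the third point on the 0 side.
val withIntercept = points.map(x => predict(w, -2.0, 0.0, x))
```

Under this rule the threshold applies to the margin rather than to the 0/1 label, which is part of what I'm asking about in question 2.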

Any help would be greatly appreciated!

Thanks!
-Caron