SVMWithSGD.run source code
I'm looking at the source code of SVM.scala and trying to find the implementation of the following function:

    def train(...): SVMModel = {
      new SVMWithSGD( ... ).run(input, initialWeights)
    }

I'm wondering where I can find the code for SVMWithSGD().run()? I'd like to see the implementation of run().

Thanks!
-Caron

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SVMWithSGD-run-source-code-tp20671.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
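For what it's worth, run() is not in SVM.scala at all: SVMWithSGD extends GeneralizedLinearAlgorithm, and run(input, initialWeights) is defined in GeneralizedLinearAlgorithm.scala (package org.apache.spark.mllib.regression). A toy sketch of that template-method layout, with simplified names and stand-in logic rather than the real MLlib API:

```scala
// Toy sketch (NOT the real MLlib API) of how run() is shared:
// the base class drives the optimization loop, the subclass
// supplies the optimizer and the model constructor.
abstract class GLASketch {
  // run() lives in the base class, like in GeneralizedLinearAlgorithm.
  def run(input: Seq[Double], initialWeight: Double): Double = {
    val w = optimize(input, initialWeight)
    createModel(w)
  }
  protected def optimize(input: Seq[Double], w0: Double): Double
  protected def createModel(w: Double): Double
}

class SVMWithSGDSketch extends GLASketch {
  // Stand-in for the SGD optimizer; not MLlib's actual update rule.
  protected def optimize(input: Seq[Double], w0: Double): Double =
    input.foldLeft(w0)((w, x) => w + 0.1 * x)
  protected def createModel(w: Double): Double = w
}
```

So looking in SVM.scala only shows train() and createModel(); the run() body is one level up in the class hierarchy.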
SOLVED -- Re: scopt.OptionParser
Update: The issue in my previous post is solved: I had to rename the sbt file from project_name.sbt to build.sbt.

Thanks!
-Caron

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/scopt-OptionParser-tp8436p20581.html
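For reference, a minimal build.sbt sketch with a scopt dependency; the project name, Scala version, and scopt version below are placeholders (not from the original post), so adjust them to your own setup:

```scala
// build.sbt -- must sit at the project root with exactly this name
name := "my-spark-app"                // placeholder project name
scalaVersion := "2.10.4"              // placeholder; match your Spark build
libraryDependencies += "com.github.scopt" %% "scopt" % "3.2.0" // placeholder version
```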
Re: SVMWithSGD default threshold
Sean,

Thanks a lot for your reply! A few follow-up questions:

1. numIterations should be 100, not 100 * trainingSetSize, right?

2. My training set has 90k positive data points (label 1) and 60k negative data points (label 0). I set numIterations to the default of 100, and I still get the same prediction result: everything is predicted as label 1. And I'm sure my dataset is linearly separable, because it trains correctly on other frameworks such as scikit-learn.

    // code
    val numIterations = 100
    val regParam = 1
    val svm = new SVMWithSGD()
    svm.optimizer.setNumIterations(numIterations).setRegParam(regParam)
    svm.setIntercept(true)
    val model = svm.run(training)

Thanks!
-Caron

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SVMWithSGD-default-threshold-tp18645p18741.html
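One thing worth checking in the snippet above is regParam = 1, which is a fairly heavy L2 penalty. This toy 1-D hinge-loss SGD (a self-contained sketch, not MLlib's actual optimizer) illustrates how a large regParam drags the learned weight toward zero, which can leave the intercept dominating so that every prediction lands on one side:

```scala
// Toy 1-D hinge-loss SGD; a sketch, NOT MLlib's optimizer.
// Labels are {0, 1} as in MLlib and are mapped to {-1, +1} internally.
def toySgd(data: Seq[(Double, Double)], regParam: Double,
           steps: Int = 100, lr: Double = 0.1): Double = {
  var w = 0.0
  for (_ <- 1 to steps; (x, y) <- data) {
    val yy = if (y > 0.5) 1.0 else -1.0
    val grad = if (yy * w * x < 1.0) -yy * x else 0.0 // hinge-loss subgradient
    w -= lr * (grad + regParam * w)                   // L2-regularized update
  }
  w
}
```

On a trivially separable set like Seq((1.0, 1.0), (-1.0, 0.0)), the weight learned with regParam = 10 stays far smaller in magnitude than the one learned with regParam = 0.01.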
SVMWithSGD default threshold
I'm hoping to fit a linear classifier on a dataset, and I'm using SVMWithSGD to train it. After running with the default options:

    val model = SVMWithSGD.train(training, numIterations)

I don't think the SVM has classified correctly. My observations:

1. The intercept is always 0.0.
2. The predicted labels are ALL 1's, no 0's.

My questions are:

1. What should numIterations be? I tried setting it to 10 * trainingSetSize; is that sufficient?
2. Since MLlib only accepts data with labels 0 or 1, shouldn't the default threshold for SVMWithSGD be 0.5 instead of 0.0?
3. It seems counter-intuitive that the default intercept is 0.0, which forces the separating line through the origin.
4. Does Spark MLlib provide an API for grid search, the way scikit-learn does?

Any help would be greatly appreciated!

Thanks!
-Caron

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SVMWithSGD-default-threshold-tp18645.html
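On question 2: an SVM's raw output is a margin (w · x + intercept), not a probability, so the natural cut point is 0.0 rather than 0.5; 0.5 is the right default for probabilistic outputs like logistic regression. The decision rule is essentially a sign test against the threshold, sketched here in self-contained form (simplified, not the actual SVMModel code):

```scala
// Sketch of an SVM decision rule: label 1 iff the margin exceeds
// the threshold. Simplified; not MLlib's SVMModel.predict itself.
def svmPredict(weights: Array[Double], intercept: Double,
               x: Array[Double], threshold: Double = 0.0): Double = {
  val margin = weights.zip(x).map { case (w, v) => w * v }.sum + intercept
  if (margin > threshold) 1.0 else 0.0
}
```

With weights (2.0, -1.0) and intercept 0.0, the point (1.0, 1.0) has margin 1.0 and gets label 1, while (0.0, 2.0) has margin -2.0 and gets label 0. Raising the threshold to 0.5 would not map margins onto {0, 1} labels any better; it would just shift the cut along the margin axis.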