SVMWithSGD.run source code

2014-12-12 Thread Caron
I'm looking at the source code of SVM.scala and trying to find the location
of the source code of the following function:

def train(...): SVMModel = { new SVMWithSGD(...).run(input, initialWeights) }

I'm wondering where I can find the source of SVMWithSGD's run() method.
I'd like to see the implementation of run().
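(For anyone else hitting this: in the Spark 1.x source layout, SVMWithSGD extends GeneralizedLinearAlgorithm, and run(input, initialWeights) is defined on that base class in GeneralizedLinearAlgorithm.scala under the mllib regression package, not in SVM.scala. A minimal illustrative mock of the pattern — the names below with a "Like" suffix are NOT Spark classes, just a sketch of the inheritance structure:)

```scala
// Illustrative mock (not Spark source): shows why run() is absent from SVM.scala.
// run() is a template method on the base class; the SVM subclass only supplies
// the model-construction step.
abstract class GeneralizedLinearAlgorithmLike[M] {
  // In real MLlib this is where the iterative optimization happens.
  def run(input: Seq[(Double, Array[Double])], initialWeights: Array[Double]): M =
    createModel(initialWeights)

  protected def createModel(weights: Array[Double]): M
}

class SVMWithSGDLike extends GeneralizedLinearAlgorithmLike[String] {
  // Subclass contributes only model creation; run() is inherited.
  protected def createModel(weights: Array[Double]): String =
    s"SVMModel(weights=${weights.mkString(",")})"
}
```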

Thanks!

Caron

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/SVMWithSGD-run-source-code-tp20671.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



SOLVED -- Re: scopt.OptionParser

2014-12-08 Thread Caron
Update:

The issue in my previous post was solved:

I had to change the sbt file name from project_name.sbt to build.sbt.
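(For reference, a minimal build.sbt that pulls in scopt alongside Spark; the version numbers below are illustrative examples from that era, so adjust them to your setup:)

```scala
// Example build.sbt (versions are illustrative; adjust to your environment)
name := "my-spark-app"

version := "0.1"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-mllib" % "1.1.1" % "provided",
  "com.github.scopt" %% "scopt" % "3.2.0"
)
```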
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/scopt-OptionParser-tp8436p20581.html




Re: SVMWithSGD default threshold

2014-11-12 Thread Caron
Sean,

Thanks a lot for your reply!

A few follow up questions:
1. numIterations should be 100, not 100*trainingSetSize, right?
2. My training set has 90k positive data points (with label 1) and 60k
negative data points (with label 0).
I set numIterations to 100 as the default. I still got the same prediction
result: every point was predicted as label 1.
And I'm sure my dataset is linearly separable, because other frameworks like
scikit-learn classify it correctly.

// code
val numIterations = 100
val regParam = 1.0          // note: this is fairly strong regularization
val svm = new SVMWithSGD()
svm.optimizer
  .setNumIterations(numIterations)
  .setRegParam(regParam)
svm.setIntercept(true)
val model = svm.run(training)
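(One thing worth checking: SVMWithSGD scores points by the raw margin w . x + b and compares it to the threshold, so the default threshold of 0.0 is a cutoff on the margin's sign, not a 0.5 probability cutoff. With regParam = 1.0 the weights can be shrunk so far that every margin lands on the same side, which would explain the all-1 predictions; trying a much smaller regParam such as 0.01 is a common first fix. A pure-Scala sketch of the decision rule, not Spark's actual code:)

```scala
// Sketch of the SVM decision rule at predict time:
// margin = w . x + b; predict 1.0 when margin > threshold, else 0.0.
// The default threshold is 0.0 because the margin is a raw score, not a probability.
def predict(weights: Array[Double], intercept: Double, threshold: Double)
           (x: Array[Double]): Double = {
  val margin = weights.zip(x).map { case (w, xi) => w * xi }.sum + intercept
  if (margin > threshold) 1.0 else 0.0
}
```

If heavy regularization shrinks the weights toward zero while the intercept stays positive, the margin is positive for every point — exactly the all-1 symptom described above.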

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/SVMWithSGD-default-threshold-tp18645p18741.html




SVMWithSGD default threshold

2014-11-11 Thread Caron
I'm hoping to get a linear classifier on a dataset.
I'm using SVMWithSGD to train the data.
After training with the default options,
val model = SVMWithSGD.train(training, numIterations),
I don't think the SVM has classified the data correctly.

My observations:
1. the intercept is always 0.0
2. the predicted labels are ALL 1's, no 0's.

My questions are:
1. what should the numIterations be? I tried to set it to
10*trainingSetSize, is that sufficient?
2. since MLlib only accepts data with labels 0 or 1, shouldn't the
default threshold for SVMWithSGD be 0.5 instead of 0.0?
3. It seems counter-intuitive to me to have the default intercept be 0.0,
meaning the line has to go through the origin.
4. Does Spark MLlib provide an API to do grid search like scikit-learn does?
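(On question 4: the RDD-based MLlib API in this era had no built-in grid search, so the usual approach is a manual loop over parameter combinations, scoring each on a validation set. A generic pure-Scala sketch of that pattern — the score function below is a stand-in for training plus evaluation, e.g. area under ROC from MLlib's BinaryClassificationMetrics:)

```scala
// Manual grid-search sketch: enumerate (regParam, numIterations) pairs
// and keep the combination with the best validation score.
// `score` stands in for: train a model with these parameters, then
// evaluate it on held-out data.
def gridSearch(regParams: Seq[Double], numIters: Seq[Int])
              (score: (Double, Int) => Double): (Double, Int) = {
  val candidates = for (r <- regParams; n <- numIters) yield (r, n)
  candidates.maxBy { case (r, n) => score(r, n) }
}
```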

Any help would be greatly appreciated!

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/SVMWithSGD-default-threshold-tp18645.html
