Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-31 Thread DB Tsai
Alternatively, I will give a talk at Spark Summit about the elastic-net implementation of logistic regression (LOR) and linear regression (LIR) and the interpretation of those models: https://spark-summit.org/2015/events/large-scale-lasso-and-elastic-net-regularized-generalized-linear-models/ You may attend or watch online. Sincerely, DB

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-30 Thread ayan guha
I hope they will come up with 1.4 before Spark Summit in mid June. On 31 May 2015 10:07, Joseph Bradley jos...@databricks.com wrote: Spark 1.4 should be available next month, but I'm not sure about the exact date. Your interpretation of high lambda is reasonable. High lambda is really

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-30 Thread Joseph Bradley
Spark 1.4 should be available next month, but I'm not sure about the exact date. Your interpretation of high lambda is reasonable. What counts as a high lambda is really data-dependent. lambda is the same as regParam in Spark, which is available in all recent Spark versions. On Fri, May 29, 2015 at 5:35 AM, mélanie
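
To illustrate why a larger lambda leaves fewer non-zero coefficients, here is a minimal pure-Scala sketch of soft-thresholding, the shrinkage step commonly associated with an L1 penalty. It is not Spark's implementation; the object name, function names, and weight values are made up for illustration.

```scala
object L1ShrinkageSketch {
  // Soft-thresholding: pull a weight toward zero by lambda,
  // and set it to exactly zero if its magnitude is at most lambda.
  def softThreshold(w: Double, lambda: Double): Double =
    if (w > lambda) w - lambda
    else if (w < -lambda) w + lambda
    else 0.0

  // Apply the shrinkage to every weight.
  def shrink(weights: Array[Double], lambda: Double): Array[Double] =
    weights.map(softThreshold(_, lambda))

  // Number of variables that survive the shrinkage.
  def countNonZero(weights: Array[Double]): Int =
    weights.count(_ != 0.0)

  def main(args: Array[String]): Unit = {
    // Hypothetical weights, chosen only to show the effect.
    val weights = Array(2.5, -0.3, 0.05, -1.2, 0.4)
    for (lambda <- Seq(0.0, 0.1, 0.5, 2.0)) {
      val shrunk = shrink(weights, lambda)
      println(s"lambda=$lambda nonzero=${countNonZero(shrunk)} weights=${shrunk.mkString("[", ", ", "]")}")
    }
  }
}
```

How large lambda has to be before a given variable drops out depends on the scale of the data, which is why a "high" value is data-dependent.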

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-29 Thread mélanie gallois
When exactly will Spark 1.4 be available? In reply to "Model selection can be achieved through high lambda resulting lots of zero in the coefficients": Do you mean that setting a high lambda as a parameter of the logistic regression keeps only a few significant variables and deletes the others

MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-22 Thread SparknewUser
I am new to MLlib and to Spark (I use Scala). I'm trying to understand how LogisticRegressionWithLBFGS and LogisticRegressionWithSGD work. I usually use R to do logistic regressions, but now I do it on Spark to be able to analyze Big Data. The model only returns weights and intercept. My problem

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-22 Thread DB Tsai
In Spark 1.4, logistic regression with elastic net is implemented in the ML pipeline framework. Model selection can be achieved through a high lambda, which results in many zero coefficients. Sincerely, DB Tsai --- Blog: https://www.dbtsai.com On
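
A minimal sketch of what this could look like with the Spark 1.4 `spark.ml` API, assuming a local Spark 1.4.x installation on the classpath. The toy data, the object name, the helper `nonZeroIndices`, and the chosen `regParam` value are assumptions for illustration only. Later Spark releases rename `weights` to `coefficients` and move `Vectors` to a different package, so adjust accordingly.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.mllib.linalg.Vectors

object ElasticNetSketch {
  // Indices of the variables whose coefficient is not exactly zero.
  def nonZeroIndices(weights: Array[Double]): Array[Int] =
    weights.zipWithIndex.filter(_._1 != 0.0).map(_._2)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("elastic-net-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Made-up toy data: (label, features).
    val training = sqlContext.createDataFrame(Seq(
      (1.0, Vectors.dense(2.0, 0.1, 1.3)),
      (0.0, Vectors.dense(-1.5, 0.2, 1.1)),
      (1.0, Vectors.dense(1.8, -0.1, 0.9)),
      (0.0, Vectors.dense(-2.1, 0.0, 1.2))
    )).toDF("label", "features")

    val lr = new LogisticRegression()
      .setMaxIter(100)
      .setRegParam(0.1)        // lambda: overall regularization strength (illustrative value)
      .setElasticNetParam(1.0) // 1.0 = pure L1 penalty, 0.0 = pure L2 penalty

    val model = lr.fit(training)

    // Spark 1.4 exposes the fitted coefficients as `weights`.
    val kept = nonZeroIndices(model.weights.toArray)
    println(s"weights=${model.weights} intercept=${model.intercept}")
    println(s"variables kept: ${kept.mkString(", ")}")

    sc.stop()
  }
}
```

Raising `regParam` while keeping `elasticNetParam` at 1.0 should push more coefficients to exactly zero. The variables that remain are the ones the penalized model retains, which is not the same thing as a significance test in R.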

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-22 Thread Joseph Bradley
If you want to select specific variable combinations by hand, then you will need to modify the dataset before passing it to the ML algorithm. The DataFrame API should make that easy to do. If you want to have an ML algorithm select variables automatically, then I would recommend using L1
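
For the first option above (choosing variables by hand), a minimal sketch using the DataFrame API and `VectorAssembler` from Spark 1.4 might look like the following. The column names, the toy rows, and the helper `selectFeatures` are hypothetical and only serve as illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.ml.feature.VectorAssembler

object ManualSelectionSketch {
  // Keep the label plus the hand-picked columns, packed into one "features" vector column.
  def selectFeatures(df: DataFrame, labelCol: String, keep: Array[String]): DataFrame = {
    val assembler = new VectorAssembler()
      .setInputCols(keep)
      .setOutputCol("features")
    assembler.transform(df).select(labelCol, "features")
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("manual-selection-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Made-up toy data with one column we decide to drop by hand.
    val raw = sqlContext.createDataFrame(Seq(
      (1.0, 34.0, 1.0, 0.2),
      (0.0, 51.0, 0.0, 0.9)
    )).toDF("label", "age", "smoker", "noise")

    // Only "age" and "smoker" are passed on to the ML algorithm.
    val selected = selectFeatures(raw, "label", Array("age", "smoker"))
    selected.show()

    sc.stop()
  }
}
```

The resulting DataFrame has the `label` and `features` columns that `spark.ml` estimators expect by default, so it can be passed directly to `fit`.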