In Spark 1.4, logistic regression with elastic-net regularization is
implemented in the ML pipeline framework. Model selection can be achieved
with a high regularization parameter (lambda), which drives many of the
coefficients to exactly zero, effectively removing those features from the
model.
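A minimal sketch of what this looks like (assuming Spark 1.4's spark.ml
API and a hypothetical DataFrame named `training` with "label" and
"features" columns):

```scala
import org.apache.spark.ml.classification.LogisticRegression

// Elastic-net logistic regression: regParam is lambda, and
// elasticNetParam mixes the penalties (1.0 = pure L1/lasso,
// 0.0 = pure L2/ridge).
val lr = new LogisticRegression()
  .setMaxIter(100)
  .setRegParam(0.3)          // higher lambda => more coefficients pushed to zero
  .setElasticNetParam(1.0)   // pure L1 for sparse feature selection

// `training` is a placeholder for your own DataFrame of labeled points.
val model = lr.fit(training)

// In the 1.4 API the fitted coefficients are exposed as `weights`;
// zero entries correspond to features eliminated by the L1 penalty.
println(model.weights)
println(model.intercept)
```

The regParam/elasticNetParam values above are illustrative only; in
practice you would tune them, e.g. with cross-validation.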

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com


On Fri, May 22, 2015 at 1:19 AM, SparknewUser
<melanie.galloi...@gmail.com> wrote:
> I am new to MLlib and Spark (I use Scala).
>
> I'm trying to understand how LogisticRegressionWithLBFGS and
> LogisticRegressionWithSGD work.
> I usually use R for logistic regression, but now I am doing it on Spark
> so that I can analyze big data.
>
> The model only returns weights and an intercept. My problem is that I have
> no information about which variables are significant and which variables I
> should delete to improve my model. I only have the confusion matrix and the
> AUC to evaluate the performance.
>
> Is there any way to get information about the variables I put in my model?
> How can I try different variable combinations? Do I have to modify the
> original dataset (e.g., delete one or several columns)?
> How are the weights calculated? Is there a correlation calculation with
> the variable of interest?
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/MLlib-how-to-get-the-best-model-with-only-the-most-significant-explanatory-variables-in-LogisticRegr-tp22993.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
