[ https://issues.apache.org/jira/browse/SPARK-17151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15429069#comment-15429069 ]
DB Tsai commented on SPARK-17151:
---------------------------------

[~sethah] I think it sort of makes sense that we allow users to specify the number of classes if they want instead of inferring from the data.

> Decide how to handle inferring number of classes in Multinomial logistic regression
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-17151
>                 URL: https://issues.apache.org/jira/browse/SPARK-17151
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, MLlib
>            Reporter: Seth Hendrickson
>            Priority: Minor
>
> This JIRA is to discuss how the number of label classes should be inferred in
> multinomial logistic regression. Currently, MLOR checks the dataframe
> metadata and if the number of classes is not specified then it uses the
> maximum value seen in the label column. If the labels are not properly
> indexed, then this can cause a large number of zero coefficients and
> potentially produce instabilities in model training.
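To make the trade-off concrete, here is a minimal sketch in plain Python of the resolution order under discussion: a user-supplied class count first, then the metadata, then inference from the label column. It is not Spark's implementation. The names `resolve_num_classes` and `unused_classes` are hypothetical, and the sketch assumes zero-based integer labels, so the inferred count is the maximum label plus one.

```python
from typing import List, Optional, Sequence


def resolve_num_classes(
    labels: Sequence[float],
    metadata_num_classes: Optional[int] = None,
    user_num_classes: Optional[int] = None,
) -> int:
    """Hypothetical helper: user value first, then metadata, then max label + 1."""
    if any(label < 0 or label != int(label) for label in labels):
        raise ValueError("labels must be non-negative integers")
    # Assumption: labels are zero-based indices, so the count is max + 1.
    inferred = int(max(labels)) + 1 if labels else 0
    if user_num_classes is not None:
        if user_num_classes < inferred:
            raise ValueError(
                f"numClasses={user_num_classes} is smaller than the "
                f"{inferred} classes implied by the labels"
            )
        return user_num_classes
    if metadata_num_classes is not None:
        return metadata_num_classes
    return inferred


def unused_classes(labels: Sequence[float], num_classes: int) -> List[int]:
    """Class indices that never occur in the labels; each one corresponds to
    a set of coefficients with no training data behind it."""
    seen = {int(label) for label in labels}
    return [k for k in range(num_classes) if k not in seen]


# Labels that were not indexed to a dense 0..K-1 range.
raw_labels = [0.0, 1.0, 1000.0]
k = resolve_num_classes(raw_labels)
print(k, len(unused_classes(raw_labels, k)))
```

With the unindexed labels above, inference yields 1001 classes of which 998 never appear, which is the situation the issue description warns about. Letting the caller pass the count explicitly, or indexing the labels first, avoids it.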