Michel Lemay created SPARK-23216:
------------------------------------

             Summary: Multiclass LogisticRegression could have methods like NCE, NEG, Hierarchical SoftMax, Blackout or IS
                 Key: SPARK-23216
                 URL: https://issues.apache.org/jira/browse/SPARK-23216
             Project: Spark
          Issue Type: Improvement
          Components: ML, MLlib
    Affects Versions: 2.2.1
            Reporter: Michel Lemay

When training a classifier with a large number of classes, training performance suffers. This is expected when using regular (log-)softmax methods to compute the loss, since the score of the current class has to be normalized by the sum of the scores of all classes, so the cost of every example grows with the number of classes.

I think it would be helpful to have approximate methods such as Hierarchical SoftMax, NCE (noise-contrastive estimation), NEG (negative sampling) or IS (importance sampling) to speed up training.

A paper comparing different methods for approximate normalization over all classes:
[http://web4.cs.ucl.ac.uk/staff/D.Barber/publications/AISTATS2017.pdf]
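
To make the cost argument concrete, here is a minimal sketch in plain Python (not Spark code, and not a proposed implementation). It contrasts the exact log-softmax loss, which reads all K class scores, with an importance-sampling estimate that reads only a handful of them. The function names, the uniform proposal distribution and the sampling with replacement are assumptions chosen for illustration; the methods discussed in the linked paper use more refined proposals and estimators.

```python
import math
import random


def full_log_softmax_loss(scores, target):
    """Exact negative log-likelihood; reads every class score, so O(K)."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target]


def sampled_log_softmax_loss(scores, target, num_samples, rng):
    """Importance-sampling estimate of the loss (hypothetical helper).

    Z = sum_k exp(s_k) is approximated by K * mean(exp(s_j)) over classes j
    drawn uniformly with replacement, so only num_samples scores are read.
    The estimate of Z is unbiased; the estimate of log Z is not.
    """
    k = len(scores)
    sampled = [rng.randrange(k) for _ in range(num_samples)]
    m = max(scores[j] for j in sampled)
    mean_exp = sum(math.exp(scores[j] - m) for j in sampled) / num_samples
    log_z_hat = m + math.log(k) + math.log(mean_exp)
    return log_z_hat - scores[target]


# Illustrative usage: 100,000 classes, but only 500 scores are read.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
exact = full_log_softmax_loss(scores, target=42)
approx = sampled_log_softmax_loss(scores, target=42, num_samples=500, rng=rng)
print(exact, approx)
```

The sampled version trades accuracy for speed: its cost depends on `num_samples` rather than on the number of classes, which is the property the approximate methods above are meant to provide.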