[ https://issues.apache.org/jira/browse/SYSTEMML-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15518637#comment-15518637 ]
Jeremy commented on SYSTEMML-700:
---------------------------------

Hi Matthias,

The solution I'm currently using in HydraR is to transform the labels from whatever values they are to 0, 1, 2, ... beforehand, and then transform them back to their original labels after the algorithm runs.

Currently the algorithm doesn't handle class values that don't start at 0 or 1, and doesn't handle non-contiguous integers, both of which can come up. For example, class labels 4, 5, 6 return 5 sets of coefficients (the correct number is 2), and class labels -1, 0, 1 return just one set of coefficients (the correct number is 2).

Handling frames with strings would be a really great user experience; internally, that could look like R's coercion. Both glmnet and scikit-learn handle string label arguments, but both APIs are weakly typed as well.

> Inflexible category labels for Multinomial Logistic Regression
> --------------------------------------------------------------
>
>                 Key: SYSTEMML-700
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-700
>             Project: SystemML
>          Issue Type: Bug
>          Components: Algorithms
>            Reporter: Jeremy
>            Priority: Minor
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> The Logistic Regression algorithm requires that category labels be labeled as
> 0 up to the number of classes-1. It should be able to handle any set of
> category labels provided by the user. B_out should have the appropriate size
> regardless of the values of the labels given, and the algorithm should also
> preserve the original labeling for the user.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
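
For illustration, the relabeling workaround described in the comment (map the labels to 0, 1, 2, ... before training, then map them back afterwards) could be sketched roughly as follows. This is a minimal Python sketch, not the HydraR or SystemML code; the helper names encode_labels and decode_labels are hypothetical, and it assumes the labels can be sorted against each other (all integers or all strings).

```python
def encode_labels(labels):
    """Map arbitrary labels to contiguous codes 0..K-1 (hypothetical helper).

    Returns the encoded labels and the sorted list of distinct original
    labels, which serves as the lookup table for decoding.
    """
    classes = sorted(set(labels))
    index = {label: code for code, label in enumerate(classes)}
    codes = [index[label] for label in labels]
    return codes, classes


def decode_labels(codes, classes):
    """Map codes 0..K-1 back to the original labels (hypothetical helper)."""
    return [classes[code] for code in codes]


# Labels that neither start at 0 nor 1: 4, 5, 6 become 0, 1, 2.
codes, classes = encode_labels([4, 5, 6, 4])

# With K distinct classes, the expected number of coefficient sets is K - 1.
expected_coefficient_sets = len(classes) - 1

# After the algorithm runs, predictions are mapped back to the original labels.
restored = decode_labels(codes, classes)
```

The same mapping works for string labels, since it relies only on the distinct values being sortable, not on their being integers.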