[ https://issues.apache.org/jira/browse/SPARK-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193815#comment-15193815 ]
Joseph K. Bradley commented on SPARK-13712:
-------------------------------------------

I don't think this is a high-priority item to add. It is much more expensive than one-vs-rest, and there are better methods we should add first. (Specifically, we should add a method based on error-correcting codes, since it will be cheaper and have guarantees essentially as good as one-vs-one.) I'll comment on the PR as well.

If you'd like, feel free to submit this as a Spark package, since I'm sure some people would find it useful. Thanks! I'll close this JIRA for now.

> Add OneVsOne to ML
> ------------------
>
>                 Key: SPARK-13712
>                 URL: https://issues.apache.org/jira/browse/SPARK-13712
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: zhengruifeng
>            Priority: Minor
>
> Another meta method for multi-class classification.
> Most classification algorithms were designed for balanced data.
> The OneVsRest method will generate K models on imbalanced data.
> OneVsOne will train K*(K-1)/2 models on balanced data.
> OneVsOne is less sensitive to the problems of imbalanced datasets, and can
> usually result in higher precision.
> But it is much more computationally expensive, although each model is
> trained on a much smaller dataset (2/K of the total).
> OneVsOne is implemented in the same way as OneVsRest:
> val classifier = new LogisticRegression()
> val ovo = new OneVsOne()
> ovo.setClassifier(classifier)
> val ovoModel = ovo.fit(data)
> val predictions = ovoModel.transform(data)
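
For a rough sense of the model counts discussed in the issue, here is a small plain-Scala sketch (no Spark required). It only restates the arithmetic from the description: K models for one-vs-rest, K*(K-1)/2 models for one-vs-one, and roughly 2/K of the data per one-vs-one model. The 2/K figure assumes perfectly balanced classes. The object and method names (`ReductionCost`, `oneVsRestModels`, `oneVsOneModels`, `oneVsOneFraction`) are hypothetical and are not part of Spark or of the proposed patch.

```scala
// Illustrative only: counts the binary models each reduction trains for K classes.
object ReductionCost {
  // One-vs-rest: one binary model per class.
  def oneVsRestModels(k: Int): Int = k

  // One-vs-one: one binary model per unordered pair of classes.
  def oneVsOneModels(k: Int): Int = k * (k - 1) / 2

  // Fraction of the data each one-vs-one model sees.
  // Assumption: classes are perfectly balanced, so each class holds 1/K of the rows.
  def oneVsOneFraction(k: Int): Double = 2.0 / k

  def main(args: Array[String]): Unit = {
    for (k <- Seq(3, 10, 100)) {
      val ovr = oneVsRestModels(k)
      val ovo = oneVsOneModels(k)
      val frac = oneVsOneFraction(k)
      println(s"K=$k: one-vs-rest trains $ovr models, one-vs-one trains $ovo models, each on about $frac of the data")
    }
  }
}
```

The quadratic growth in the number of models is the part that matters here: every one of those models has to be trained as its own job and consulted again at prediction time.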