Alexander Ulanov created SPARK-2329:
---------------------------------------

             Summary: Add multi-label evaluation metrics
                 Key: SPARK-2329
                 URL: https://issues.apache.org/jira/browse/SPARK-2329
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
    Affects Versions: 1.0.0
            Reporter: Alexander Ulanov
             Fix For: 1.1.0

Spark MLlib has no class for measuring the performance of multi-label classifiers. In multi-label classification, each document can be assigned several labels (classes) at once. This task involves adding a class for multi-label evaluation, together with unit tests.

The following measures are to be implemented: precision, recall and F1-measure, computed (1) per document and averaged over the number of documents; (2) per label; (3) over labels, micro- and macro-averaged; and (4) Hamming loss.

Reference: Tsoumakas, Grigorios, Ioannis Katakis, and Ioannis Vlahavas. "Mining multi-label data." Data Mining and Knowledge Discovery Handbook. Springer US, 2010. 667-685.
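
For illustration only, the requested measures could be sketched in plain Python as below. This is not the proposed MLlib class. The names multilabel_metrics, prf and ratio are hypothetical, and several conventions are assumptions rather than anything stated in the issue: a ratio with a zero denominator counts as 0.0, macro F1 is the mean of the per-label F1 scores, and the label set used for Hamming loss is taken to be whatever labels appear in the data.

```python
from collections import Counter


def ratio(num, den):
    """Safe division: a zero denominator yields 0.0 (an assumption)."""
    return num / den if den else 0.0


def prf(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    return (ratio(tp, tp + fp), ratio(tp, tp + fn), ratio(2 * tp, 2 * tp + fp + fn))


def multilabel_metrics(pairs):
    """pairs: list of (predicted_labels, true_labels) sets, one pair per document."""
    n = len(pairs)
    # Assumption: the label universe is the set of labels seen in the data.
    labels = sorted(set().union(*(pred | true for pred, true in pairs)))
    tp, fp, fn = Counter(), Counter(), Counter()
    doc_sums = [0.0, 0.0, 0.0]
    mismatches = 0
    for pred, true in pairs:
        hit, extra, missed = pred & true, pred - true, true - pred
        # (1) document-based scores: accumulate here, divide by n at the end
        for i, score in enumerate(prf(len(hit), len(extra), len(missed))):
            doc_sums[i] += score
        # (4) Hamming loss counts labels that differ between prediction and truth
        mismatches += len(pred ^ true)
        # per-label confusion counts, used for (2) and (3)
        tp.update(hit)
        fp.update(extra)
        fn.update(missed)
    per_label = {label: prf(tp[label], fp[label], fn[label]) for label in labels}
    return {
        "document": tuple(ratio(s, n) for s in doc_sums),
        "per_label": per_label,
        # micro: pool the counts over all labels, then compute the scores once
        "micro": prf(sum(tp.values()), sum(fp.values()), sum(fn.values())),
        # macro: compute the scores per label, then take the unweighted mean
        "macro": tuple(
            ratio(sum(scores[i] for scores in per_label.values()), len(labels))
            for i in range(3)
        ),
        "hamming_loss": ratio(mismatches, n * len(labels)),
    }


# Four documents over labels {0, 1, 2}: (predicted, true)
sample = [({0, 1}, {0, 2}), ({0}, {0, 1}), ({2}, {2}), ({1, 2}, {1})]
metrics = multilabel_metrics(sample)
print(metrics["document"], metrics["micro"], metrics["hamming_loss"])
```

Each of "document", "micro" and "macro" is a (precision, recall, F1) tuple, and "per_label" maps every label to such a tuple. Micro averaging weights frequent labels more heavily, while macro averaging treats all labels equally, which is why the issue asks for both.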