A lot of this is discussed in http://scikit-learn.org/dev/modules/model_evaluation.html
There are two cases where the micro average is not necessarily identical across P/R/F.

If you pass in only a limited set of labels, predictions involving the excluded labels no longer count as a false positive for one class and a false negative for another, so micro-averaged precision and recall can differ. This allows for a "negative label", often an experimentally uninteresting majority class. Try:

    classification_report(y_true, y_pred, target_names=target_names, labels=[1, 2])

If you pass in multilabel data, each sample can have any number of labels, so the false positive and false negative counts need not match either. Try:

    classification_report(np.array([[1, 0], [0, 1]]), np.array([[1, 1], [0, 1]]))

Perhaps for multiclass with labels=None, we could report this differently.
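To make the two cases above concrete, here is a minimal sketch using precision_recall_fscore_support with average="micro". The multiclass y_true / y_pred values are made up for illustration (they are not from the original thread); the multilabel arrays are the ones quoted above.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Toy multiclass data, assumed here purely for illustration.
y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 1, 0]

# All labels included: every false positive for one class is a false
# negative for another, so micro P, R and F all equal accuracy.
p_all, r_all, f_all, _ = precision_recall_fscore_support(
    y_true, y_pred, average="micro"
)

# Label 0 treated as the "negative label" and left out: errors that
# involve label 0 are counted on one side only, so P and R diverge.
p_sub, r_sub, f_sub, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=[1, 2], average="micro"
)

# Multilabel indicator data: the first sample gets one extra predicted label.
Y_true = np.array([[1, 0], [0, 1]])
Y_pred = np.array([[1, 1], [0, 1]])
p_ml, r_ml, f_ml, _ = precision_recall_fscore_support(
    Y_true, Y_pred, average="micro"
)

print("all labels:    P=%.3f R=%.3f F=%.3f" % (p_all, r_all, f_all))
print("labels=[1, 2]: P=%.3f R=%.3f F=%.3f" % (p_sub, r_sub, f_sub))
print("multilabel:    P=%.3f R=%.3f F=%.3f" % (p_ml, r_ml, f_ml))
```

With all labels the three numbers coincide; in the other two calls they do not, which is why the micro average row is still informative there.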