Thanks Tom. That's clear now. Sent from my phone - sorry for the brevity and any misspellings.
The set of independent regressions described on Wikipedia is *not* an OvR model. It is just a (somewhat unusual) way to understand the multinomial logistic regression model. OvR logistic regression and multinomial logistic regression are two different models.

In multinomial logistic regression viewed as a set of independent binary regressions, as described on Wikipedia, you have K - 1 binary regressions between class k (k from 1 to K - 1) and the reference class K. In OvR logistic regression, by contrast, you have K binary regressions between class k (k from 1 to K) and class "not class k". The normalization is therefore different.

In the multinomial case, you have from the beginning the property 1 = sum_k p(y = k). The normalization factor 1 / (1 + sum_{k=1}^{K - 1} exp(w_k . x)) comes from computing p(y = K) last using this property. In OvR logistic regression, each binary classifier only guarantees 1 = p_k(y = k) + p_k(y != k). The probabilities p_k(y = k) therefore do not sum to one, and you need to divide them by sum_{k=1}^{K} p_k(y = k) to obtain a valid probability for the OvR model. This is done in the same way in OneVsRestClassifier (https://github.com/scikit-learn/scikit-learn/blob/1a850eb5b601f3bf0f88a43090f83c51b3d8c593/sklearn/multiclass.py#L350-L351).

But I agree that this description of the multinomial model is quite confusing compared to the log-linear/softmax description.

Tom

On Thu, 7 Feb 2019 at 08:31, Guillaume Lemaître <g.lemaitr...@gmail.com> wrote:
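To make the two normalizations concrete, here is a small NumPy sketch. The arrays z and w are made-up scores standing in for the linear predictors w_k . x of each binary regression; they are illustrative only, not fitted values.

```python
import numpy as np

# --- OvR: K = 3 independent binary classifiers, each giving p_k(y = k) ---
z = np.array([2.0, -1.0, 0.5])          # stand-in scores, one per class
p_ovr = 1.0 / (1.0 + np.exp(-z))        # sigmoid of each binary classifier
# Each classifier only guarantees p_k(y = k) + p_k(y != k) = 1,
# so the p_k(y = k) do not sum to one across classes:
print(p_ovr.sum())                      # > 1 here

# Normalize by the sum, as OneVsRestClassifier.predict_proba does:
p_ovr_norm = p_ovr / p_ovr.sum()
print(p_ovr_norm.sum())                 # exactly 1 after rescaling

# --- Multinomial as K - 1 binary regressions against reference class K ---
w = np.array([1.0, -0.5])               # stand-in for w_k . x, k = 1 .. K-1
# p(y = K) = 1 / (1 + sum_k exp(w_k . x)) follows from sum_k p(y = k) = 1
p_K = 1.0 / (1.0 + np.exp(w).sum())
p_multi = np.append(np.exp(w) * p_K, p_K)
print(p_multi.sum())                    # 1 by construction, no rescaling needed
```

The point of the sketch is that the multinomial probabilities sum to one from the start, while the OvR probabilities only do so after dividing by their sum.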
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn