Say I define the following scoring function:

import numpy as np
from sklearn.metrics import roc_auc_score

def multi_label_macro_auc(y_gt, y_pred):
    # y_pred: (n_samples, n_labels) array of per-class probabilities
    n_labels = y_pred.shape[1]
    auc_scores = [None] * n_labels
    for label in range(n_labels):
        # One-vs-rest AUC per label, then macro-averaged
        auc_scores[label] = roc_auc_score((y_gt == label) * 1, y_pred[:, label])
    return np.mean(auc_scores)

from sklearn.metrics import make_scorer

ml_macro_auc_s = make_scorer(multi_label_macro_auc,
                             greater_is_better=True,
                             needs_threshold=False, needs_proba=True)
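As a sanity check, the metric itself can be exercised directly on a probability matrix, outside the scorer machinery. A minimal sketch with synthetic data (all names and the made-up probabilities here are illustrative only):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def multi_label_macro_auc(y_gt, y_pred):
    # Macro-average of one-vs-rest AUCs, one per column of y_pred
    n_labels = y_pred.shape[1]
    scores = [roc_auc_score((y_gt == label) * 1, y_pred[:, label])
              for label in range(n_labels)]
    return np.mean(scores)

# Synthetic data: 3 classes, each guaranteed to appear in y_gt
y_gt = np.array([0, 1, 2] * 10)
rng = np.random.RandomState(0)
y_pred = rng.rand(30, 3)
y_pred /= y_pred.sum(axis=1, keepdims=True)  # rows sum to 1, like predict_proba

score = multi_label_macro_auc(y_gt, y_pred)
print(score)  # a macro-averaged AUC in [0, 1]
```

If this runs cleanly, the error below is coming from how the scorer feeds the classifier's output into the metric, not from the metric itself.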

I then try to test a DummyClassifier:

dummy_clf = DummyClassifier(strategy='stratified', random_state=0)
dummy_clf.fit(X, y)
ml_macro_auc_s(dummy_clf, X, y)

Scikit-learn then complains with:
"ValueError: AUC is defined for binary classification only"

I tried passing probability=True to DummyClassifier, but it does not seem
to accept that parameter (see
<http://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyClassifier.html>):

dummy_clf = DummyClassifier(strategy='stratified', random_state=0,
                            probability=True)
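For what it's worth, DummyClassifier exposes predict_proba without any constructor flag, so soft outputs are available even though there is no probability parameter. A minimal sketch (X and y here are made up):

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up data: 20 samples, 3 features, 3 classes
X = np.random.RandomState(0).rand(20, 3)
y = np.array([0, 1, 2, 0] * 5)

clf = DummyClassifier(strategy='stratified', random_state=0)
clf.fit(X, y)

proba = clf.predict_proba(X)  # available without any 'probability' flag
print(proba.shape)  # (20, 3): one column per class
```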

How can I apply my own scorers that require soft outputs to a
DummyClassifier?

PS: In the spirit of keeping an easily searchable archive of the question,
and to get syntax highlighting, I posted a copy here:
<http://stackoverflow.com/questions/18236099/unable-to-test-a-dummy-classifier-with-a-score-function-that-requires-a-probabil>

Josh
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
