I am getting extremely poor SVM performance on a simple binary learning
problem. I am doing an exhaustive grid search, but most of the AUC scores I
obtain are below 0.5 (no better than a random classifier).
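
As a reference point for reading these scores, AUC can be computed as the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The sketch below is only an illustration in plain Python; the helper name pairwise_auc and the toy labels and scores are made up for the example and are not taken from the data in the gists. It shows that uninformative scores give 0.5, while a value well below 0.5 corresponds to a ranking that is inverted rather than random.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Hypothetical helper for illustration; ties count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption: invented for this example).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc_good = pairwise_auc(labels, scores)                    # mostly correct ranking
auc_flipped = pairwise_auc(labels, [-s for s in scores])   # same scores, sign flipped
auc_constant = pairwise_auc(labels, [0.5] * len(labels))   # no information at all

print(auc_good, auc_flipped, auc_constant)
```

Flipping the sign of the scores maps an AUC of a to 1 - a, which is why scores consistently below 0.5 are usually worth investigating as a labeling or sign issue rather than as pure noise.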
Here is my feature matrix X:
https://gist.github.com/ribonoous/5952080
and here is my label vector y:
https://gist.github.com/ribonoous/5952067
and here is my code (also available in this gist:
https://gist.github.com/ribonoous/5952103):

    import numpy as np
    from pprint import pprint

    from sklearn.cross_validation import StratifiedKFold
    from sklearn.grid_search import GridSearchCV
    from sklearn.metrics import auc_score
    from sklearn.svm import SVC

    # X and y are loaded from the gists linked above.

    # Set the parameters by cross-validation.
    # Candidate values for C and gamma: e^-20, e^-18, ..., e^18
    my_exps = np.arange(-20, 20, 2)
    my_values = np.exp(my_exps)
    tuned_parameters = [{'kernel': ['rbf'], 'gamma': my_values,
                         'C': my_values},
                        {'kernel': ['linear'], 'C': my_values}]

    scores = [
        ('auc_score', auc_score),
    ]

    # 5-fold stratified cross-validation
    skf = StratifiedKFold(y, 5)

    for score_name, score_func in scores:
        clf = GridSearchCV(SVC(C=1), tuned_parameters,
                           score_func=score_func, verbose=2, n_jobs=1, cv=skf)
        clf.fit(X, y)

        print "Grid scores:"
        pprint(clf.grid_scores_)
        print "Best score:"
        pprint(clf.best_score_)
        print "Classification report for the best estimator: "
        print clf.best_estimator_

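
For anyone trying to reproduce this on a recent scikit-learn release, the search above might look roughly like the following sketch. It is not the code from the gist: it is written for Python 3, it assumes the newer sklearn.model_selection module and the scoring="roc_auc" option in place of score_func, it uses synthetic data from make_classification as a stand-in for X and y, and it uses a much smaller grid so that it runs quickly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in for X and y (assumption: the real data is in the gists).
X_demo, y_demo = make_classification(n_samples=120, n_features=10,
                                     random_state=0)

# Reduced grid (e^-6, e^-2, e^2, e^6) to keep the sketch fast.
values = np.exp(np.arange(-6, 7, 4))
param_grid = [
    {"kernel": ["rbf"], "gamma": values, "C": values},
    {"kernel": ["linear"], "C": values},
]

# scoring="roc_auc" ranks test samples by the SVC decision function.
search = GridSearchCV(SVC(), param_grid, scoring="roc_auc",
                      cv=StratifiedKFold(n_splits=5))
search.fit(X_demo, y_demo)

print("Best AUC: %.3f" % search.best_score_)
print("Best parameters: %r" % (search.best_params_,))
```

With scoring="roc_auc" the AUC is computed from continuous decision values rather than from hard 0/1 predictions, which is one difference worth keeping in mind when comparing against scores produced by a function passed as score_func.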
Am I using scikit-learn incorrectly? Is this expected?
Thank you,
Josh
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general