I did something similar: I ran GridSearchCV over different kernel functions for an SVM, and not all kernel functions use the same parameters. For example, the *degree* parameter is only used by the *poly* kernel.

from sklearn import svm
from sklearn import cross_validation
from sklearn import grid_search

# p and cvrand are not defined in this snippet.
# From scikit-learn 0.18 on, GridSearchCV lives in sklearn.model_selection.
params = [{'kernel':['poly'],'degree':[1,2,3],'gamma':[1/p,1,2],'coef0':[-1,0,1]},
          {'kernel':['rbf'],'gamma':[1/p,1,2],'degree':[3],'coef0':[0]},
          {'kernel':['sigmoid'],'gamma':[1/p,1,2],'coef0':[-1,0,1],'degree':[3]}]
GSC = grid_search.GridSearchCV(estimator = svm.SVC(), param_grid = params,
                               cv = cvrand, n_jobs = -1)

This worked in this instance because the svm.SVC() object only passes parameters to the kernel functions as needed:

[image: Inline image 1]

Hence, even though my list of dicts includes all three parameters for all types of kernels I used, they were selectively ignored. I'm not sure about parameters for the distance metrics for the KNN object, but it's a good bet it works the same way.

Andrew

<~~~~~~~~~~~~~~~~~~~~~~~~~~~>
J. Andrew Howe, PhD
Editor-in-Chief, European Journal of Mathematical Sciences
Executive Editor, European Journal of Pure and Applied Mathematics
www.andrewhowe.com
http://www.linkedin.com/in/ahowe42
https://www.researchgate.net/profile/John_Howe12/
I live to learn, so I can learn to live. - me
<~~~~~~~~~~~~~~~~~~~~~~~~~~~>

On Mon, Jun 27, 2016 at 1:27 PM, Hugo Ferreira <h...@inesctec.pt> wrote:

> Hello,
>
> I have posted this question on Stack Overflow and did not get an answer.
> This seems to be a basic usage question and I am therefore sending it here.
>
> I have the following code snippet that attempts to do a grid search in which
> one of the grid parameters is the distance metric to be used for the KNN
> algorithm. The example below fails if I use the "wminkowski", "seuclidean" or
> "mahalanobis" distance metrics.
>
> # Define the parameter values that should be searched
> k_range = range(1,31)
> weights = ['uniform' , 'distance']
> algos = ['auto', 'ball_tree', 'kd_tree', 'brute']
> leaf_sizes = range(10, 60, 10)
> metrics = ["euclidean", "manhattan", "chebyshev", "minkowski",
>            "mahalanobis"]
>
> param_grid = dict(n_neighbors = list(k_range), weights = weights,
>                   algorithm = algos, leaf_size = list(leaf_sizes),
>                   metric=metrics)
> param_grid
>
> # Instantiate the algorithm
> knn = KNeighborsClassifier(n_neighbors=10)
>
> # Instantiate the grid
> grid = GridSearchCV(knn, param_grid=param_grid, cv=10, scoring='accuracy',
>                     n_jobs=-1)
>
> # Fit the models using the grid parameters
> grid.fit(X,y)
>
> I assume this is because I have to set or define the ranges for the
> various distance parameters (for example p, w for “wminkowski” -
> WMinkowskiDistance). The "minkowski" distance may be working because its
> "p" parameter has the default 2.
>
> So my questions are:
>
> 1. Can we set the range of parameters for the distance metrics for the
>    grid search and if so how?
> 2. Can we set the value of a parameter for the distance metrics for the
>    grid search and if so how?
>
> Hope the question is clear.
> TIA
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
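
As a rough illustration of the list-of-dicts approach described above, here is a minimal pure-Python sketch of how such a grid expands into candidate settings: each dict is expanded on its own as a Cartesian product of its value lists, and the results are concatenated. The helper expand_grid is hypothetical and written only for illustration (scikit-learn provides the real equivalent as ParameterGrid), n_features = 4 is an assumed stand-in for the undefined p in the SVM snippet, and the KNN sub-grids are illustrative values rather than a recommendation.

```python
from itertools import product


def expand_grid(param_grid):
    """Expand a list of parameter dicts into individual candidate settings.

    Hypothetical helper for illustration only: every dict is expanded
    separately (Cartesian product of its value lists) and the resulting
    candidates are concatenated, so a parameter listed in one dict never
    multiplies the candidates of another dict.
    """
    candidates = []
    for sub_grid in param_grid:
        names = sorted(sub_grid)
        for values in product(*(sub_grid[name] for name in names)):
            candidates.append(dict(zip(names, values)))
    return candidates


# Assumption: stands in for the undefined p (1/p) of the SVM snippet.
n_features = 4

# Same structure as the SVM grid: parameters a kernel ignores are pinned to
# a single value, so they add no extra candidates.
svm_grid = [
    {'kernel': ['poly'], 'degree': [1, 2, 3],
     'gamma': [1.0 / n_features, 1, 2], 'coef0': [-1, 0, 1]},
    {'kernel': ['rbf'], 'gamma': [1.0 / n_features, 1, 2],
     'degree': [3], 'coef0': [0]},
    {'kernel': ['sigmoid'], 'gamma': [1.0 / n_features, 1, 2],
     'coef0': [-1, 0, 1], 'degree': [3]},
]

# The same idea applied to KNN metrics: only the "minkowski" sub-grid
# varies p; the other metrics get a sub-grid without it.
knn_grid = [
    {'metric': ['euclidean', 'manhattan', 'chebyshev'],
     'n_neighbors': [5, 10]},
    {'metric': ['minkowski'], 'p': [1, 2, 3], 'n_neighbors': [5, 10]},
]

svm_candidates = expand_grid(svm_grid)
knn_candidates = expand_grid(knn_grid)
print(len(svm_candidates), len(knn_candidates))
```

With these values the sketch prints 39 12: the SVM grid yields 27 + 3 + 9 candidates rather than a full cross of every parameter with every kernel. A list like knn_grid can be passed as param_grid to GridSearchCV in the same way. Metrics such as "seuclidean" or "mahalanobis" need extra arguments (a variance vector or a covariance matrix), which KNeighborsClassifier accepts through its metric_params argument; that part is not shown here.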