Hi Hugo,
Andrew's approach -- using a list of dicts to specify multiple parameter
grids -- is the correct one.
However, Andrew, you don't need to include parameters that will be
ignored in your parameter grid. The following is effectively the
same:
params = [
    {'kernel': ['poly'], 'degree': [1, 2, 3], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
    {'kernel': ['rbf'], 'gamma': [1/p, 1, 2]},
    {'kernel': ['sigmoid'], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
]
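To see why the two grids are effectively the same, one can count the candidate settings each expands to. The sketch below uses ParameterGrid from sklearn.model_selection (scikit-learn 0.18 or later). p = 4 is an assumed feature count, since the thread never defines p, and the names trimmed and padded are only illustrative.

```python
from sklearn.model_selection import ParameterGrid

p = 4  # assumption: number of features; p is not defined in the thread

# Grid without the ignored parameters
trimmed = [
    {'kernel': ['poly'], 'degree': [1, 2, 3], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
    {'kernel': ['rbf'], 'gamma': [1/p, 1, 2]},
    {'kernel': ['sigmoid'], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
]

# Grid that also pins the ignored parameters to a single value
padded = [
    {'kernel': ['poly'], 'degree': [1, 2, 3], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
    {'kernel': ['rbf'], 'gamma': [1/p, 1, 2], 'degree': [3], 'coef0': [0]},
    {'kernel': ['sigmoid'], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1], 'degree': [3]},
]

# Each dict is expanded separately and the results are concatenated
n_trimmed = len(ParameterGrid(trimmed))
n_padded = len(ParameterGrid(padded))
print(n_trimmed, n_padded)
```

Both grids expand to 39 candidates (27 for poly, 3 for rbf, 9 for sigmoid), because a single-valued entry multiplies the size of its sub-grid by one.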
Joel
On 27 June 2016 at 20:59, Andrew Howe wrote:
I did something similar where I was using GridSearchCV over
different kernel functions for SVM and not all kernel functions use
the same parameters. For example, the *degree* parameter is only
used by the *poly* kernel.
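# Note: sklearn.cross_validation and sklearn.grid_search were merged into
# sklearn.model_selection in scikit-learn 0.18.
from sklearn import svm
from sklearn import cross_validation
from sklearn import grid_search

# p and cvrand are defined elsewhere (not shown in this message)
params = [
    {'kernel': ['poly'], 'degree': [1, 2, 3], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
    {'kernel': ['rbf'], 'gamma': [1/p, 1, 2], 'degree': [3], 'coef0': [0]},
    {'kernel': ['sigmoid'], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1], 'degree': [3]},
]
GSC = grid_search.GridSearchCV(estimator=svm.SVC(), param_grid=params,
                               cv=cvrand, n_jobs=-1)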
This worked in this instance because the svm.SVC() object only
passes parameters to the kernel functions as needed:
[Inline image 1]
Hence, even though my list of dicts includes all three parameters
for all types of kernels I used, they were selectively ignored. I'm
not sure about parameters for the distance metrics for the KNN
object, but it's a good bet it works the same way.
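For the KNN case, a similar list of dicts might look like the sketch below, which gives each metric only the parameters it needs. It is an illustration under assumptions rather than a definitive recipe: the iris data, the small n_neighbors range, the brute-force algorithm for Mahalanobis, and estimating the inverse covariance VI once from the full data set are all choices made here for brevity. p and metric_params are existing KNeighborsClassifier arguments.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Assumption: a small bundled data set stands in for the poster's X, y
X, y = load_iris(return_X_y=True)

# Assumption: inverse feature covariance, estimated once from all of X
VI = np.linalg.inv(np.cov(X, rowvar=False))

k_values = [3, 5, 7]
param_grid = [
    # Metrics that need no extra parameters
    {'metric': ['euclidean', 'manhattan', 'chebyshev'], 'n_neighbors': k_values},
    # Minkowski: search over its exponent p
    {'metric': ['minkowski'], 'p': [1, 2, 3], 'n_neighbors': k_values},
    # Mahalanobis: pass VI through metric_params
    {'metric': ['mahalanobis'], 'metric_params': [{'VI': VI}],
     'algorithm': ['brute'], 'n_neighbors': k_values},
]

grid = GridSearchCV(KNeighborsClassifier(), param_grid=param_grid,
                    cv=5, scoring='accuracy')
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

Because VI is computed from all of X, it leaks a little information across the cross-validation folds; that is acceptable for a sketch but worth keeping in mind for a real evaluation.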
Andrew
On Mon, Jun 27, 2016 at 1:27 PM, Hugo Ferreira wrote:
Hello,
I posted this question on Stack Overflow and did not get an
answer. It seems to be a basic usage question, so I am
sending it here.
I have the following code snippet that attempts to do a grid search
in which one of the grid parameters is the distance metric to
be used for the KNN algorithm. The example below fails if I use
the "wminkowski", "seuclidean" or "mahalanobis" distance metrics.
from sklearn.model_selection import GridSearchCV  # sklearn.grid_search before 0.18
from sklearn.neighbors import KNeighborsClassifier

# Define the parameter values that should be searched
k_range = range(1, 31)
weights = ['uniform', 'distance']
algos = ['auto', 'ball_tree', 'kd_tree', 'brute']
leaf_sizes = range(10, 60, 10)
metrics = ["euclidean", "manhattan", "chebyshev", "minkowski",
           "mahalanobis"]
param_grid = dict(n_neighbors=list(k_range), weights=weights,
                  algorithm=algos, leaf_size=list(leaf_sizes),
                  metric=metrics)
param_grid
# Instantiate the algorithm
knn = KNeighborsClassifier(n_neighbors=10)
# Instantiate the grid
grid = GridSearchCV(knn, param_grid=param_grid, cv=10,
                    scoring='accuracy', n_jobs=-1)
# Fit the models using the grid parameters (X, y are the training data, not shown)
grid.fit(X, y)
I assume this is because I have to set or define the ranges for
the various distance parameters (for example, p and w for
"wminkowski", i.e. WMinkowskiDistance). The "minkowski" distance
may be working because its "p" parameter defaults to 2.
So my questions are:
1. Can we set the range of parameters for the distance metrics
for the grid search and if so how?
2. Can we set the value of a parameter for the distance metrics
for the grid search and if so how?
Hope the question is clear.
TIA
_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn