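I don't think
v_measure_scorer = make_scorer(v_measure_score, labels_pred=kmeans.predict)
does what you think it does. Extra keyword arguments to make_scorer are
passed straight through to v_measure_score; they are not called to produce
predictions. The scorer already calls predict on the fitted pipeline for
each test split and hands the result to v_measure_score as labels_pred.
You should stick to
v_measure_scorer = make_scorer(v_measure_score)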
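For what it's worth, here is a minimal sketch of the full setup with the plain scorer. It is not your script: make_blobs stands in for D_scaled and D_labels (assumed to be a 1-D label array), the three-value gamma grid, n_components=2, n_init and random_state are only there to keep it quick and repeatable, and the import path assumes a release that has sklearn.model_selection (older releases keep GridSearchCV in sklearn.grid_search).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import KernelPCA
from sklearn.metrics import make_scorer, v_measure_score
from sklearn.model_selection import GridSearchCV  # sklearn.grid_search in older releases
from sklearn.pipeline import Pipeline

# Synthetic stand-ins for D_scaled / D_labels (assumption: labels are a 1-D array)
D_scaled, D_labels = make_blobs(n_samples=150, centers=3, random_state=0)

# kPCA -> kmeans pipeline, as in the original snippet
kpca = KernelPCA(kernel="rbf", n_components=2)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
pipe = Pipeline(steps=[("kpca", kpca), ("kmeans", kmeans)])

# Small illustrative grid; the original used np.logspace(-10, 2, num=100)
gammas = [0.01, 0.1, 1.0]

# No extra keyword arguments: the scorer calls predict on the fitted
# pipeline for each test split and compares against the true labels.
v_measure_scorer = make_scorer(v_measure_score)

estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas), scoring=v_measure_scorer)
estimator.fit(D_scaled, D_labels)  # true labels are passed to fit, not to the scorer

print(estimator.best_params_, estimator.best_score_)
```

The true labels only enter through fit; the pipeline ignores them while fitting, and the grid search slices them per split before handing them to the scorer.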
On 15 May 2014 22:11, Lee Zamparo <[email protected]> wrote:
> Seems the estimator.fit method needs the true labels, and that I
> shouldn't pass either the true labels or the predicted labels to
> v_measure_score (passing either triggers an AttributeError). So now
> I'm running with
>
> # Make a scoring function for the pipeline
> v_measure_scorer = make_scorer(v_measure_score, labels_pred=kmeans.predict)
>
> # Parameters of pipelines are set using ‘__’ separated parameter names:
> estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas),
> scoring=v_measure_scorer)
> estimator.fit(D_scaled,D_labels)
>
> It's been running overnight, hopefully I get a result this morning.
> Thanks for all your help,
>
> L.
>
> On Wed, May 14, 2014 at 11:12 AM, Lee Zamparo <[email protected]> wrote:
> > Combining the helpful suggestions of Andy & Joel I'm trying the following:
> >
> > # Make a scoring function for the pipeline
> > v_measure_scorer = make_scorer(v_measure_score, labels_true=labels[:,0], labels_pred=kmeans.predict)
> >
> > # Parameters of pipelines are set using ‘__’ separated parameter names:
> > estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas),
> > scoring=v_measure_scorer)
> > estimator.fit(D_scaled)
> >
> > Was this what you were referring to Andy?
> >
> > Thanks,
> >
> > Lee.
> >
> > On Wed, May 14, 2014 at 1:27 AM, Andreas Mueller <[email protected]> wrote:
> >> I think you should use the make_scorer function. Using labels_ will not
> >> work, as it will only have labels for the training split, while the
> >> performance is measured on the test split.
> >>
> >> On May 14, 2014 2:28 AM, "Joel Nothman" <[email protected]> wrote:
> >>>
> >>> Hi Lee,
> >>>
> >>> The scoring parameter, if not an existing scoring name, needs to be a
> >>> function with the signature:
> >>>
> >>> fn(estimator, X, y_true) -> score which increases with goodness
> >>>
> >>> So I think you want to define:
> >>>
> >>> def score_clusters(estimator, X, y):
> >>>     return v_measure_score(y[:,0], kmeans.labels_)
> >>>
> >>> Then construct the GridSearchCV as:
> >>>
> >>> estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas),
> >>> scoring=score_clusters)
> >>>
> >>> It seems like there should be more predefined scorers available for
> >>> clustering...
> >>>
> >>> Cheers,
> >>>
> >>> - Joel
> >>>
> >>>
> >>> On 14 May 2014 09:10, Lee Zamparo <[email protected]> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I'm trying to use GridSearchCV and Pipeline to tune the gamma
> >>>> parameter of kernel PCA. I'd like to use kernel PCA to transform the
> >>>> data, followed by kmeans to cluster the data, followed by v-measure to
> >>>> measure the goodness of fit of the clustering.
> >>>>
> >>>> Here's the relevant snippet of my script
> >>>> -----
> >>>> # Set up the kPCA -> kmeans -> v-measure pipeline
> >>>> kpca = KernelPCA(kernel="rbf")
> >>>> kmeans = KMeans(n_clusters=3)
> >>>> pipe = Pipeline(steps=[('kpca', kpca), ('kmeans', kmeans)])
> >>>>
> >>>> # Range of parameters to consider for gamma in the RBF kernel for kPCA
> >>>> gammas = np.logspace(-10,2,num=100)
> >>>>
> >>>> # Parameters of pipelines are set using ‘__’ separated parameter names:
> >>>> estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas),
> >>>> scoring=v_measure_score(labels[:,0],kmeans.labels_))
> >>>> estimator.fit(D_scaled)
> >>>>
> >>>> -----
> >>>>
> >>>> Yet I get an AttributeError claiming that the kmeans object has no
> >>>> labels_ attribute.
> >>>>
> >>>> File "/home/lee/projects/SdA_reduce/utils/kernel_pca_pipeline.py",
> >>>> line 86, in <module>
> >>>> estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas),
> >>>> scoring=v_measure_score(labels[:,0],kmeans.labels_))
> >>>>
> >>>> AttributeError: 'KMeans' object has no attribute 'labels_'
> >>>>
> >>>> Does anyone have any tips on how I should restructure my snippet to
> >>>> get my desired outcome?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Lee.
> >>>>
> >>>>
> >>>>
> ------------------------------------------------------------------------------
> >>>> "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE
> >>>> Instantly run your Selenium tests across 300+ browser/OS combos.
> >>>> Get unparalleled scalability from the best Selenium testing platform
> >>>> available
> >>>> Simple to use. Nothing to install. Get started now for free."
> >>>> http://p.sf.net/sfu/SauceLabs
> >>>> _______________________________________________
> >>>> Scikit-learn-general mailing list
> >>>> [email protected]
> >>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general