On Thu, Nov 10, 2011 at 8:40 PM, Adrien <[email protected]> wrote:
> I agree with Andy and my use cases seem to be similar. To sum up my
> point of view:
> - not everyone works only on large scale ;-)
But I think it's always important to think about scalability, not only of
learning but also of prediction, as kernel classifiers are slow at making
predictions. Too many algorithms are not usable in a real-world setting.

> - linear methods are trendy but I still believe in kernels
> - precomputing the full kernel matrices allows for optimization tricks
> that matter a lot in practice (e.g. parallelization)

Embarrassingly parallel computation of the matrix is a very good point; I
had forgotten about it. Anyway, I am not advocating against precomputed
matrices, as they can be useful, but one needs to bear in mind that they
have O(n_samples^2) complexity.

> Finally, with a small number of samples, I found it faster to compute
> all possible pairwise kernel evaluations offline (and in parallel,
> because it is embarrassingly parallel of course), even though not all
> might be used, e.g. with SVMs. You don't know the support vectors in
> advance, and changing C changes the SVs. Therefore, you probably need to
> recompute kernel evaluations unless you cache them. Furthermore, the
> best C values on my problems are high ones and, therefore, I have almost
> all points as SVs. That was also true in my experience on the Pascal VOC
> challenge with RBF chi-square kernels on Bag-of-Features.

If the best solutions for you are dense, you may also want to try kernel
ridge regression. With the very efficient linear system solvers in LAPACK,
it may even be computationally interesting.

Mathieu

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
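
[Editor's note: the precomputed-kernel workflow discussed above can be sketched roughly as below. This is a minimal illustration, not code from the thread; the RBF kernel, gamma value, and toy data are arbitrary choices, and scikit-learn's `SVC(kernel='precomputed')` is assumed.]

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = rng.randn(40, 5)
y = (X[:, 0] > 0).astype(int)

def rbf_kernel_matrix(A, B, gamma=0.5):
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

# Precompute the full n_samples x n_samples Gram matrix once.
# Each row is independent of the others, so this step is embarrassingly
# parallel (e.g. one chunk of rows per worker); memory is O(n_samples^2).
K = rbf_kernel_matrix(X, X)

# With kernel='precomputed', fit() takes the Gram matrix instead of X.
clf = SVC(kernel='precomputed', C=10.0).fit(K, y)

# At predict time, only kernel values between test points and the
# *training* points are needed, so cached offline evaluations suffice.
X_test = rng.randn(5, 5)
K_test = rbf_kernel_matrix(X_test, X)  # shape (n_test, n_train)
pred = clf.predict(K_test)
```

Note that if C changes, only `fit` has to be rerun; the cached `K` is reused, which is the point made above about recomputing versus caching kernel evaluations.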

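[Editor's note: the kernel ridge regression suggestion can likewise be sketched in a few lines of NumPy. This is an illustrative sketch under arbitrary toy data and regularization; `np.linalg.solve` dispatches to LAPACK's dense solver, which is the efficiency argument made in the reply.]

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.randn(50)

def rbf(A, B, gamma=1.0):
    # Same expansion trick as for any precomputed Gram matrix.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

K = rbf(X, X)
lam = 1e-2  # ridge regularization strength (hypothetical value)

# Dual coefficients: solve (K + lam * I) alpha = y.
# np.linalg.solve calls LAPACK's dense gesv; since K + lam*I is symmetric
# positive definite, a Cholesky-based solve (posv) would also work.
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)

# Prediction uses kernel values against *all* training samples: the
# solution is dense, every point contributes, which matches the high-C,
# almost-all-points-are-SVs regime described above.
X_new = rng.randn(5, 3)
y_new = rbf(X_new, X) @ alpha
```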