Yaroslav Halchenko wrote:
Another aspect you might benefit from in the case of SVM is the fact that
some samples do not influence SVM performance (i.e. the non-support-vector
ones). So you can speed up n-fold (or pure leave-one-out) cross-validation
considerably if there are only a few SVs -- the same strategy is used by
SVMlight:
1. train SVM on all samples
2. during n-fold testing, check whether the testing set includes any of the
support vectors. If not -- then you already know what the result would be
(the same) if you trained the SVM without their participation ;)
This strategy gives an especially large speed-up when the number of chunks
is large (or each sample is its own chunk, as in leave-one-out) and the
number of SVs is small.
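The shortcut rests on a known property of SVMs: removing a training sample that is not a support vector leaves the optimal solution unchanged, so the full model's prediction can be reused for that fold. A minimal sketch of the leave-one-out case, using scikit-learn's SVC as a stand-in for whatever SVM backend PyMVPA wraps (the function name and parameters here are illustrative, not from the original thread):

```python
import numpy as np
from sklearn.svm import SVC


def fast_loo_accuracy(X, y, C=1.0):
    """Leave-one-out accuracy using the support-vector shortcut:
    if the held-out sample is not a support vector of the model
    trained on all samples, retraining without it yields the
    identical solution, so the full model's prediction is reused."""
    # Step 1: train once on all samples.
    full = SVC(kernel="linear", C=C).fit(X, y)
    sv = set(full.support_)  # indices of the support vectors

    correct = 0
    for i in range(len(y)):
        if i not in sv:
            # Non-SV held out: the LOO model equals the full model,
            # so no retraining is needed.
            pred = full.predict(X[i:i + 1])[0]
        else:
            # SV held out: actually retrain without sample i.
            mask = np.arange(len(y)) != i
            loo = SVC(kernel="linear", C=C).fit(X[mask], y[mask])
            pred = loo.predict(X[i:i + 1])[0]
        correct += int(pred == y[i])
    return correct / len(y)
```

With a well-separated problem the number of SVs is small, so most of the n folds skip retraining entirely, which is exactly the regime where the speed-up is large.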
Hmm... I'm not sure I understand your suggestion. Won't the support
vectors change with each new chunk in cross-validation? Or at least the
coefficients won't be identical. Is there a paper you know of that
describes this? It seems like training on the whole set would ruin the
notion of an i.i.d. distribution of the output error on each fold.
Just think maybe about accounting for such a scenario as well?
Sorry for not being too on-point with the reply, but I will get
through your email/code some time later, whenever I get a chance ;)
No problem... VSS is looming anyway, so polished code isn't my priority
now either ;)
-S
_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
[email protected]
http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa