And I agree with Ted: the non-linearity induced by most kernel functions is overly complex and can easily overfit. Deep learning is a more reliable abstraction.

On 10/22/2014 01:42 PM, peng wrote:
Is the kernel projection referring to online/incremental incomplete Cholesky decomposition? Sorry, I haven't used SVMs in a long time and haven't kept up with the state of the art.

If that's the case, I haven't found an out-of-the-box implementation, but it should be easy to write.
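Something along these lines (a rough, untested sketch of pivoted incomplete Cholesky in the style of Fine & Scheinberg; the kernel interface and data layout are just illustrative assumptions, not anything taken from libSVM). It builds G with K ~= G * G^T, so a linear solver can then work on the rows of G:

import java.util.function.BiFunction;

public class IncompleteCholesky {

  // data[i] is the i-th sample; kernel evaluates k(x, y).
  // Returns G (n x rank) such that the kernel matrix K ~= G * G^T.
  public static double[][] factor(double[][] data,
                                  BiFunction<double[], double[], Double> kernel,
                                  int maxRank, double tol) {
    int n = data.length;
    double[][] g = new double[n][maxRank];
    double[] diag = new double[n];          // residual diagonal of K
    boolean[] used = new boolean[n];
    for (int i = 0; i < n; i++) diag[i] = kernel.apply(data[i], data[i]);

    for (int j = 0; j < maxRank; j++) {
      // greedy pivot: the row with the largest remaining residual diagonal
      int p = -1;
      for (int i = 0; i < n; i++)
        if (!used[i] && (p < 0 || diag[i] > diag[p])) p = i;
      if (p < 0 || diag[p] < tol) {
        // residual is small enough: stop early and return the rank-j factor
        double[][] out = new double[n][j];
        for (int i = 0; i < n; i++) System.arraycopy(g[i], 0, out[i], 0, j);
        return out;
      }
      used[p] = true;
      g[p][j] = Math.sqrt(diag[p]);
      for (int i = 0; i < n; i++) {
        if (used[i]) continue;
        double s = kernel.apply(data[i], data[p]);
        for (int t = 0; t < j; t++) s -= g[i][t] * g[p][t];
        g[i][j] = s / g[p][j];
        diag[i] -= g[i][j] * g[i][j];
      }
    }
    return g;
  }
}

The inner loop over rows is the part that parallelizes easily (each row's update only needs the pivot column), and it is naturally incremental: stopping at any rank j still leaves a valid low-rank approximation.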

Yours Peng

On 10/22/2014 01:32 PM, Dmitriy Lyubimov wrote:
Andrew, thanks a bunch for the pointers!



On Wed, Oct 22, 2014 at 10:14 AM, Andrew Palumbo <ap....@outlook.com> wrote:

If you do want to stick with SVM-

This is a question that I keep coming back to myself, and unfortunately I have
forgotten more (and lost more literature) than I've retained.
I believe that the most easily parallelizable sections of libSVM for small
datasets are (for C-SVC(R), RBF and polynomial kernels):

    1. The kernel projections
    2. The hyper-parameter grid search for C, \gamma (I believe this is now
       included in libSVM; I haven't looked at it in a while)
    3. For multi-class SVC: the concurrent computation of each SVM for each
       one-against-one class vote.

I'm unfamiliar with any easily parallelizable method for the QP itself.
Unfortunately, (2) and (3) involve broadcasting the entire dataset out to each node of a cluster (or working in a shared-memory environment), so they may not be practical depending on the size of your dataset. I've only ever implemented (2) for relatively small datasets, using MPI and with a pure Java socket implementation.
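For what it's worth, here is a rough shared-memory sketch of (2) using Java parallel streams. trainAndCrossValidate is only a placeholder for whatever wraps libSVM training plus k-fold cross-validation (it is not an actual libSVM API), and the grid values are just illustrative:

import java.util.Arrays;
import java.util.Comparator;

public class GridSearch {

  static double trainAndCrossValidate(double[][] x, double[] y, double c, double gamma) {
    // placeholder: train an RBF C-SVC with (c, gamma) and return CV accuracy
    throw new UnsupportedOperationException("plug in libSVM / your trainer here");
  }

  // returns {C, gamma, cvAccuracy} for the best grid point
  public static double[] bestParams(double[][] x, double[] y) {
    double[] cGrid = {0.25, 1, 4, 16, 64};
    double[] gammaGrid = {1e-3, 1e-2, 1e-1, 1};

    // every worker sees the whole dataset; only the (C, gamma) pairs are distributed
    return Arrays.stream(cGrid).boxed()
        .flatMap(c -> Arrays.stream(gammaGrid).boxed().map(g -> new double[]{c, g}))
        .parallel()
        .map(cg -> new double[]{cg[0], cg[1], trainAndCrossValidate(x, y, cg[0], cg[1])})
        .max(Comparator.comparingDouble((double[] r) -> r[2]))
        .orElseThrow(IllegalStateException::new);
  }
}

Each grid point is independent, which is exactly why this is trivial in shared memory and why a cluster version forces you to broadcast (or replicate) the full dataset to every node.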

Other approaches (further from simple LibSVM), which are more applicable
to large datasets (I’m less familiar with these):

    4. Divide and conquer the QP/SMO problem and solve (as I've said, I'm
       unfamiliar with this and don't know of any standard approach)

    5. Break the training set into subsets and solve each.

For (5) there are several approaches; two that I know of are ensemble
approaches and those that accumulate support vectors from each partition and heuristically keep/reject them until the model converges. I've also just read some research on implementing this in a MapReduce style [2].
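As a very rough, untested sketch of that second flavor of (5): train on partitions concurrently, keep only each partition's support vectors, then pool and retrain until the pooled set stops shrinking. trainAndGetSupportVectors is an assumed placeholder for a libSVM wrapper that returns the rows retained as support vectors:

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PartitionedSvm {

  static List<double[]> trainAndGetSupportVectors(List<double[]> labeledRows) {
    // placeholder: last column is the label; return the subset chosen as SVs
    throw new UnsupportedOperationException("plug in libSVM here");
  }

  public static List<double[]> fit(List<double[]> labeledRows, int partitions, int maxRounds) {
    List<double[]> working = labeledRows;
    for (int round = 0; round < maxRounds; round++) {
      int chunk = (working.size() + partitions - 1) / partitions;
      List<double[]> current = working;
      // train each partition concurrently, pool the resulting support vectors
      List<double[]> pooled = IntStream.range(0, partitions).parallel()
          .mapToObj(p -> current.subList(Math.min(p * chunk, current.size()),
                                         Math.min((p + 1) * chunk, current.size())))
          .filter(part -> !part.isEmpty())
          .map(PartitionedSvm::trainAndGetSupportVectors)
          .flatMap(List::stream)
          .collect(Collectors.toList());
      if (pooled.size() >= working.size()) break;   // crude convergence heuristic
      working = pooled;
    }
    return trainAndGetSupportVectors(working);      // final pass over the surviving SVs
  }
}

Whether this ends up close to the model trained on the full set depends heavily on the data and on the keep/reject heuristic, which is where the ensemble and heuristic variants differ.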

I came across this paper [1] last night which you may find interesting as
well; it is an interesting comparison of some SVM parallelization strategies.
In particular it discusses (1) for a shared-memory environment and for
offloading work to GPUs (using OpenMP and CUDA). It also cites several other
nice papers discussing SVM parallelization strategies, especially for (5),
and then goes on to discuss a more purely linear-algebra approach to
optimizing SVMs (sec. 5).

Also regarding (5) you may be interested in [2] (something I’ve only
looked over briefly).

[1] http://arxiv.org/pdf/1404.1066v1.pdf
[2] http://arxiv.org/pdf/1301.0082.pdf


From: ted.dunn...@gmail.com
Date: Tue, 21 Oct 2014 17:32:22 -0700
Subject: Re: Any idea which approaches to non-linear svm are easily parallelizable?
To: dev@mahout.apache.org

Last I heard, the best methods pre-project and do linear SVM.
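One common pre-projection (my guess at what is meant, not something specified in the thread) is a random Fourier feature map for the RBF kernel in the style of Rahimi & Recht: project every sample once, then hand the projected rows to any parallel linear SVM or SGD solver.

import java.util.Random;

public class RandomFourierFeatures {
  private final double[][] w;   // dim x numFeatures random frequencies ~ N(0, 2 * gamma)
  private final double[] b;     // numFeatures random phases ~ Uniform[0, 2*pi)
  private final int numFeatures;

  public RandomFourierFeatures(int dim, int numFeatures, double gamma, long seed) {
    this.numFeatures = numFeatures;
    Random rnd = new Random(seed);
    w = new double[dim][numFeatures];
    b = new double[numFeatures];
    double scale = Math.sqrt(2.0 * gamma);
    for (int i = 0; i < dim; i++)
      for (int j = 0; j < numFeatures; j++)
        w[i][j] = scale * rnd.nextGaussian();
    for (int j = 0; j < numFeatures; j++)
      b[j] = 2.0 * Math.PI * rnd.nextDouble();
  }

  // z(x), chosen so that z(x) . z(y) approximates exp(-gamma * ||x - y||^2)
  public double[] project(double[] x) {
    double[] z = new double[numFeatures];
    double norm = Math.sqrt(2.0 / numFeatures);
    for (int j = 0; j < numFeatures; j++) {
      double dot = b[j];
      for (int i = 0; i < x.length; i++) dot += x[i] * w[i][j];
      z[j] = norm * Math.cos(dot);
    }
    return z;
  }
}

A linear model trained on z(x) then roughly mimics an RBF SVM, and the projection itself is trivially parallel per row.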

Beyond that, I would guess that deep learning techniques would subsume
non-linear SVM pretty easily. The best parallel implementation I know of for
that is in H2O.



On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <dlie...@gmail.com>
wrote:
in particular, from libSVM --
http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?

thanks.
-d


