[ 
https://issues.apache.org/jira/browse/MAHOUT-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793487#action_12793487
 ] 

Ted Dunning commented on MAHOUT-227:
------------------------------------

{quote}
I understand this concern. Actually, if we set the parameter k to 1,000,000
or higher, do you think it is reasonable to take advantage of Map-reduce
framework? I mean, from system implementation's view.
{quote}

If you increase k to very large values, you will be able to get a bit more 
computation, but if you follow my small cluster example I think that 
increasing k from 1000 to 1,000,000 will likely increase efficiency from 0.1% 
to less than 50% and will drive the algorithm well beyond the region where kT 
is constant. You will still have quite a lot of I/O per cycle, which may 
prevent you from achieving even 10% efficiency.
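
To make the arithmetic behind those numbers concrete, here is a rough sketch. The cost model is my assumption for illustration, not something taken from the proposal: each iteration pays a fixed job overhead plus compute and I/O that both grow linearly with k. The function name, the unit costs, and the I/O figure of 9 units per example are all hypothetical; the fixed overhead is simply chosen so that k = 1000 gives 0.1% efficiency.

```python
def efficiency(k, compute_per_example=1.0, fixed_overhead=999_000.0,
               io_per_example=0.0):
    """Fraction of one iteration spent on useful computation.

    Assumed (hypothetical) cost model, in arbitrary time units:
      compute = k * compute_per_example
      io      = k * io_per_example
      total   = compute + fixed_overhead + io
    """
    compute = k * compute_per_example
    io = k * io_per_example
    return compute / (compute + fixed_overhead + io)


if __name__ == "__main__":
    # Compare the fixed-overhead-only case with a case that also charges
    # I/O per example (9.0 is an arbitrary illustrative value).
    for k in (1_000, 10_000, 100_000, 1_000_000):
        no_io = efficiency(k)
        with_io = efficiency(k, io_per_example=9.0)
        print(f"k={k:>9,}  overhead only: {no_io:7.3%}  with I/O: {with_io:7.3%}")
```

Under these assumptions, fixed overhead alone caps k = 1,000,000 at roughly 50%, and any I/O cost that scales with k pushes it below that. With the illustrative I/O cost above it falls under 10%. Measured timings on a real cluster should replace every constant here.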

For larger clusters, the problem will be much worse.

Go ahead and try it, though. Your real results count for more than my 
estimates.

And as I said before, getting a good sequential implementation is of real value 
as well.

> Parallel SVM
> ------------
>
>                 Key: MAHOUT-227
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-227
>             Project: Mahout
>          Issue Type: Task
>          Components: Classification
>            Reporter: zhao zhendong
>         Attachments: ParallelPegasos.doc, ParallelPegasos.pdf
>
>
> I wrote a proposal of parallel algorithm for SVM training. Any comment is 
> welcome.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.