Github user dlwh commented on the pull request:

    https://github.com/apache/incubator-spark/pull/575#issuecomment-35220185
  
    @martinjaggi I've often found that minibatching makes things converge much
    more quickly, since you get a nice variance reduction in the estimate of
    the gradient, and it doesn't rule out any of the other tricks you
    described. That said, I mostly deal with structured prediction, not
    classification, so I'll defer to your experience.
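
    A minimal sketch of the variance-reduction point, using a hypothetical
    least-squares model (not code from this PR): the minibatch gradient is an
    average of per-example gradients, so its variance shrinks roughly in
    proportion to the batch size.

    ```python
    import random

    # Hypothetical toy data: y = 2x + noise.
    random.seed(0)
    n = 1000
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + random.gauss(0.0, 0.5) for x in xs]

    def grad(w, i):
        # Per-example gradient of 0.5 * (w*x - y)^2 with respect to w.
        return (w * xs[i] - ys[i]) * xs[i]

    def minibatch_grad(w, batch):
        # Average of per-example gradients over the minibatch.
        return sum(grad(w, i) for i in batch) / len(batch)

    def empirical_variance(w, batch_size, trials=2000):
        # Variance of the minibatch gradient estimate across random batches.
        estimates = [
            minibatch_grad(w, random.sample(range(n), batch_size))
            for _ in range(trials)
        ]
        mean = sum(estimates) / trials
        return sum((g - mean) ** 2 for g in estimates) / trials

    w = 0.0
    var_1 = empirical_variance(w, 1)
    var_32 = empirical_variance(w, 32)
    # The batch-of-32 estimate has much lower variance than batch-of-1.
    print(var_1, var_32)
    ```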
    
    
    On Sun, Feb 16, 2014 at 3:18 PM, Martin Jaggi <[email protected]> wrote:
    
    > @dlwh <https://github.com/dlwh> Thanks! This is of course a nice idea.
    > Perhaps surprisingly (and fortunately for us), such tricks don't even
    > seem necessary in the current state-of-the-art algorithms. It's usually
    > faster to do the smaller but earlier updates after each dot product,
    > i.e. each worker/thread does one dot product and then immediately
    > updates its weight vector (as is typical in SGD, for example).
    >
    > Taking a step back, I think the PR by @mengxr
    > <https://github.com/mengxr> here is very nice and provides the right
    > kind of interface for everything that relies on vectors. (Just saying
    > that we have to keep an eye on serialization speed, but that seems
    > quite feasible with the current code structure, right?)
    >
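
    The "update after each dot product" style Martin describes can be sketched
    as plain sequential SGD, where each step computes one dot product and then
    immediately updates the weight vector (a hypothetical toy model, not this
    PR's code):

    ```python
    import random

    # Hypothetical noiseless linear data with a known true weight vector.
    random.seed(1)
    n = 500
    xs = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n)]
    true_w = [1.0, -2.0, 0.5]
    ys = [sum(wj * xj for wj, xj in zip(true_w, x)) for x in xs]

    w = [0.0, 0.0, 0.0]
    lr = 0.05
    for epoch in range(20):
        for x, y in zip(xs, ys):
            # One dot product per example...
            pred = sum(wj * xj for wj, xj in zip(w, x))
            err = pred - y
            # ...followed by an immediate update of the weight vector,
            # before looking at the next example.
            w = [wj - lr * err * xj for wj, xj in zip(w, x)]

    print(w)  # should approach true_w
    ```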

