Hi,

Thank you for your advice. Indeed, implementing Jaggi’s paper alone is not 
enough in practice, because the step of calculating the conditional gradient 
is time-consuming. For strongly convex functions over strongly convex sets, 
the convergence rate is better:
http://jmlr.org/proceedings/papers/v37/garbera15.pdf
I may also consider this option in the implementation, but more general 
functions will require additional techniques.
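
In case it helps to make the plan concrete, here is a rough NumPy sketch of 
plain Frank-Wolfe for least squares over an l1 ball (my own illustrative code 
under assumed names, not mlpack’s API); over this set the conditional-gradient 
“argmin” step has a cheap closed form:

```python
import numpy as np

def frank_wolfe_l1(A, b, radius=1.0, iters=500):
    """Minimize 0.5 * ||A x - b||^2 over the l1 ball {x : ||x||_1 <= radius}.

    Over the l1 ball the conditional-gradient ("argmin") step has a closed
    form: the minimizing vertex is -radius * sign(grad_i) * e_i, where i is
    the coordinate with the largest |grad_i|.
    """
    n = A.shape[1]
    x = np.zeros(n)
    for t in range(iters):
        grad = A.T @ (A @ x - b)           # gradient of the objective
        i = np.argmax(np.abs(grad))        # conditional gradient: best
        s = np.zeros(n)                    # vertex of the l1 ball
        s[i] = -radius * np.sign(grad[i])
        gamma = 2.0 / (t + 2.0)            # classic step size, O(1/T) rate
        x = (1.0 - gamma) * x + gamma * s  # convex combination stays feasible
    return x
```

The 2/(t+2) step size above gives exactly the slow O(1/T) rate we discussed; 
the enhancements (line search, away steps, re-optimizing the coefficients) 
would replace the last two lines of the loop.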

Best regards,
Tyler Zhu (Rui)
University of Alberta

> On Mar 21, 2017, at 4:27 PM, Stephen Tu <[email protected]> wrote:
> 
> Essentially, I'm trying to say that a good candidate for this project should 
> be comfortable digging into the academic literature in order to address 
> issues that come up, and should enjoy working on more open-ended projects. It 
> really helps to have a specific set of problem domains in mind -- for 
> instance, the paper I linked to above specializes FW to the setting where 
> you are trying to do sparse signal recovery via atomic norm regularization 
> (applications here include line spectral estimation, system ID, etc.). 
> 
> On Tue, Mar 21, 2017 at 3:22 PM, Stephen Tu <[email protected]> wrote:
> Hi,
> 
> So the issue w/ the FW algorithm is that in practice, you need more than just 
> Jaggi's paper to make it converge fast (1/T rates are just too slow given the 
> computational cost of the conditional gradient step). For instance, see this 
> paper https://arxiv.org/pdf/1404.5692.pdf for some tricks where they try to 
> re-optimize the coefficients at every iteration. 
> 
> For the GSoC project I think the idea would be to implement FW, and then try 
> out some of these enhancements. I think this project is pretty open-ended in 
> this regard. 
> 
> On Tue, Mar 21, 2017 at 2:52 PM, Tyler Zhu <[email protected]> wrote:
> Hi Ryan,
> 
> Thanks for your reply. I think GSoC is popular not only at UAlberta but also 
> all around the world. I’m really excited to see friends who love coding and 
> machine learning from so many countries.
> 
> Thanks for providing the previous discussions on this idea. It seems that 
> this idea has been widely discussed, and I’ve seen many potential directions 
> around it. I agree with your point that the F-W algorithm is just the 
> foundation: low-rank matrix completion, OMP, and other applications can be 
> built on top of it. I also found some papers discussing parallel/distributed 
> implementations of the F-W algorithm. We can think about those, but 
> implementing F-W and its applications would be the first and essential step.
> 
> Best regards,
> Tyler Zhu (Rui)
> University of Alberta
> 
> > On Mar 21, 2017, at 3:33 PM, Ryan Curtin <[email protected]> wrote:
> >
> > On Mon, Mar 20, 2017 at 11:48:05PM -0600, Tyler Zhu wrote:
> >> Dear All,
> >>
> >> Nice to meet you guys here! My name is Rui, a PhD student at the
> >> University of Alberta. My research interests include distributed
> >> machine learning, low-rank matrix completion, and other exciting
> >> topics. I have been an mlpack user for a while in my research; I’m
> >> really fascinated by this great library, and I’m willing to make
> >> contributions to this great machine learning framework.
> >
> > Hi Rui!
> >
> > It seems GSoC is quite popular among University of Alberta students this
> > year. :)
> >
> >> I’m quite interested in “Low rank/sparse optimization using
> >> Frank-Wolfe”. Frank-Wolfe is an old but famous algorithm in
> >> optimization. It has the same convergence rate as (projected)
> >> gradient descent, which is a more popular method in optimization. When
> >> the projection is expensive to calculate, Frank-Wolfe is one of the
> >> best alternatives. Such examples can be found in various papers,
> >> e.g., low-rank matrix completion and orthogonal matching pursuit.
> >>
> >> I think this algorithm can be implemented as a new optimizer since it
> >> can be used in many applications. My preliminary plan is to borrow some
> >> ideas from existing implementations of other optimizers for this
> >> project, and the major work is to implement the “argmin” step, also
> >> called the “conditional gradient” in the literature.
> >
> > I think this is a good approach.  Keep in mind that just the
> > implementation of F-W might not be enough to fill a whole summer, so it
> > may be worthwhile to propose some additional work on top of that.  Maybe
> > some extensions of F-W or some applications might work.  There has been
> > quite some discussion on this project over the years on the mailing
> > list; you might check out some of those posts for more information:
> >
> > http://lists.mlpack.org/pipermail/mlpack/2017-March/003185.html
> > http://lists.mlpack.org/pipermail/mlpack/2016-March/002575.html
> >
> > Let me know if I can clarify anything.
> >
> > Thanks,
> >
> > Ryan
> >
> > --
> > Ryan Curtin    | "I am a golden god!"
> > [email protected] |   - Russell
> 
> _______________________________________________
> mlpack mailing list
> [email protected]
> http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
> 

