Hey,
sorry, I did not understand. Are you asking me to change the code in order
to add these features?
I can probably rewrite the randomized l1 class to include them; only a few
changes are needed. On the other hand, I do not have much experience with
GitHub, so I am not sure how to do this.
Let me know.
Luca


>

> > Hi,
> > I know that LARS is usually faster.
> > On the other hand, CD is often considered more robust. In particular, in
> > the p >> n situation, LARS cannot include more than n variables in the
> > model.
> >
>
> Which doesn't mean that this is not a lasso solution - there always exists
> a lasso solution with a support J such that X_J is injective, i.e. |J| \leq
> n. On the other extreme, one can choose the minimum l_2 norm solution
> (minimizing exactly the same functional), which maximizes the support. This
> can also be done in homotopy algorithms such as LarsLasso, but happens to
> not be implemented in scikit-learn. Any convex combination of the two is
> also a solution, and there may be many others. CD may find a different one,
> but it would be neither better nor worse than the mentioned options as far
> as training error is concerned. For prediction, including as many
> correlated variables as possible may yield more stability.
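The point about equal training error can be checked directly. A minimal
sketch (the data, alpha, and solver settings below are arbitrary choices for
illustration; fit_intercept is disabled so both estimators minimize exactly
the same functional):

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoLars

rng = np.random.RandomState(0)
n, p = 20, 100  # p >> n regime from the discussion
X = rng.randn(n, p)
y = rng.randn(n)
alpha = 0.1  # arbitrary regularization strength

# Both estimators minimize (1 / (2n)) ||y - Xw||^2 + alpha ||w||_1
lars = LassoLars(alpha=alpha, fit_intercept=False).fit(X, y)
cd = Lasso(alpha=alpha, fit_intercept=False,
           max_iter=10000, tol=1e-8).fit(X, y)

def objective(w):
    # The common lasso functional both solvers optimize
    return 0.5 / n * np.sum((y - X @ w) ** 2) + alpha * np.abs(w).sum()

# Objective values agree to solver tolerance; the supports may differ,
# and the LARS support has at most n variables.
print(objective(lars.coef_), objective(cd.coef_))
print(int((lars.coef_ != 0).sum()), int((cd.coef_ != 0).sum()))
```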
>
>
>
> > I think the best thing to do would be to include the possibility to
> > choose which algorithm to use and leave Lars as the default choice.
> > I think the option to use the lasso penalty path as the penalty path
> > should also be included. This would be closer to the original paper.
> >
>
> Would you be able to add this functionality? Alex's use case was rather
> specific, and there may be other cases where it is indeed useful to have CD
> as a possibility. The most helpful thing in assessing this would be a
> benchmark showing the differences.
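As a rough starting point for such a benchmark, something along these lines
might help (the problem sizes and alpha below are arbitrary assumptions, not
values from this thread):

```python
import time

import numpy as np
from sklearn.linear_model import Lasso, LassoLars

rng = np.random.RandomState(42)
n, p = 100, 2000  # p >> n, as in the stability selection setting
X = rng.randn(n, p)
y = X[:, :10] @ rng.randn(10) + 0.1 * rng.randn(n)  # sparse ground truth

results = {}
for name, est in [("LarsLasso", LassoLars(alpha=0.05, fit_intercept=False)),
                  ("CD", Lasso(alpha=0.05, fit_intercept=False,
                               max_iter=5000))]:
    t0 = time.perf_counter()
    est.fit(X, y)
    # Record wall-clock fit time and the size of the selected support
    results[name] = (time.perf_counter() - t0, int((est.coef_ != 0).sum()))

for name, (t, nnz) in results.items():
    print("%s: %.4f s, %d non-zero coefficients" % (name, t, nnz))
```

A real benchmark would of course sweep over n, p, correlation structure,
and the regularization path rather than a single alpha.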
>
>
> > I have seen that the current option of using 'aic' or 'bic' to choose
> > alpha does not work well in some situations.
> > Hope this could help,
> > Luca
> >
> >
> >>
> >> > I was wondering if there is any reason why the randomized l1
> >> > algorithm from the stability selection paper is implemented only
> >> > using Lars Lasso and not the coordinate descent algorithm.
> >> > I think that including a version of the algorithm with the
> >> > coordinate descent method would be very useful.
> >>
> >> Because in our use case LARS was always faster, there was no point in
> >> supporting both.
> >>
> >> Best,
> >> Alex
> >>
> >
>
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
