How is the default grid of alphas and l1_ratios chosen in
scikit-learn's ElasticNetCV, and what is the reasoning behind it? What
other approaches exist for choosing such a parameter grid, and what are
they based on?
I'm using elastic net to calculate regularized canonical correlation.
Given data matrices X and Y, I find coefficient vectors a and b that
maximize the correlation between Xa and Yb. This can be done by
iteratively regressing X on Yb (to estimate a) and then Y on Xa (to
estimate b), and repeating these two regressions until convergence.
This iterative approach means that I have to do model selection one
level up from the individual regressions (i.e. I can't use ElasticNetCV
or the like directly). I know I can choose among candidate grids by
cross-validation or permutation testing, but I am unsure how to
intelligently choose the sets of alpha and l1_ratio values to try. And
since the parameters can differ between the two regressions, the number
of parameter combinations to try is squared, so I need to choose the
grid carefully.
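For concreteness, here is a minimal sketch of the alternating-regression scheme described above, with scikit-learn's ElasticNet as the inner solver. The function name `cca_enet`, the random initialization of b, and the unit-norm rescaling of each update are my own choices, not part of any library; the convergence check is a simple change-in-coefficients tolerance.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def cca_enet(X, Y, alpha_a, l1_ratio_a, alpha_b, l1_ratio_b,
             max_iter=100, tol=1e-4, seed=0):
    """Sketch of regularized CCA by alternating elastic-net regressions.

    Hypothetical helper (not a scikit-learn API): finds coefficient
    vectors a, b such that corr(Xa, Yb) is locally maximized under
    per-side elastic-net penalties.
    """
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(Y.shape[1])
    b /= np.linalg.norm(b)
    a = np.zeros(X.shape[1])
    for _ in range(max_iter):
        # Regress X on the current canonical variate Yb to update a.
        a_new = ElasticNet(alpha=alpha_a, l1_ratio=l1_ratio_a,
                           fit_intercept=False).fit(X, Y @ b).coef_
        if np.linalg.norm(a_new) > 0:
            a_new = a_new / np.linalg.norm(a_new)
        # Regress Y on the updated variate Xa to update b.
        b_new = ElasticNet(alpha=alpha_b, l1_ratio=l1_ratio_b,
                           fit_intercept=False).fit(Y, X @ a_new).coef_
        if np.linalg.norm(b_new) > 0:
            b_new = b_new / np.linalg.norm(b_new)
        converged = (np.linalg.norm(a_new - a) < tol and
                     np.linalg.norm(b_new - b) < tol)
        a, b = a_new, b_new
        if converged:
            break
    return a, b
```

The model selection then wraps this whole function: for each candidate (alpha_a, l1_ratio_a, alpha_b, l1_ratio_b) you run the alternation to convergence and score corr(Xa, Yb) on held-out data, which is exactly why the size of the grid matters so much here.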
Some ideas I've had:
* Perhaps the ratio of samples to features can rule out certain
  regularization parameter values, i.e. if there are many more features
  than samples, too weak a regularization would be inappropriate (the
  unregularized problem is ill-posed). Has this been formalized
  mathematically? Wouldn't it depend on how strong the signal is, too?
* If the solution with a particular regularization strength is a
vector of zeros (i.e. the regularization was too strong), then I can
discard all stronger regularization parameters. This is obvious with
only an L1 penalty; if alpha=0.1 is too strong, then alpha=0.5 will
definitely also be too strong. I wonder about this in the case of
elastic net. That is, if (alpha=0.1, l1_ratio=0.5) is too strong,
does that mean (alpha=0.1, l1_ratio=0.9) will necessarily be too strong?
* And perhaps I could start with a coarse grid and then try again with
more detail in a promising section of it. Any ideas on the best way
of doing this?
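The second and third ideas can be made concrete via the subgradient condition for the elastic net: the all-zero solution is optimal exactly when alpha * l1_ratio >= max|X^T y| / n_samples. If I have that right, it answers the second bullet in the affirmative (if (alpha=0.1, l1_ratio=0.5) zeroes everything, so does (alpha=0.1, l1_ratio=0.9), since the product alpha * l1_ratio only grows), and it gives a natural anchor for the grid: scikit-learn's default grid is, as far as I know, log-spaced downward from this alpha_max. A sketch, assuming centered X and y; the helper names `alpha_grid` and `refine` are mine, not scikit-learn's:

```python
import numpy as np

def alpha_grid(X, y, l1_ratio, n_alphas=20, eps=1e-3):
    # Smallest alpha at which the elastic-net solution is exactly zero
    # (assumes X and y are centered); stronger alphas can be discarded.
    n_samples = X.shape[0]
    alpha_max = np.max(np.abs(X.T @ y)) / (n_samples * l1_ratio)
    # Log-spaced grid from alpha_max down to eps * alpha_max.
    return np.logspace(np.log10(alpha_max), np.log10(eps * alpha_max),
                       n_alphas)

def refine(alphas, best_idx, n_alphas=10):
    # Coarse-to-fine: zoom into the interval bracketing the best coarse
    # alpha (alphas are in decreasing order).
    hi = alphas[max(best_idx - 1, 0)]
    lo = alphas[min(best_idx + 1, len(alphas) - 1)]
    return np.logspace(np.log10(hi), np.log10(lo), n_alphas)
```

In the iterative CCA setting, the response vectors Yb and Xa change every outer iteration, so alpha_max would have to be recomputed per side from the current canonical variate; using the initial variate to set the grid once is a pragmatic approximation.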
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general