On Wed, Nov 14, 2012 at 03:32:34PM +0100, federico vaggi wrote:
> I have been reading a few papers about using (penalized) linear regression to
> recover networks from noisy biological data, and I thought they would make a
> very useful addition to sklearn. In particular, there's a few really
> interesting techniques described in this paper:
> http://www.sciencedirect.com/science/article/pii/S0005109811001075
> 1) The ability to specify ahead of time the expected sign of the
> coefficients.
> 2) The ability to tweak the coefficients recovered to obtain a stable matrix

scikit-learn has a very rich set of penalized linear regression models,
because a bunch of the core developers use them heavily. Of course I am all
for adding more, but before adding them, we need to really understand what
they bring to the existing lot, and what their pros and cons are. As a side
note, I find that for this purpose an academic paper about a method is just
as useful as a TV commercial when it comes to choosing which car to buy.

I cannot think of many major regression techniques that are not in the
scikit yet. All those that are missing require a fair amount of work and
understanding to merge in.

I have only glanced through the paper, and I note that a lot of it relies on
semi-definite programming (not surprising given that one of the authors is
Stephen Boyd). From what I have seen, getting these algorithms right requires
a good deal of expertise. In addition, the formulation described in the paper
seems fairly specific to a given setting. The general rule in scikit-learn
is to wait a bit (a couple of years) after a method has been published to see
whether it picks up massive adoption or not.

I guess that what I am saying is that I would be surprised, given the amount
of work that has gone into the scikit-learn linear models, if there were more
low-hanging fruit among the widely useful methods. I'd love to be proven
wrong :)

Thanks for your suggestions.

Gaël

By the way, if you want to constrain the sign of a coefficient, just flip the
sign of the corresponding column of X and impose positivity.
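A minimal sketch of the column-flipping trick, for illustration only. It
assumes the `positive=True` option of `sklearn.linear_model.Lasso` as the way
to impose positivity; the helper name `fit_with_signs`, the `alpha` value, and
the synthetic data are made up for this example and are not taken from the
paper or from the thread.

```python
import numpy as np
from sklearn.linear_model import Lasso


def fit_with_signs(X, y, signs, alpha=0.01):
    """Fit a Lasso whose j-th coefficient is forced to have sign signs[j].

    `signs` holds +1 or -1 per feature (hypothetical helper, not a
    scikit-learn API).
    """
    signs = np.asarray(signs, dtype=float)
    # Flip the columns whose coefficient should be negative.
    X_flipped = X * signs
    # Impose positivity on the coefficients of the flipped problem.
    model = Lasso(alpha=alpha, positive=True, fit_intercept=False)
    model.fit(X_flipped, y)
    # Undo the flip to get coefficients for the original X.
    return model.coef_ * signs


# Synthetic data, made up for the example.
rng = np.random.RandomState(0)
X = rng.randn(200, 4)
true_coef = np.array([1.5, -2.0, 0.5, -1.0])
y = X @ true_coef + 0.1 * rng.randn(200)

signs = [1, -1, 1, -1]
coef = fit_with_signs(X, y, signs)
print(coef)
```

A coefficient whose requested sign disagrees with the data should simply be
driven to zero by the positivity constraint, rather than take the other sign.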
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
