On Wed, Nov 14, 2012 at 03:32:34PM +0100, federico vaggi wrote:
> I have been reading a few papers about using (penalized) linear regression to
> recover networks from noisy biological data, and I thought they would make a
> very useful addition to sklearn. In particular, there are a few really
> interesting techniques described in this paper:

> http://www.sciencedirect.com/science/article/pii/S0005109811001075

> 1)  The ability to specify ahead of time the expected sign of the 
> coefficients.

> 2)  The ability to tweak the coefficients recovered to obtain a stable matrix

Scikit-learn has a very rich set of penalized linear regression models,
because a bunch of the core developers use them heavily. Of course I am
all for adding more, but before adding them, we need to really understand
what they bring to the existing lot, and what their pros and cons are. As
a side note, I find that for this purpose an academic paper about a
method is just as useful as a TV commercial when it comes to choosing
which car to buy.

I cannot think of many major regression techniques that are not in
scikit-learn yet. All those that are missing require a fair amount of work
and understanding to merge in. I have only glanced through the paper, and I
note that a lot of it relies on semi-definite programming (not surprising
given that one of the authors is Stephen Boyd). From what I have seen,
getting these algorithms right requires a good deal of expertise. In
addition, the formulation described in the paper seems fairly specific to
a given setting. The general rule in scikit-learn is to wait a bit (a
couple of years) after a method has been published to see whether it picks
up massive adoption or not.

I guess that what I am saying is that I would be surprised, given the
amount of work that has gone into the scikit-learn linear models, if
there were more low-hanging fruit among the widely useful methods. I'd love
to be proven wrong :)

Thanks for your suggestions.

Gaël

By the way, if you want to constrain the sign of a coefficient, just flip
the sign of the corresponding column of X, impose positivity, and flip the
sign of the fitted coefficient back afterwards.
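
The column-flipping trick can be sketched roughly as follows. This is only
an illustration, not scikit-learn code: the helper names
nonneg_least_squares and sign_constrained_least_squares are made up for
the example, the data is synthetic, and the positivity-constrained fit is
done with a plain projected gradient loop in NumPy rather than with any
particular scikit-learn estimator.

```python
import numpy as np


def nonneg_least_squares(X, y, n_iter=500):
    """Minimise 0.5 * ||X w - y||^2 subject to w >= 0 (projected gradient).

    Hypothetical helper for this sketch; any solver that can impose
    positivity on the coefficients could be used instead.
    """
    # Step size 1 / L, where L = (largest singular value of X)^2 is the
    # Lipschitz constant of the gradient X^T (X w - y).
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        # Gradient step, then projection onto the non-negative orthant.
        w = np.maximum(w - step * grad, 0.0)
    return w


def sign_constrained_least_squares(X, y, signs):
    """Fit w with sign(w[j]) constrained to signs[j] (+1 or -1)."""
    signs = np.asarray(signs, dtype=float)
    # Flip the columns whose coefficient is expected to be negative.
    X_flipped = X * signs
    # Impose positivity on the flipped problem.
    w_pos = nonneg_least_squares(X_flipped, y)
    # Flip the coefficients back to the original parametrisation.
    return signs * w_pos


# Synthetic data, made up for the example.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
true_w = np.array([1.5, -2.0, 0.5, -1.0])
y = X @ true_w + 0.1 * rng.normal(size=200)

signs = np.array([1, -1, 1, -1])
w = sign_constrained_least_squares(X, y, signs)
print(w)
```

In scikit-learn itself, the positivity step could presumably be handled by
an estimator that accepts positive=True (Lasso does), fitted on the
flipped X, with the same sign flip applied to coef_ afterwards.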

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
