Re: [Scikit-learn-general] Using logistic regression on a continuous target variable

2015-10-03 Thread Michael Eickenberg
Hi George, completely agreed that np.unique on continuous targets is messy - I have run into the same problem. If I remember correctly, you can work around this by using sample_weight to inject the continuous target into the cross-entropy loss: if p_i are the targets, then duplicate each sample,
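The message above is cut off in the archive, so the following is only a sketch of one common reading of the trick, not necessarily what Michael went on to write. It assumes each sample is duplicated into one copy labeled 1 with weight p_i and one copy labeled 0 with weight 1 - p_i. The helper names (expand_soft_targets, soft_cross_entropy, weighted_hard_cross_entropy) and the toy numbers are made up for illustration. The check at the end shows that, under this assumption, the weighted loss on hard labels equals the cross-entropy against the continuous targets.

```python
import math


def expand_soft_targets(X, p):
    """Duplicate each sample (assumed scheme, not from the thread):
    one copy labeled 1 with weight p_i, one labeled 0 with weight 1 - p_i."""
    X2, y2, w2 = [], [], []
    for x_i, p_i in zip(X, p):
        X2.extend([x_i, x_i])
        y2.extend([1, 0])
        w2.extend([p_i, 1.0 - p_i])
    return X2, y2, w2


def soft_cross_entropy(q, p):
    """Cross-entropy of predicted probabilities q against continuous targets p."""
    return -sum(
        p_i * math.log(q_i) + (1.0 - p_i) * math.log(1.0 - q_i)
        for q_i, p_i in zip(q, p)
    )


def weighted_hard_cross_entropy(q, y, w):
    """Sample-weighted cross-entropy against hard 0/1 labels."""
    return -sum(
        w_i * (math.log(q_i) if y_i == 1 else math.log(1.0 - q_i))
        for q_i, y_i, w_i in zip(q, y, w)
    )


# Toy data: three samples with continuous targets in (0, 1).
X = [[0.2], [1.5], [3.0]]
p = [0.1, 0.55, 0.9]
q = [0.3, 0.5, 0.8]  # arbitrary model predictions, one per original sample

X2, y2, w2 = expand_soft_targets(X, p)
q2 = [q_i for q_i in q for _ in range(2)]  # each duplicate gets the same prediction

loss_soft = soft_cross_entropy(q, p)
loss_weighted = weighted_hard_cross_entropy(q2, y2, w2)
```

Because the two losses coincide for any predictions q, fitting a classifier on the expanded data, for example LogisticRegression().fit(X2, y2, sample_weight=w2) in scikit-learn, should optimize the same data term as fitting against the continuous targets directly (any regularization term is unaffected by the duplication).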

Re: [Scikit-learn-general] Using logistic regression on a continuous target variable

2015-10-03 Thread josef.pktd
On Sat, Oct 3, 2015 at 11:54 PM, George Bezerra wrote:
> Thanks a lot Josef. I guess it is possible to do what I wanted, though
> maybe not in scikit. Does the statsmodels version allow l1 or l2
> regularization? I'm planning to use a lot of features and let the model
> decide what is good.
>
> s

Re: [Scikit-learn-general] Using logistic regression on a continuous target variable

2015-10-03 Thread George Bezerra
Thanks a lot Josef. I guess it is possible to do what I wanted, though maybe not in scikit. Does the statsmodels version allow l1 or l2 regularization? I'm planning to use a lot of features and let the model decide what is good.

Thanks again.

On Sat, Oct 3, 2015 at 11:20 PM, wrote:
> Just to co

Re: [Scikit-learn-general] Using logistic regression on a continuous target variable

2015-10-03 Thread josef.pktd
Just to come in here as an econometrician and statsmodels maintainer: statsmodels intentionally doesn't enforce binary data for Logit or similar models; any data between 0 and 1 is fine. Logistic Regression/Logit or similar Binomial/Bernoulli models can consistently estimate the expected value (p
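As a small illustration of why fractional targets are unproblematic for the Bernoulli log-likelihood, here is a minimal pure-Python sketch; it is not statsmodels code. It fits an intercept-only logit by gradient descent on targets between 0 and 1. The function name fit_intercept_only_logit, the learning rate, the iteration count, and the toy data are all assumptions chosen for the example. In this simplest case the fitted probability should reproduce the sample mean of the targets, which is the expected-value behavior described above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_intercept_only_logit(y, lr=0.5, n_iter=2000):
    """Minimize the mean Bernoulli negative log-likelihood over a single
    intercept b. The targets y may be any values in [0, 1]."""
    b = 0.0
    n = len(y)
    for _ in range(n_iter):
        # Gradient of the mean negative log-likelihood with respect to b.
        grad = sum(sigmoid(b) - y_i for y_i in y) / n
        b -= lr * grad
    return b


# Toy click-through-rate style targets (made-up numbers).
y = [0.02, 0.10, 0.35, 0.05, 0.20]

b_hat = fit_intercept_only_logit(y)
fitted_mean = sigmoid(b_hat)
sample_mean = sum(y) / len(y)
```

With covariates the same likelihood applies; only the linear predictor changes from a single intercept to an intercept plus weighted features.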

Re: [Scikit-learn-general] Using logistic regression on a continuous target variable

2015-10-03 Thread George Bezerra
*I meant section 5.

On Sat, Oct 3, 2015 at 11:07 PM, George Bezerra wrote:
> Thanks Sebastian.
>
> I am trying to follow this paper:
> http://research.microsoft.com/en-us/um/people/mattri/papers/www2007/predictingclicks.pdf
> (check out section 6.2). They use logistic regression as a regression

Re: [Scikit-learn-general] Using logistic regression on a continuous target variable

2015-10-03 Thread George Bezerra
Thanks Sebastian. I am trying to follow this paper: http://research.microsoft.com/en-us/um/people/mattri/papers/www2007/predictingclicks.pdf (check out section 6.2). They use logistic regression as a regression model to predict the click-through rate (which is continuous). A linear regression mod

Re: [Scikit-learn-general] Using logistic regression on a continuous target variable

2015-10-03 Thread Sebastian Raschka
Hi, George, logistic regression is a binary classifier by nature (class labels 0 and 1). Scikit-learn supports multi-class classification via One-vs-One or One-vs-All though; and there is a generalization (softmax) that gives you meaningful probabilities for multiple classes (i.e., class probabi

[Scikit-learn-general] Using logistic regression on a continuous target variable

2015-10-03 Thread George Bezerra
Hi there, I would like to train a logistic regression model on a continuous (i.e., not categorical) target variable. The target is a probability, which is why I am using a logistic regression for this problem. However, the sklearn function tries to find the class labels by running a unique() on th