Hi George,
Completely agreed that np.unique on continuous targets is messy - I have
run into the same problem.
If I remember correctly, you can work around this by using sample_weight to
inject the continuous target into the cross entropy loss:
If p_i are the targets, then duplicate each sample,
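
The message is cut off here, so the rest of the construction is an assumption on my part and not necessarily what was meant: one common way to finish it is to give the copy labelled 1 the weight p_i and the copy labelled 0 the weight 1 - p_i. The sketch below (numpy only, with made-up toy data and hypothetical helper names) checks that the weighted cross entropy on the duplicated data equals the cross entropy against the continuous targets.

```python
import numpy as np

def duplicate_with_weights(X, p):
    """Expand (X, p) into a binary-labelled, weighted dataset.

    Each row appears twice: once with label 1 and weight p_i,
    once with label 0 and weight 1 - p_i (assumed scheme).
    """
    X = np.asarray(X, dtype=float)
    p = np.asarray(p, dtype=float)
    n = len(p)
    X_dup = np.vstack([X, X])
    y_dup = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])
    w_dup = np.concatenate([p, 1.0 - p])
    return X_dup, y_dup, w_dup

def weighted_log_loss(y, q, w):
    """Weighted binary cross entropy for predicted probabilities q."""
    return -np.sum(w * (y * np.log(q) + (1 - y) * np.log(1 - q)))

# Toy data, invented purely for illustration.
X = np.array([[0.0], [1.0], [2.0]])
p = np.array([0.1, 0.5, 0.9])   # continuous targets in [0, 1]
q = np.array([0.2, 0.4, 0.7])   # some model's predicted probabilities

X_dup, y_dup, w_dup = duplicate_with_weights(X, p)
q_dup = np.concatenate([q, q])  # both copies of a row get the same prediction

# Cross entropy against the soft targets vs. weighted loss on the copies.
soft_loss = -np.sum(p * np.log(q) + (1 - p) * np.log(1 - q))
dup_loss = weighted_log_loss(y_dup, q_dup, w_dup)
print(np.isclose(soft_loss, dup_loss))
```

With scikit-learn, the three arrays would then go into something like LogisticRegression().fit(X_dup, y_dup, sample_weight=w_dup), since the labels are now binary.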
On Sat, Oct 3, 2015 at 11:54 PM, George Bezerra wrote:
Thanks a lot Josef. I guess it is possible to do what I wanted, though
maybe not in scikit. Does the statsmodels version allow l1 or l2
regularization? I'm planning to use a lot of features and let the model
decide what is good.
Thanks again.
On Sat, Oct 3, 2015 at 11:20 PM, wrote:
Just to come in here as an econometrician and statsmodels maintainer.
statsmodels intentionally doesn't enforce binary data for Logit or similar
models; any data between 0 and 1 is fine.
Logistic Regression/Logit or similar Binomial/Bernoulli models can
consistently estimate the expected value (p
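
To illustrate the point about non-binary data, here is a minimal numpy sketch, not statsmodels' own implementation: it maximizes the Bernoulli log-likelihood with Newton's method, and nothing in it requires y to be 0 or 1. The function name, the simulated data, and the 50 trials per row are all assumptions made up for the example.

```python
import numpy as np

def fit_fractional_logit(x, y, n_iter=25):
    """Newton's method on the Bernoulli log-likelihood.

    y may hold any values in [0, 1]; an intercept column is added here.
    """
    X = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))            # fitted means
        grad = X.T @ (y - mu)                           # score vector
        hess = X.T @ (X * (mu * (1.0 - mu))[:, None])   # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data: the target is an observed rate, not a 0/1 label.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
true_beta = np.array([-0.5, 1.2])
mean = 1.0 / (1.0 + np.exp(-(true_beta[0] + true_beta[1] * x)))
y = rng.binomial(50, mean) / 50.0   # fractions from 50 hypothetical trials

beta_hat = fit_fractional_logit(x, y)
print(beta_hat)  # should land near true_beta
```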
*I meant section 5.
On Sat, Oct 3, 2015 at 11:07 PM, George Bezerra wrote:
Thanks Sebastian.
I am trying to follow this paper:
http://research.microsoft.com/en-us/um/people/mattri/papers/www2007/predictingclicks.pdf
(check out section 6.2). They use logistic regression as a regression model
to predict the click through rate (which is continuous).
A linear regression mod
Hi, George,
logistic regression is a binary classifier by nature (class labels 0 and 1).
Scikit-learn supports multi-class classification via One-vs-One or One-vs-All,
though, and there is a generalization (softmax) that gives you meaningful
probabilities for multiple classes (i.e., class probabi
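
For reference, the softmax generalization mentioned above can be sketched in a few lines of numpy. The scores below are arbitrary illustrative numbers, not the output of any particular model.

```python
import numpy as np

def softmax(scores):
    """Map each row of real-valued class scores to probabilities summing to 1."""
    scores = np.asarray(scores, dtype=float)
    # Subtract the row maximum for numerical stability; the result is unchanged.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(scores)
print(probs.sum(axis=1))  # each row sums to 1
```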
Hi there,
I would like to train a logistic regression model on a continuous (i.e.,
not categorical) target variable. The target is a probability, which is why
I am using a logistic regression for this problem. However, the sklearn
function tries to find the class labels by running a unique() on th
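
A small numpy illustration of why a unique() call is a problem for a continuous target: with real-valued probabilities, nearly every value is distinct, so each one would be treated as its own class label. The data is randomly generated for the example; how the fitting code uses the result internally is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
y_continuous = rng.uniform(0.0, 1.0, size=100)   # e.g. probabilities or rates
y_binary = (y_continuous > 0.5).astype(int)      # thresholded 0/1 labels

classes_continuous = np.unique(y_continuous)
classes_binary = np.unique(y_binary)

# The continuous target yields far more "classes" than the binary one.
print(len(classes_continuous), len(classes_binary))
```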