I just noticed this about the glm function in R:
http://stats.stackexchange.com/a/26779/53128

"
The glm function in R allows 3 ways to specify the formula for a
logistic regression model.

The most common is that each row of the data frame represents a single
observation and the response variable is either 0 or 1 (or a factor
with 2 levels, or other variable with only 2 unique values).

Another option is to use a 2 column matrix as the response variable
with the first column being the counts of 'successes' and the second
column being the counts of 'failures'.

You can also specify the response as a proportion between 0 and 1,
then specify another column as the 'weight' that gives the total
number that the proportion is from (so a response of 0.3 and a weight
of 10 is the same as 3 'successes' and 7 'failures')."
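
To make the equivalence in that last paragraph concrete, here is a small
sketch in plain Python (no R or scikit-learn involved) that converts
between the three layouts. The counts are made-up illustration data and
the variable names are my own, not anything from glm.

```python
# Hypothetical grouped data: (successes, failures) per row.
counts = [(3, 7), (5, 5), (0, 4)]

# Proportion-plus-weight form: the weight is the total number of trials.
prop_weight = [(s / (s + f), s + f) for s, f in counts]

# One-row-per-observation form: expand each group into 0/1 outcomes.
binary = [[1] * s + [0] * f for s, f in counts]

# Converting back from proportion and weight recovers the original counts.
recovered = [(round(p * w), w - round(p * w)) for p, w in prop_weight]
```

The first group, 3 'successes' and 7 'failures', becomes a proportion of
0.3 with a weight of 10, matching the example in the quote.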

Either of the last two options would do for me.  Does scikit-learn
support either of these last two options?
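
One workaround that might approximate the counts form, offered as a
sketch under my own assumptions rather than as a confirmed scikit-learn
feature: repeat every row twice, once labelled 1 and once labelled 0,
and pass the success and failure counts through the sample_weight
argument of LogisticRegression.fit. The data below is invented, and the
large C is only my attempt to make the default L2 penalty negligible so
the fit is closer to an unpenalised glm.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one row per group, with success/failure counts.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
successes = np.array([1, 3, 6, 9])
failures = np.array([9, 7, 4, 1])

# Stack every row twice: the first copy is labelled 1, the second 0.
X_rep = np.vstack([X, X])
y_rep = np.concatenate([np.ones(len(X), dtype=int),
                        np.zeros(len(X), dtype=int)])

# Weight each copy by the matching count.
w_rep = np.concatenate([successes, failures]).astype(float)

# Large C: assumption made here to weaken the default L2 regularisation.
model = LogisticRegression(C=1e6, max_iter=1000)
model.fit(X_rep, y_rep, sample_weight=w_rep)

# Predicted proportion of successes for each original row.
pred = model.predict_proba(X)[:, 1]
```

The estimator is still a classifier, so the fitted proportions come from
predict_proba rather than predict.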

Raphael

On 10 October 2016 at 11:55, Raphael C <drr...@gmail.com> wrote:
> I am trying to perform regression where my dependent variable is
> constrained to be between 0 and 1. This constraint comes from the fact
> that it represents a count proportion. That is, counts in some category
> divided by a total count.
>
> In the literature it seems that one common way to tackle this is to
> use logistic regression. However, it appears that in scikit-learn
> logistic regression is only available as a classifier
> (http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
> ) . Is that right?
>
> Is there another way to perform regression using scikit-learn where
> the dependent variable is a count proportion?
>
> Thanks for any help.
>
> Raphael
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
