[ 
https://issues.apache.org/jira/browse/SPARK-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054470#comment-14054470
 ] 

Gang Bai commented on SPARK-2303:
---------------------------------

This change has been merged into another JIRA issue, SPARK-2311. Closing this one.

> Poisson regression model for count data
> ---------------------------------------
>
>                 Key: SPARK-2303
>                 URL: https://issues.apache.org/jira/browse/SPARK-2303
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Gang Bai
>
> Modeling count data is of great importance in solving real-world statistical 
> problems. Currently mllib.regression provides models mostly for numeric data, 
> i.e., fitting curves with various regularization on the resulting weights, but 
> it still lacks support for count data modeling.
> A very basic model for this is Poisson regression. Following the patterns 
> in mllib and reusing its components, we address parameter estimation for 
> Poisson regression by maximum likelihood. In detail, to add Poisson 
> regression to mllib.regression, we need to:
>  # Add the gradient of the negative log-likelihood into 
> mllib/optimization/Gradients.scala.
>  # Add the implementation of PoissonRegressionModel, which extends 
> GeneralizedLinearModel with RegressionModel. Here we need to implement the 
> predict method.
>  # Add the implementations of the generalized linear algorithm classes. Here 
> we can use either GradientDescent or LBFGS as the optimizer, so we implement 
> both, as class PoissonRegressionWithSGD and class PoissonRegressionWithLBFGS 
> respectively.
>  # Add the companion objects PoissonRegressionWithSGD and 
> PoissonRegressionWithLBFGS as drivers.
>  # Test suites
>  ## Test the gradient computation.
>  ## Test the regression method using generated data, which requires a 
> PoissonRegressionDataGenerator.
>  ## Test the regression method using a real-world data set.
>  # Add the documentation.
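
For reference, the gradient and predict steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the MLlib code: it assumes the usual log link (mean = exp(w . x)), and the names predict, poisson_gradient_and_loss and fit_gd are hypothetical, chosen for this example. For one example (x, y) the negative log-likelihood is exp(w . x) - y * (w . x) + log(y!), and its gradient with respect to w is (exp(w . x) - y) * x.

```python
import math


def predict(weights, intercept, features):
    """Predicted mean count under the assumed log link: exp(w . x + b)."""
    margin = sum(w * x for w, x in zip(weights, features)) + intercept
    return math.exp(margin)


def poisson_gradient_and_loss(weights, features, label):
    """Per-example negative log-likelihood and its gradient w.r.t. weights.

    loss = exp(w . x) - y * (w . x) + log(y!)
    grad = (exp(w . x) - y) * x
    """
    margin = sum(w * x for w, x in zip(weights, features))
    mean = math.exp(margin)
    # lgamma(y + 1) equals log(y!); it does not depend on the weights.
    loss = mean - label * margin + math.lgamma(label + 1.0)
    grad = [(mean - label) * x for x in features]
    return grad, loss


def fit_gd(data, num_features, step_size=0.1, num_iterations=500):
    """Full-batch gradient descent on the averaged negative log-likelihood.

    data is a list of (features, label) pairs. This is a stand-in for the
    optimizer step, not MLlib's GradientDescent or LBFGS.
    """
    weights = [0.0] * num_features
    n = len(data)
    for _ in range(num_iterations):
        total = [0.0] * num_features
        for features, label in data:
            grad, _ = poisson_gradient_and_loss(weights, features, label)
            for j, g in enumerate(grad):
                total[j] += g
        weights = [w - step_size * t / n for w, t in zip(weights, total)]
    return weights
```

A finite-difference check of poisson_gradient_and_loss against its own loss value is one simple way to cover the "test the gradient computation" item. With a single constant feature, the fitted weight should approach the log of the mean label.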



--
This message was sent by Atlassian JIRA
(v6.2#6252)
