Re: GLM Poisson Model - Deviance calculations
I see, this was handled for binomial deviance by the 'ylogy' method, which computes y log(y / mu) and defines it to be 0 when y = 0. It's not necessary to add a delta or anything; 0 is the limit of y log(y / mu) as y goes to 0, so it's fine. The same change is appropriate for Poisson deviance.

Gamma deviance looks like it also has this issue, but I suppose it isn't defined at 0 anyway. I don't know whether implementations still try to return something other than NaN here.

Anyway, I think it's fine to open a JIRA and PR to make that change.

On Wed, Apr 18, 2018 at 9:30 PM svattig wrote:

> Yes, I'm referring to that deviance method. It fails whenever y is 0. I
> think the R deviance calculation checks whether y is 0 and assigns 1 to y
> in such cases.
>
> The GLM regression summary object has a few deviances, such as null
> deviance, residual deviance and deviance. You might want to check those as
> well so that the toString method doesn't fail.
>
> Thank you,
> Srikar.V
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
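A minimal sketch of the change suggested in the reply above, assuming a standalone object rather than Spark's actual classes. The `ylogy` helper follows the description in that message; the names `PoissonDevianceSketch` and `poissonDeviance` are hypothetical and used only for illustration.

```scala
// Hypothetical standalone sketch; not the actual Spark source.
object PoissonDevianceSketch {
  // y * log(y / mu), defined as 0 when y == 0 (the limit as y goes to 0).
  def ylogy(y: Double, mu: Double): Double =
    if (y == 0.0) 0.0 else y * math.log(y / mu)

  // Poisson deviance term: 2 * weight * (y log(y / mu) - (y - mu)).
  def poissonDeviance(y: Double, mu: Double, weight: Double): Double =
    2.0 * weight * (ylogy(y, mu) - (y - mu))

  def main(args: Array[String]): Unit = {
    // With y = 0 the term reduces to 2 * weight * mu instead of NaN.
    println(poissonDeviance(0.0, 1.5, 1.0))
    // A non-zero count goes through the ordinary formula.
    println(poissonDeviance(3.0, 1.5, 1.0))
  }
}
```

No delta is added anywhere: the only special case is the y = 0 branch in `ylogy`, which matches the limiting value of the expression.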
Re: GLM Poisson Model - Deviance calculations
Yes, I'm referring to that deviance method. It fails whenever y is 0. I think the R deviance calculation checks whether y is 0 and assigns 1 to y in such cases.

The GLM regression summary object has a few deviances, such as null deviance, residual deviance and deviance. You might want to check those as well so that the toString method doesn't fail.

Thank you,
Srikar.V
Re: GLM Poisson Model - Deviance calculations
Are you referring to this?

    override def deviance(y: Double, mu: Double, weight: Double): Double = {
      2.0 * weight * (y * math.log(y / mu) - (y - mu))
    }

I'm not sure how R handles this, but my guess is that they may add a small number, e.g. 0.5, to the numerator and denominator. If you can confirm that's the issue, I will look into it.

On Wed, Apr 18, 2018 at 6:46 PM, Sean Owen wrote:

> GeneralizedLinearRegression.ylogy seems to handle this case; can you be
> more specific about where the log(0) happens? That's what should be fixed,
> right? If so, then a JIRA and PR are the right way to proceed.
>
> On Wed, Apr 18, 2018 at 2:37 PM svattig wrote:
>
>> In Spark 2.3, when a Poisson model is fit on data whose labelCol contains
>> some counts equal to 0, the deviance calculations are broken as a result
>> of log(0). I think the same is true in Spark 2.2. However, the new
>> toString method in Spark 2.3's GeneralizedLinearRegressionTrainingSummary
>> class throws a NumberFormatException at line 1551. Because of this
>> exception, we are not able to get the summary object from the model fit.
>>
>> Can the toString method be fixed, along with the deviance calculations,
>> for example by taking log(1) whenever the count is 0 instead of log(0)?
>>
>> Thanks,
>> Srikar.V
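To make the failure concrete, here is a small sketch of the unguarded formula from the message above evaluated at y = 0. It is a standalone illustration; `LogZeroSketch` and `naiveDeviance` are hypothetical names, and only the arithmetic is taken from the quoted method.

```scala
// Hypothetical standalone sketch; only the formula comes from the thread.
object LogZeroSketch {
  // Same expression as the quoted method, with no special case for y == 0.
  def naiveDeviance(y: Double, mu: Double, weight: Double): Double =
    2.0 * weight * (y * math.log(y / mu) - (y - mu))

  def main(args: Array[String]): Unit = {
    // log(0) is negative infinity in IEEE 754 arithmetic.
    println(math.log(0.0).isNegInfinity)
    // 0 * (-infinity) is NaN, and NaN propagates through the rest of the sum.
    println((0.0 * math.log(0.0)).isNaN)
    println(naiveDeviance(0.0, 1.5, 1.0).isNaN)
  }
}
```

Because a single NaN term poisons any sum it enters, one zero count is enough to make an aggregated deviance NaN.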
Re: GLM Poisson Model - Deviance calculations
GeneralizedLinearRegression.ylogy seems to handle this case; can you be more specific about where the log(0) happens? That's what should be fixed, right? If so, then a JIRA and PR are the right way to proceed.

On Wed, Apr 18, 2018 at 2:37 PM svattig wrote:

> In Spark 2.3, when a Poisson model is fit on data whose labelCol contains
> some counts equal to 0, the deviance calculations are broken as a result
> of log(0). I think the same is true in Spark 2.2. However, the new
> toString method in Spark 2.3's GeneralizedLinearRegressionTrainingSummary
> class throws a NumberFormatException at line 1551. Because of this
> exception, we are not able to get the summary object from the model fit.
>
> Can the toString method be fixed, along with the deviance calculations,
> for example by taking log(1) whenever the count is 0 instead of log(0)?
>
> Thanks,
> Srikar.V
GLM Poisson Model - Deviance calculations
In Spark 2.3, when a Poisson model is fit on data whose labelCol contains some counts equal to 0, the deviance calculations are broken as a result of log(0). I think the same is true in Spark 2.2. However, the new toString method in Spark 2.3's GeneralizedLinearRegressionTrainingSummary class throws a NumberFormatException at line 1551. Because of this exception, we are not able to get the summary object from the model fit.

Can the toString method be fixed, along with the deviance calculations, for example by taking log(1) whenever the count is 0 instead of log(0)?

Thanks,
Srikar.V
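The report does not say which call inside toString raises the exception. One plausible mechanism, assumed here purely for illustration, is that a NaN deviance is passed to a decimal-rounding step: the `java.math.BigDecimal(double)` constructor throws NumberFormatException for NaN and infinite values. `NaNFormattingSketch` and `round4` are hypothetical names, not Spark code.

```scala
import scala.util.Try

// Hypothetical sketch of how a NaN could surface as a NumberFormatException.
object NaNFormattingSketch {
  // Assumed rounding helper: formats a double to four decimal places.
  // Wrapped in Try so the failure can be inspected instead of thrown.
  def round4(x: Double): Try[String] =
    Try(new java.math.BigDecimal(x)
      .setScale(4, java.math.RoundingMode.HALF_UP)
      .toString)

  def main(args: Array[String]): Unit = {
    val nanDeviance = 0.0 * math.log(0.0) // NaN produced by the log(0) term
    // Fails: BigDecimal cannot represent NaN.
    println(round4(nanDeviance).isFailure)
    // A finite value formats normally.
    println(round4(3.0).isSuccess)
  }
}
```

If this is the mechanism, fixing the deviance calculation so it never yields NaN would also stop the formatting step from failing, though a guard in the formatting code would still be a reasonable defensive measure.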