I see, this was handled for binomial deviance by the 'ylogy' method, which
computes y log (y / mu), defining this to be 0 when y = 0. It's not
necessary to add a delta or anything; 0 is the limit as y goes to 0 so it's
fine.
The same change is appropriate for Poisson deviance. Gamma deviance
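For reference, a minimal sketch of the 'ylogy' guard described above (my own standalone version for illustration; Spark's actual helper may differ in details):

```scala
// Sketch of the ylogy guard: returns y * log(y / mu), defined to be 0 when y == 0.
// Without the guard, y = 0 gives 0 * log(0) = 0 * -Infinity = NaN in IEEE arithmetic.
def ylogy(y: Double, mu: Double): Double = {
  if (y == 0.0) 0.0 else y * math.log(y / mu)
}
```

No delta is needed: 0 is the exact limit of y * log(y / mu) as y goes to 0, so the guarded value is correct, not an approximation.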
Yes, I'm referring to that deviance method. It fails whenever y is 0. I think
R's deviance calculation logic checks if y is 0 and assigns 1 to y in such
cases.
There are a few deviances, like null deviance, residual deviance, and
deviance, that the GLM regression summary object has.
You might want to check
Are you referring to this?
override def deviance(y: Double, mu: Double, weight: Double): Double = {
  2.0 * weight * (y * math.log(y / mu) - (y - mu))
}
Not sure how R handles this, but my guess is they may add a small
number, e.g. 0.5, to the numerator and denominator. If you can
GeneralizedLinearRegression.ylogy seems to handle this case; can you be
more specific about where the log(0) happens? That's what should be fixed,
right? If so, then a JIRA and PR are the right way to proceed.
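If the guarded helper were routed into the Poisson deviance, the fix would look roughly like this (a sketch under that assumption, not Spark's actual patch):

```scala
// Guarded helper: y * log(y / mu), defined to be 0 when y == 0.
def ylogy(y: Double, mu: Double): Double =
  if (y == 0.0) 0.0 else y * math.log(y / mu)

// Poisson deviance rewritten in terms of ylogy, so y = 0 yields the
// finite limit 2 * weight * mu instead of NaN.
def deviance(y: Double, mu: Double, weight: Double): Double =
  2.0 * weight * (ylogy(y, mu) - (y - mu))
```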
On Wed, Apr 18, 2018 at 2:37 PM svattig wrote:
In Spark 2.3, when a Poisson model (with labelCol having a few counts that are 0) is
fit, the deviance calculations are broken as a result of log(0). I think this
is the same case as in Spark 2.2.
But the new toString method in Spark 2.3's
GeneralizedLinearRegressionTrainingSummary class is throwing an error