Re: GLM Poisson Model - Deviance calculations

2018-04-19 Thread Sean Owen
I see, this was handled for binomial deviance by the 'ylogy' method, which
computes y log (y / mu), defining this to be 0 when y = 0. It's not
necessary to add a delta or anything; 0 is the limit as y goes to 0 so it's
fine.

The same change is appropriate for Poisson deviance. Gamma deviance looks
like it also has this issue, but I suppose it isn't defined at 0 anyway. I
don't know whether implementations still try to return something other than
NaN in that case.

Anyway, I think it's fine to open a JIRA and PR to make that change.
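
For reference, a minimal sketch of what that change could look like. The
ylogy helper below is written from the description above (y log(y / mu),
defined as 0 when y = 0), and the object name PoissonDevianceSketch is
hypothetical; this is an illustration, not the actual Spark source.

```scala
object PoissonDevianceSketch {
  // y * log(y / mu), defined as 0 at y = 0 (the limit as y goes to 0).
  def ylogy(y: Double, mu: Double): Double =
    if (y == 0.0) 0.0 else y * math.log(y / mu)

  // Poisson unit deviance with the y = 0 case delegated to ylogy.
  def deviance(y: Double, mu: Double, weight: Double): Double =
    2.0 * weight * (ylogy(y, mu) - (y - mu))
}
```

With this guard, deviance(0.0, 1.5, 1.0) evaluates to 2 * mu = 3.0 instead
of NaN, and no delta needs to be added to y or mu.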

On Wed, Apr 18, 2018 at 9:30 PM svattig  wrote:

> Yes, I’m referring to that deviance method. It fails whenever y is 0. I
> think R's deviance calculation checks whether y is 0 and assigns 1 to y in
> such cases.
>
> The GLM regression summary object has a few deviances, such as null
> deviance, residual deviance, and deviance.
> You might want to check those as well so that the toString method doesn’t
> fail.
>
> Thank you,
> Srikar.V
>
>
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: GLM Poisson Model - Deviance calculations

2018-04-18 Thread svattig
Yes, I’m referring to that deviance method. It fails whenever y is 0. I think
R's deviance calculation checks whether y is 0 and assigns 1 to y in such
cases.

The GLM regression summary object has a few deviances, such as null
deviance, residual deviance, and deviance.
You might want to check those as well so that the toString method doesn’t
fail.

Thank you,
Srikar.V






Re: GLM Poisson Model - Deviance calculations

2018-04-18 Thread Joseph PENG
Are you referring to this?

    override def deviance(y: Double, mu: Double, weight: Double): Double = {
      2.0 * weight * (y * math.log(y / mu) - (y - mu))
    }

I'm not sure how R handles this, but my guess is that it may add a small
number, e.g. 0.5, to the numerator and denominator. If you can confirm
that's the issue, I will look into it.
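
To make the failure concrete, here is a small sketch that evaluates the
expression from the deviance method quoted above at y = 0. The object name
and the sample values are made up for illustration; only the formula comes
from the snippet.

```scala
object NaiveDevianceDemo {
  // Same expression as the quoted deviance method, with no guard for y = 0.
  def naiveDeviance(y: Double, mu: Double, weight: Double): Double =
    2.0 * weight * (y * math.log(y / mu) - (y - mu))

  def main(args: Array[String]): Unit = {
    // math.log(0.0) is -Infinity, and 0.0 * -Infinity is NaN.
    val atZero = naiveDeviance(0.0, 1.5, 1.0)
    println(atZero.isNaN)

    // A single NaN term makes the summed deviance NaN as well.
    val observations = Seq((0.0, 1.5), (2.0, 1.5))
    val total = observations.map { case (y, mu) => naiveDeviance(y, mu, 1.0) }.sum
    println(total.isNaN)
  }
}
```

Both checks print true, which suggests that a single zero count is enough
to turn the total deviance into NaN.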

On Wed, Apr 18, 2018 at 6:46 PM, Sean Owen  wrote:

> GeneralizedLinearRegression.ylogy seems to handle this case; can you be
> more specific about where the log(0) happens? That's what should be fixed,
> right? If so, then a JIRA and PR are the right way to proceed.
>
> On Wed, Apr 18, 2018 at 2:37 PM svattig 
> wrote:
>
>> In Spark 2.3, when a Poisson model is fit on data whose labelCol contains
>> some zero counts, the deviance calculations break as a result of log(0).
>> I think the same is true in Spark 2.2.
>> However, the new toString method in Spark 2.3's
>> GeneralizedLinearRegressionTrainingSummary class throws a
>> NumberFormatException at line 1551. Because of this exception, we cannot
>> get the summary object from the model fit.
>>
>> Can the toString method be fixed, along with the deviance calculations,
>> for example by taking log(1) whenever the count is 0 instead of log(0)?
>>
>> Thanks,
>> Srikar.V


Re: GLM Poisson Model - Deviance calculations

2018-04-18 Thread Sean Owen
GeneralizedLinearRegression.ylogy seems to handle this case; can you be
more specific about where the log(0) happens? That's what should be fixed,
right? If so, then a JIRA and PR are the right way to proceed.

On Wed, Apr 18, 2018 at 2:37 PM svattig  wrote:

> In Spark 2.3, when a Poisson model is fit on data whose labelCol contains
> some zero counts, the deviance calculations break as a result of log(0).
> I think the same is true in Spark 2.2.
> However, the new toString method in Spark 2.3's
> GeneralizedLinearRegressionTrainingSummary class throws a
> NumberFormatException at line 1551. Because of this exception, we cannot
> get the summary object from the model fit.
>
> Can the toString method be fixed, along with the deviance calculations,
> for example by taking log(1) whenever the count is 0 instead of log(0)?
>
> Thanks,
> Srikar.V


GLM Poisson Model - Deviance calculations

2018-04-18 Thread svattig
In Spark 2.3, when a Poisson model is fit on data whose labelCol contains
some zero counts, the deviance calculations break as a result of log(0).
I think the same is true in Spark 2.2.
However, the new toString method in Spark 2.3's
GeneralizedLinearRegressionTrainingSummary class throws a
NumberFormatException at line 1551. Because of this exception, we cannot
get the summary object from the model fit.

Can the toString method be fixed, along with the deviance calculations, for
example by taking log(1) whenever the count is 0 instead of log(0)?
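
One plausible way a NaN deviance could surface as a NumberFormatException
is sketched below, assuming the summary converts its numbers to a decimal
type before rounding them. Whether toString actually does this at the line
mentioned is not confirmed here, and the object and method names are
hypothetical.

```scala
import scala.util.Try

object NaNFormattingDemo {
  // Hypothetical rounding helper: goes through java.math.BigDecimal,
  // whose Double constructor rejects NaN and infinite values.
  def roundTo4(x: Double): Try[java.math.BigDecimal] =
    Try(new java.math.BigDecimal(x).setScale(4, java.math.RoundingMode.HALF_UP))

  def main(args: Array[String]): Unit = {
    // A finite value rounds without error.
    println(roundTo4(1.23456).isSuccess)
    // NaN fails inside the BigDecimal constructor with NumberFormatException.
    println(roundTo4(Double.NaN).failed.map(_.getClass.getName))
  }
}
```

If something like this is what happens, fixing the deviance so that it never
returns NaN for y = 0 would also stop the formatting error.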

Thanks,
Srikar.V


