FWIW I had rewritten the entropy() loop to be:
for (int element : elements) {
  if (element > 0) {
    // cast guards against integer division if sum is an integral type
    result += element * Math.log((double) element / sum);
  }
}
and then further to
double logSum = Math.log(sum);
for (int element : elements) {
  if (element > 0) {
    // log(element / sum) == log(element) - log(sum)
    result += element * (Math.log(element) - logSum);
  }
}
and I come up with at least a small positive value:
7.465509716331835E-5
Since it is not negative, somehow it strikes me as a change that makes
the result right-er (though mathematically they ought to be the same).
It's also a small optimization, since it replaces a floating-point
division per element with a subtraction.
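For reference, here's how the whole method might look as a
self-contained sketch; the signature and the sum/result declarations
are my guesses, since the fragments above omit them:

static double entropy(int... elements) {
  long sum = 0;
  for (int element : elements) {
    sum += element;
  }
  double logSum = Math.log(sum);
  double result = 0.0;
  for (int element : elements) {
    if (element > 0) {
      result += element * (Math.log(element) - logSum);
    }
  }
  // result is sum(x * log(x / N)), which is <= 0; negate to return
  // the unnormalized entropy N * H
  return -result;
}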
Shall I commit something like that, but also cap the LLR at 0 anyhow?
That fixes the original issue for sure.
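Roughly, as a sketch (this reuses the entropy() sketch above; the
decomposition into row, column, and matrix entropies is the standard
one, and the names here are illustrative):

static double logLikelihoodRatio(int k11, int k12, int k21, int k22) {
  double rowEntropy = entropy(k11 + k12, k21 + k22);
  double columnEntropy = entropy(k11 + k21, k12 + k22);
  double matrixEntropy = entropy(k11, k12, k21, k22);
  // the true LLR is never negative, so clamp round-off error at 0
  return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
}

The Math.max keeps tiny negative round-off results, like the one in
Ted's R example below, from leaking through.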
On Thu, Apr 29, 2010 at 5:28 PM, Sean Owen <[email protected]> wrote:
> Ah yeah that's it.
>
> So... is the better change to cap the result of logLikelihoodRatio() at 0.0?
>
> On Thu, Apr 29, 2010 at 5:11 PM, Ted Dunning <[email protected]> wrote:
>> I suspect round-off error. In R I get this for the raw LLR:
>>
>>> llr(matrix(c(6,7567, 1924, 2426487), nrow=2))
>> [1] 3.380607e-11
>>
>> A slightly different implementation might well have gotten a small negative
>> number here.
>