I wish there were a native NumPy function for this case, which comes up
fairly often when computing information-theoretic quantities.
As a workaround, I sometimes use these reasonably efficient utility functions:
import numpy as np

def log0(x):
    """Robust 'entropy' logarithm: log(0.) = 0."""
    # np.where evaluates np.log(x) on every element, so zeros in x
    # still trigger a divide-by-zero RuntimeWarning here.
    return np.where(x == 0., 0., np.log(x))

def log0_no_warning(x):
    """Robust 'entropy' logarithm: log(0.) = 0.

    This version does not raise a warning when values of x = 0. are
    encountered. However, it is slightly less efficient."""
    with np.errstate(divide='ignore'):
        res = np.where(x == 0., 0., np.log(x))
    return res
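As a rough usage sketch, here is how such a function might be used to compute a Shannon entropy when some probabilities are exactly zero. The `entropy` helper and the sample distribution are made up for illustration; the block repeats `log0_no_warning` only so that it runs on its own.

```python
import numpy as np

def log0_no_warning(x):
    """Logarithm with the convention log(0.) = 0., warnings suppressed."""
    with np.errstate(divide='ignore'):
        return np.where(x == 0., 0., np.log(x))

def entropy(p):
    """Shannon entropy in nats of a discrete distribution p.

    Hypothetical helper, not part of NumPy or scikit-learn.
    """
    p = np.asarray(p, dtype=float)
    # Zero entries contribute 0 * 0 = 0 instead of 0 * -inf = nan.
    return -np.sum(p * log0_no_warning(p))

p = np.array([0.5, 0.5, 0.0])
h = entropy(p)
```

With a plain `np.log`, the zero entry would produce `0 * -inf = nan` and the whole sum would be nan.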
On Fri, Oct 14, 2011 at 10:31 AM, Olivier Grisel
<[email protected]> wrote:
> 2011/10/14 Robert Layton <[email protected]>:
>> I'm working on adding Adjusted Mutual Information, and need to calculate the
>> Mutual Information.
>> I think I have the algorithm itself correct, except for the fact that
>> whenever the contingency matrix is 0, a nan happens and propagates through
>> the code.
>>
>> Sample code on the net [1] uses an eps=np.finfo(float).eps. Should I do
>> this, adding eps to anything that is a denominator or parameter to log?
>> Is there a better way?
>
> I would rather filter out any entry that has a 0.0 in the denominator
> before the final sum using array masking.
>
> BTW, thanks for tackling this.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
> ------------------------------------------------------------------------------
> All the data continuously generated in your IT infrastructure contains a
> definitive record of customers, application performance, security
> threats, fraudulent activity and more. Splunk takes this data and makes
> sense of it. Business sense. IT sense. Common sense.
> http://p.sf.net/sfu/splunk-d2d-oct
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
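For comparison, the masking approach suggested in the quoted reply could look roughly like the sketch below. It assumes the contingency matrix holds non-negative counts with a positive total, and the function name `mutual_information` is invented here; it is not the scikit-learn implementation.

```python
import numpy as np

def mutual_information(contingency):
    """Mutual information in nats from a contingency matrix of counts.

    Illustrative sketch only: cells with zero count are masked out
    before the log is taken, so no eps is needed.
    """
    contingency = np.asarray(contingency, dtype=float)
    pxy = contingency / contingency.sum()   # joint distribution
    px = pxy.sum(axis=1, keepdims=True)     # row marginals, shape (n, 1)
    py = pxy.sum(axis=0, keepdims=True)     # column marginals, shape (1, m)
    outer = px * py                         # product of marginals
    mask = pxy > 0                          # keep only non-empty cells
    # Wherever pxy > 0 both marginals are > 0, so the ratio is finite.
    return np.sum(pxy[mask] * np.log(pxy[mask] / outer[mask]))

mi_independent = mutual_information([[1, 1], [1, 1]])
mi_identical = mutual_information([[2, 0], [0, 2]])
```

The masked cells are exactly those whose term in the sum is zero by the usual 0 log 0 = 0 convention, so dropping them does not change the result.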