Hans Georg Krauthaeuser wrote:
> Hans Georg Krauthaeuser wrote:
>
>> Hi All,
>>
>> I was playing with scipy.stats.itemfreq when I observed the following
>> overflow:
>>
>> In [119]:for i in [254,255,256,257,258]:
>>    .....:    l=[0]*i
>>    .....:    print i, stats.itemfreq(l), l.count(0)
>>    .....:
>> 254 [ [ 0 254]] 254
>> 255 [ [ 0 255]] 255
>> 256 [ [0 0]] 256
>> 257 [ [0 1]] 257
>> 258 [ [0 2]] 258
>>
>> itemfreq is pretty small (in stats.py):
>>
>> ----------------------------------------------------------------------
>> def itemfreq(a):
>>     """
>>     Returns a 2D array of item frequencies. Column 1 contains item values,
>>     column 2 contains their respective counts. Assumes a 1D array is passed.
>>
>>     Returns: a 2D frequency table (col [0:n-1]=scores, col n=frequencies)
>>     """
>>     scores = _support.unique(a)
>>     scores = sort(scores)
>>     freq = zeros(len(scores))
>>     for i in range(len(scores)):
>>         freq[i] = add.reduce(equal(a,scores[i]))
>>     return array(_support.abut(scores, freq))
>> ----------------------------------------------------------------------
>>
>> It seems that add.reduce is the source of the overflow:
>>
>> In [116]:from scipy import *
>>
>> In [117]:for i in [254,255,256,257,258]:
>>    .....:    l=[0]*i
>>    .....:    print i, add.reduce(equal(l,0))
>>    .....:
>> 254 254
>> 255 255
>> 256 0
>> 257 1
>> 258 2
>>
>> Is there any way to avoid the overflow?
>>
>> BTW:
>> Python 2.3.5 (#2, Aug 30 2005, 15:50:26)
>> [GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
>>
>> scipy_version.scipy_version --> '0.3.2'
>>
>>
>> Thanks and best regards
>> Hans Georg Krauthäuser
>
> After some further investigation:
>
> In [150]:add.reduce(array(equal([0]*256,0),typecode='l'))
> Out[150]:256
>
> In [151]:add.reduce(equal([0]*256,0))
> Out[151]:0
>
> The problem occurs with arrays of typecode 'b' (as returned by equal).
>
> A workaround patch for itemfreq is obvious, but ... is it a bug or a feature?
>
> regards
> Hans Georg
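The counts quoted above (254, 255, 0, 1, 2) are what one would expect if the sum is accumulated in an unsigned 8-bit value, i.e. modulo 256. The following is only a pure-Python sketch of that arithmetic, not the actual add.reduce implementation; the function name reduce_add and its bits parameter are made up for illustration, and the 8-bit width is an assumption inferred from the wrap-around at 256.

```python
def reduce_add(values, bits=None):
    """Sum values; if bits is given, wrap the running total to that width.

    Hypothetical helper: it only emulates an unsigned accumulator of
    `bits` bits, it is not part of scipy or Numeric.
    """
    total = 0
    for v in values:
        total += v
        if bits is not None:
            # Keep only the low `bits` bits, as a narrow unsigned type would.
            total &= (1 << bits) - 1
    return total


if __name__ == "__main__":
    for n in [254, 255, 256, 257, 258]:
        flags = [1] * n  # stands in for equal(l, 0) on a list of n zeros
        # Columns: n, 8-bit accumulator, wide accumulator
        print(n, reduce_add(flags, bits=8), reduce_add(flags))
```

With the wide accumulator the count stays correct, which corresponds to converting the result of equal to typecode 'l' before reducing, as in In [150].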
I feel a bit lonely here, but nevertheless, a further remark:

The problem comes directly from the ufunc 'add' for typecode 'b'. In
contrast to 'multiply', the result is not upcast:

In [178]:array(array([1],'b')*2)
Out[178]:array([2],'i')

In [179]:array(array([1],'b')+array([1],'b'))
Out[179]:array([2],'b')

So, for an array a with typecode 'b', it follows that a+a != a*2 in
general: the sum stays in typecode 'b' and can wrap around, while the
product is upcast to 'i'.

At the moment, I don't have the time to try the new scipy_core. It would
be nice to hear whether the problem is known or even already fixed.

Regards
Hans Georg Krauthäuser
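To make the a+a versus a*2 asymmetry concrete, here is a small pure-Python sketch. It assumes that typecode 'b' behaves like an unsigned 8-bit integer (inferred from the wrap-around at 256) and that the multiply result is wide enough not to overflow; the names MASK8, add_same_width and multiply_upcast are hypothetical and do not exist in scipy.

```python
MASK8 = 0xFF  # assumed width of typecode 'b': unsigned 8-bit


def add_same_width(x, y):
    """Emulate an add whose result keeps the 8-bit width of its inputs."""
    return (x + y) & MASK8


def multiply_upcast(x, k):
    """Emulate a multiply whose result is upcast to a wider integer."""
    return x * k


if __name__ == "__main__":
    for x in [1, 127, 128, 200]:
        # Columns: x, x+x in 8 bits, x*2 after upcasting
        print(x, add_same_width(x, x), multiply_upcast(x, 2))
```

For small values both columns agree, which is why the session above shows 2 in both cases; they diverge once x+x reaches 256.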