On Sat, Nov 14, 2009 at 6:53 AM, Priit Laes <pl...@plaes.org> wrote:
> On one fine day, Fri, 2009-11-13 at 13:36, Ernest Adrogué wrote:
>> 13/11/09 @ 09:41 (+0200), thus spake Priit Laes:
>> > Does anyone have a scenario where one would actually have both negative
>> > and positive numbers (integers) in the list?
>>
>> Yes: when you have a random variable that is the difference
>> of two (discrete) random variables. For example, if you measure
>> the difference in number of days off per week because of sickness
>> between two groups of people, you would end up with a discrete
>> variable with both positive and negative integers.
>>
>> > So, how about numpy.histogram_discrete() that returns data the way
>> > histogram() does: a list containing histogram values (ie counts) and
>> > list of sorted items from min(input)...max(input). ?
>>
>> In my humble opinion, it would be nice.
> \o/
>
> I have pushed the preliminary version to:
> http://github.com/plaes/numpy/commits/histogram_discrete
>
> It can currently handle datasets with negative items and weights. I'm
> also planning to add optional range argument to the function, but I
> first need to figure out how to parse the range=(min, max) using C
> API... ;)
>
> numpy.histogram_discrete() returns list containing histogram value and
> bins (hopefully this is the right definition)
>
> hist, bins = numpy.histogram_discrete(data)
>
> Example:
> In [1]: import numpy
> In [2]: data = numpy.random.poisson(3, 300)
> In [3]: numpy.histogram_discrete(data)
> Out[3]:
> [array([15, 50, 72, 59, 52, 34, 8, 7, 3]),
> array([0, 1, 2, 3, 4, 5, 6, 7, 8])]
> In [4]:
> In [5]: data = [-1, 5]
> In [6]: numpy.histogram_discrete(data, weights=[2, 0])
> Out[6]:
> [array([ 2., 0., 0., 0., 0., 0., 0.]),
> array([-1, 0, 1, 2, 3, 4, 5])]

Sorry, I still don't see much reason to do this in C. Shifting the data
by its minimum and calling bincount gives the same result:

>>> import numpy as np
>>> data = [-1, 5]
>>> c = np.bincount(data - np.min(data), weights=[2, 0])
>>> x = np.arange(np.min(data), np.min(data) + len(c))
>>> c, x
(array([ 2., 0., 0., 0., 0., 0., 0.]), array([-1, 0, 1, 2, 3, 4, 5]))

Without the shift, bincount starts its bins at 0 even when the smallest
value is larger:

>>> data = [11, 5]
>>> np.bincount(data, weights=[2, 0])
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 2.])
>>> np.arange(max(data) + 1)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])

With the shift, the bins run from min(data) to max(data):

>>> c = np.bincount(data - np.min(data), weights=[2, 0])
>>> x = np.arange(np.min(data), np.min(data) + len(c))
>>> c, x
(array([ 0., 0., 0., 0., 0., 0., 2.]), array([ 5, 6, 7, 8, 9, 10, 11]))

Josef

>
> Priit :)
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
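
For reference, the bincount recipe above could be wrapped in a small
pure-Python helper along these lines. This is only a sketch: the name
histogram_discrete is borrowed from the proposal, the function is not part
of NumPy, and the checks for empty or non-integer input are assumptions
rather than anything taken from Priit's C branch.

```python
import numpy as np


def histogram_discrete(data, weights=None):
    """Hypothetical helper (not a NumPy function): count integer values.

    Returns (counts, bins), where bins runs over every integer from
    min(data) to max(data) and counts[i] is the (weighted) number of
    occurrences of bins[i].
    """
    data = np.asarray(data)
    if data.ndim != 1 or data.size == 0:
        # Assumption: only non-empty 1-D input is supported in this sketch.
        raise ValueError("data must be a non-empty 1-D sequence")
    if not np.issubdtype(data.dtype, np.integer):
        raise TypeError("data must contain integers")

    lo = data.min()
    # bincount only accepts non-negative integers, so shift by the minimum.
    counts = np.bincount(data - lo, weights=weights)
    # Undo the shift when labelling the bins.
    bins = np.arange(lo, lo + len(counts))
    return counts, bins


hist, bins = histogram_discrete([-1, 5], weights=[2, 0])
print(hist, bins)
```

Called with data = [-1, 5] or data = [11, 5] and weights = [2, 0], it
should reproduce the counts and bins from the sessions above.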