On Sun, Aug 29, 2010 at 10:26 AM, Ralf Gommers
<ralf.gomm...@googlemail.com> wrote:
>
>
> On Sat, Aug 28, 2010 at 4:32 AM, David Huard <david.hu...@gmail.com> wrote:
>>
>> Nils and Josef,
>> Thanks for the bug report, this is now fixed in SVN (r8672).
>> Ralf, is this something that you want to see backported in 1.5?
>
> From the other replies to your mail I gather your bug fix is still going to
> change. If no other issues are reported I'm planning to do the final release
> in two days, so it's a bit late for backporting.

There seems to be broad agreement on adding a new keyword such as
density=True to make the correct behavior available. Two days before a
release is really not a good time for that, but adding an example to the
docstring showing how to calculate the correct density histogram would be
useful.

Josef

>
> Thanks for asking,
> Ralf
>
>
>>
>> Regards,
>> David
>>
>> On Fri, Aug 6, 2010 at 7:49 PM, <josef.p...@gmail.com> wrote:
>>>
>>> On Fri, Aug 6, 2010 at 4:53 PM, Nils Becker <n.bec...@amolf.nl> wrote:
>>> > Hi again,
>>> >
>>> > first a correction: I posted
>>> >
>>> >> I believe np.histogram(data, bins, normed=True) effectively does:
>>> >>>> np.histogram(data, bins, normed=False) / (bins[-1] - bins[0]).
>>> >>>>
>>> >>>> However, it _should_ do
>>> >>>> np.histogram(data, bins, normed=False) / bins_widths
>>> >
>>> > but there is a normalization missing; it should read
>>> >
>>> > I believe np.histogram(data, bins, normed=True) effectively does
>>> > np.histogram(data, bins, normed=False) / (bins[-1] - bins[0]) /
>>> > data.sum()
>>> >
>>> > However, it _should_ do
>>> > np.histogram(data, bins, normed=False) / bins_widths / data.sum()
>>> >
>>> > Bruce Southey replied:
>>> >> As I recall, there are issues with this aspect.
>>> >> Please search the discussion regarding histogram, especially David
>>> >> Huard's reply in this thread:
>>> >> http://thread.gmane.org/gmane.comp.python.numeric.general/22445
>>> > I think this discussion pertains to a switch in calling conventions
>>> > which happened at the time. The last reply of D. Huard (to me) seems to
>>> > say that they did not fix anything in the _old_ semantics, but that the
>>> > new semantics is expected to work properly.
>>> >
>>> > I tried with an infinite bin:
>>> > counts, dmy = np.histogram([1,2,3,4], [0.5,1.5,np.inf])
>>> > counts
>>> > array([1,3])
>>> > ncounts, dmy = np.histogram([1,2,3,4], [0.5,1.5,np.inf], normed=1)
>>> > ncounts
>>> > array([0.,0.])
>>> >
>>> > this also does not make a lot of sense to me.
>>> > A better result would be
>>> > array([0.25, 0.]), since 25% of the points fall in the first bin; 75%
>>> > fall in the second but are spread out over an infinite interval, giving
>>> > 0. This is what my second proposal would give. I cannot find anything
>>> > wrong with it so far...
>>>
>>> I didn't find any different information about the meaning of
>>> normed=True, either on the mailing list or in the Trac history.
>>>
>>>     if normed:
>>>         db = array(np.diff(bins), float)
>>>         return n/(n*db).sum(), bins
>>>
>>> This does not look like the correct piecewise density with unequal
>>> bin sizes.
>>>
>>> Thanks, Nils, for pointing this out; I had tried only equal bin sizes
>>> for a histogram distribution.
>>>
>>> Josef
>>>
>>> >
>>> > Cheers, Nils
>>> > _______________________________________________
>>> > NumPy-Discussion mailing list
>>> > NumPy-Discussion@scipy.org
>>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
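
For reference, the density calculation that Josef suggests documenting can
be sketched roughly as follows. This is a minimal illustration under stated
assumptions, not the code that went into NumPy: the helper names
density_histogram and old_normed are hypothetical, and the sketch normalizes
by the number of samples that fall inside the bins (counts.sum()) and by
each bin's width. The expression from the snippet quoted in the thread is
included only for comparison.

```python
import numpy as np


def density_histogram(data, edges):
    """Hypothetical helper: piecewise-constant density for unequal bins."""
    counts, edges = np.histogram(data, bins=edges)
    widths = np.diff(edges).astype(float)
    # Fraction of samples in each bin, divided by that bin's width.
    density = counts / (counts.sum() * widths)
    return density, edges


def old_normed(data, edges):
    """The expression quoted in the thread, n / (n * db).sum(), for comparison."""
    n, edges = np.histogram(data, bins=edges)
    db = np.diff(edges).astype(float)
    return n / (n * db).sum(), edges


data = [1, 2, 3, 4]
edges = [0.5, 1.5, np.inf]  # the infinite-bin case from Nils's message

density, _ = density_histogram(data, edges)
old, _ = old_normed(data, edges)
print(density)  # 0.25 in the first bin, 0 in the infinite one
print(old)      # both entries are 0, as reported in the thread
```

With equal-width bins the two expressions agree; they differ only when the
bin widths differ.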