I think a long-term strategy needs to be adopted for histogram. There is a lot of confusion about what the "bins" keyword does. It is currently defined as the lower edge of each bin, meaning that the last bin is open ended and a bin [-inf, bin0> does not exist. While 1.1.0 may not be the right release to fix this in, I would really like to see it fixed somewhere down the line.
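
To make the two conventions concrete, here is a small sketch in plain Python (not the NumPy implementation). Both helper names are made up for illustration, and treating every bin as half-open [left, right) in the second helper is my own assumption, not something that has been decided:

```python
from bisect import bisect_right

def count_lower_edges(data, lower_edges):
    """Hypothetical helper: `lower_edges` holds only the lower edge of each bin.

    The last bin is open ended; values below lower_edges[0] are dropped.
    """
    counts = [0] * len(lower_edges)
    for x in data:
        i = bisect_right(lower_edges, x) - 1  # index of the last edge <= x
        if i >= 0:
            counts[i] += 1
    return counts

def count_full_edges(data, edges):
    """Hypothetical helper: `edges` holds every edge, so N+1 values give N bins.

    Assumption: each bin is half-open [left, right); values outside are dropped.
    """
    counts = [0] * (len(edges) - 1)
    for x in data:
        i = bisect_right(edges, x) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts

data = [-1.0, 0.5, 1.5, 2.5, 10.0]
print(count_lower_edges(data, [0, 1, 2]))    # 10.0 lands in the open-ended last bin
print(count_full_edges(data, [0, 1, 2, 3]))  # 10.0 lies outside the edges and is dropped
```

With lower edges only, the outlier 10.0 is silently counted in the last bin. With explicit edges it is excluded, at the cost of passing one extra value.
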

On Apr 24, 2008, at 10:28 AM, Pauli Virtanen wrote:

> Wed, 23 Apr 2008 16:20:41 -0400, David Huard wrote:
>> I haven't found a way to fix histogram reliably without breaking the
>> current behavior. There is a patch attached to the ticket, if the
>> decision is to break histogram.
>
> Summary of the facts (again...):
>
> a) histogram's docstring does not match its behavior wrt
>    discarding data

This is an easy fix and should definitely go into 1.1.0 :)

> b) given variable-width bins, histogram(..., normed=True)
>    the results are wrong

Also a quick fix that should be part of 1.1.0

> c) it might make more sense to handle discarding data in some
>    other way than what histogram does now

I would like to see this, but it does not have to happen in 1.1.0 :)

> I think there are now a couple of choices what to do with this:
>
> A) Change the semantics of histogram function. Old code using histogram
>    will just simply break, maybe in mysterious ways

Not really a satisfactory approach. I really don't mind, even though it would break some code of mine. I would rather see a better function and have to make some code changes than live with the current confusion. Other people will likely disagree.

> B) Rename the bins parameter to bin_edges or something else, so that
>    any old code using histogram immediately raises an exception that is
>    easily understood.

With this approach, bin_edges would contain one more value than bins, since the right edge of the last bin has to be defined.

> C) Create a new parameter with more sensible behavior and a name
>    different from "bins", and deprecate (at least giving sequences to) the
>    "bins" parameter: put up a DeprecationWarning if the user does this, but
>    still produce the same results as the old histogram. This way the user
>    can forward-port her code at leisure.

I think this is probably the best approach to accommodate everyone.

> So which one (or something else) do we choose for 1.1.0?
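
For what it is worth, option C might look roughly like the sketch below. It is plain Python, not a patch against NumPy. The function name histogram_sketch, the bin_edges keyword (borrowed from option B), the half-open bins in the new code path, and the warning text are all placeholders I made up for illustration:

```python
import warnings
from bisect import bisect_right

def histogram_sketch(data, bins=None, bin_edges=None):
    """Hypothetical stand-in for histogram, illustrating option C only."""
    if bin_edges is not None:
        # New behavior (assumed): N+1 edges define N half-open bins,
        # and values outside the edges are dropped.
        counts = [0] * (len(bin_edges) - 1)
        for x in data:
            i = bisect_right(bin_edges, x) - 1
            if 0 <= i < len(counts):
                counts[i] += 1
        return counts
    if bins is None:
        raise TypeError("pass either bins or bin_edges")
    # Old behavior, results unchanged: `bins` are lower edges and the
    # last bin is open ended. Only a warning is added.
    warnings.warn(
        "passing a sequence as 'bins' is deprecated; use 'bin_edges'",
        DeprecationWarning,
        stacklevel=2,
    )
    counts = [0] * len(bins)
    for x in data:
        i = bisect_right(bins, x) - 1
        if i >= 0:
            counts[i] += 1
    return counts
```

Old calls such as histogram_sketch(data, [0, 1, 2]) keep returning what they did before and only emit a DeprecationWarning, while new code opts in with histogram_sketch(data, bin_edges=[0, 1, 2, 3]).
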
> --
> Pauli Virtanen

Cheers
Tommy

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion