On Fri, Jul 20, 2012 at 10:11 AM, Andreas Hilboll <li...@hilboll.de> wrote:
> Hi,
>
> I have a problem using histogram2d:
>
>    from numpy import linspace, histogram2d
>    bins_x = linspace(-180., 180., 360)
>    bins_y = linspace(-90., 90., 180)
>    data_x = linspace(-179.96875, 179.96875, 5760)
>    data_y = linspace(-89.96875, 89.96875, 2880)
>    histogram2d(data_x, data_y, (bins_x, bins_y))
>
>    AttributeError: The dimension of bins must be equal to the dimension of
>    the sample x.
>
> I would expect histogram2d to return a 2d array of shape (360,180), which
> is full of 256s. What am I missing here?
>

It is a joint histogram, so the x and y inputs represent each dimension of a 2-dimensional sample. So, the x and y arrays must be the same length (the documentation does appear to be incorrect here). The bins do not need to have the same length.

Here is your example adjusted (with many fewer bins so I could print it in the console). Note that since you just have two "ramps" from linspace as the data, most of the points are near the diagonal.

In [15]: bins_x = linspace(-180,180,6)

In [16]: bins_y = linspace(-90,90,4)

In [17]: data_x = linspace(-179.96875, 179.96875, 2880)

In [18]: data_y = linspace(-89.96875, 89.96875, 2880)

In [19]: H, x_edges, y_edges = np.histogram2d(data_x, data_y, (bins_x, bins_y))

In [20]: H
Out[20]:
array([[ 576.,    0.,    0.],
       [ 384.,  192.,    0.],
       [   0.,  576.,    0.],
       [   0.,  192.,  384.],
       [   0.,    0.,  576.]])

In [21]: x_edges
Out[21]: array([-180., -108.,  -36.,   36.,  108.,  180.])

In [22]: y_edges
Out[22]: array([-90., -30.,  30.,  90.])

So, back to that AttributeError: it is clearly unhelpful. Looking through the code, it looks like the x, y input arrays are joined into a 2D array with a numpy core function, 'atleast_2d'. If this function sees inputs that are not the same length, it actually produces a 2-element numpy object array:

In [57]: data_x.shape, data_y.shape
Out[57]: ((5760,), (2880,))

In [58]: data_xy = atleast_2d([data_x, data_y])

In [59]: data_xy.shape, data_xy.dtype
Out[59]: ((1, 2), dtype('object'))

In [60]: data_xy[0,0].shape, data_xy[0,1].shape
Out[60]: ((5760,), (2880,))

If the x and y arrays have the same length, this looks a lot more logical:

In [62]: data_x.shape, data_y.shape
Out[62]: ((2880,), (2880,))

In [63]: data_xy = atleast_2d([data_x, data_y])

In [64]: data_xy.shape, data_xy.dtype
Out[64]: ((2, 2880), dtype('float64'))

So, that AttributeError comes up because histogramdd (which actually does the work) expects the data array to be [Ndimension, Nsample], and the number of dimensions is set by the number of bin arrays that were input (2).
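
For anyone who wants to reproduce the two shapes above outside an interactive session, here is a minimal sketch. It rests on one assumption that is not from this thread: recent NumPy releases refuse to build a ragged array implicitly, so the sketch fills an object array by hand to mimic the behaviour shown in the transcript. The helper name `stacked_shape` is hypothetical and is not part of NumPy.

```python
import numpy as np


def stacked_shape(x, y):
    """Stack x and y and report the shape and dtype that atleast_2d returns.

    Hypothetical helper for illustration only; it is not NumPy's own code.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if x.shape == y.shape:
        # Equal lengths: NumPy builds a regular (2, N) float array.
        pair = np.array([x, y])
    else:
        # Assumption: newer NumPy raises on implicit ragged input, so the
        # 2-element object array is filled by hand to mimic the old result.
        pair = np.empty(2, dtype=object)
        pair[0], pair[1] = x, y
    stacked = np.atleast_2d(pair)
    return stacked.shape, stacked.dtype


data_x = np.linspace(-179.96875, 179.96875, 5760)
data_y_short = np.linspace(-89.96875, 89.96875, 2880)
data_y_same = np.linspace(-89.96875, 89.96875, 5760)

ragged_shape, ragged_dtype = stacked_shape(data_x, data_y_short)
regular_shape, regular_dtype = stacked_shape(data_x, data_y_same)

print(ragged_shape, ragged_dtype)    # (1, 2) object
print(regular_shape, regular_dtype)  # (2, 5760) float64
```

Only the equal-length case gives an array whose first axis matches the two bin arrays passed to histogram2d.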

Since it sees that [1,2] shaped object array, it treats that as a 2-element, 1-dimension dataset; thus, at that level, the AttributeError actually makes sense.

Hope that helps,
Aronne

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion