On Mon, May 4, 2009 at 4:00 PM, Chris Colbert <sccolb...@gmail.com> wrote: > i'll take a look at them over the next few days and see what i can hack out. > > Chris > > On Mon, May 4, 2009 at 3:18 PM, David Huard <david.hu...@gmail.com> wrote: >> >> >> On Mon, May 4, 2009 at 7:00 AM, <josef.p...@gmail.com> wrote: >>> >>> On Mon, May 4, 2009 at 12:31 AM, Chris Colbert <sccolb...@gmail.com> >>> wrote: >>> > this actually sort of worked. Thanks for putting me on the right track. >>> > >>> > Here is what I ended up with. >>> > >>> > this is what I ended up with: >>> > >>> > def hist3d(imgarray): >>> > histarray = N.zeros((16, 16, 16)) >>> > temp = imgarray.copy() >>> > bins = N.arange(0, 257, 16) >>> > histarray = N.histogramdd((temp[:,:,0].ravel(), >>> > temp[:,:,1].ravel(), >>> > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] >>> > return histarray >>> > >>> > this creates a 3d histogram of rgb image values in the range 0,255 >>> > using 16 >>> > bins per component color. >>> > >>> > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for >>> > loop. >>> > >>> > not quite framerate, but good enough for prototyping. >>> > >>> >>> I don't think your copy to temp is necessary, and use reshape(-1,3) as >>> in the example of Stefan, which will avoid copying the array 3 times. >>> >>> If you need to gain some more speed, then rewriting histogramdd and >>> removing some of the unnecessary checks and calculations looks >>> possible. >> >> Indeed, the strategy used in the histogram function is faster than the one >> used in the histogramdd case, so porting one to the other should speed >> things up. >> >> David
is searchsorted faster than digitize and bincount ? Using the idea of histogramdd, I get a bit below a tenth of a second, my best for this problem is below. I was trying for a while what the fastest way is to convert a two dimensional array into a one dimensional index for bincount. I found that using the return index of unique1d is very slow compared to numeric index calculation. Josef example timed for: nobs = 307200 nbins = 16 factors = np.random.randint(256,size=(nobs,3)).copy() factors2 = factors.reshape(-1,480,3).copy() def hist3(factorsin, nbins): if factorsin.ndim != 2: factors = factorsin.reshape(-1,factorsin.shape[-1]) else: factors = factorsin N, D = factors.shape darr = np.empty(factors.T.shape, dtype=int) nele = np.max(factors)+1 bins = np.arange(0, nele, nele/nbins) bins[-1] += 1 for i in range(D): darr[i] = np.digitize(factors[:,i],bins) - 1 #add weighted rows darrind = darr[D-1] for i in range(D-1): darrind += darr[i]*nbins**(D-i-1) return np.bincount(darrind) # return flat not reshaped _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion