import numpy as np

def reduce_data(buffer, resolution):
    thinned_buffer = np.zeros((resolution**3, 3))

    min_xyz = buffer.min(axis=0)
    max_xyz = buffer.max(axis=0)
    delta_xyz = max_xyz - min_xyz

    inds_xyz = np.floor(resolution * (buffer - min_xyz) / delta_xyz).astype(int)

    # handle values right at the max
    inds_xyz[inds_xyz == resolution] -= 1

    # convert to linear indices so that we can use np.add.at
    inds_lin = inds_xyz[:,0]
    inds_lin += inds_xyz[:,1] * resolution
    inds_lin += inds_xyz[:,2] * resolution**2

    # sum the points falling in each cell, then divide by the cell counts
    np.add.at(thinned_buffer, inds_lin, buffer)
    counts = np.bincount(inds_lin, minlength=resolution**3)

    thinned_buffer[counts != 0, :] /= counts[counts != 0, None]
    return thinned_buffer

The bulk of the time is spent in np.add.at, so it runs in just over 5 s here on your 1e7 to 1e6 example.

On Tue, Apr 5, 2016 at 2:09 PM, mpc <matt.p.co...@gmail.com> wrote:

> This wasn't intended to be a histogram, but you're right in that it would be
> much better if I can just go through each point once and bin the results,
> that makes more sense, thanks!
>
> --
> View this message in context:
> http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42733.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
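
Since the time goes into np.add.at, one thing that might be worth trying is to accumulate each coordinate with np.bincount and its weights argument instead. The following is only a sketch under that assumption: the name reduce_data_bincount is made up for illustration, the guard against a zero extent along an axis is an addition that is not in the function above, and it has not been timed against the figure quoted above.

```python
import numpy as np

def reduce_data_bincount(buffer, resolution):
    # Hypothetical variant of reduce_data: same binning, but the per-cell
    # sums come from np.bincount(weights=...) rather than np.add.at.
    n_cells = resolution**3

    min_xyz = buffer.min(axis=0)
    delta_xyz = buffer.max(axis=0) - min_xyz
    # Assumption (not in the original): avoid dividing by zero when all
    # points share the same value along an axis.
    delta_xyz[delta_xyz == 0] = 1

    inds_xyz = np.floor(resolution * (buffer - min_xyz) / delta_xyz).astype(int)
    # handle values right at the max
    inds_xyz[inds_xyz == resolution] -= 1

    # linear cell index, built as a new array so inds_xyz is left untouched
    inds_lin = (inds_xyz[:, 0]
                + inds_xyz[:, 1] * resolution
                + inds_xyz[:, 2] * resolution**2)

    counts = np.bincount(inds_lin, minlength=n_cells)

    thinned_buffer = np.zeros((n_cells, 3))
    for axis in range(3):
        # sum of this coordinate over all points in each cell
        thinned_buffer[:, axis] = np.bincount(
            inds_lin, weights=buffer[:, axis], minlength=n_cells)

    occupied = counts != 0
    thinned_buffer[occupied] /= counts[occupied, None]
    return thinned_buffer
```

As in the function above, empty cells are left as rows of zeros, so the output still has resolution**3 rows.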