Hello,

I use numpy.histogramdd to compute three-dimensional histograms with a total number of bins on the order of 1e7. It is clear to me that such a histogram takes a lot of memory: for dtype=numpy.float64 it is roughly 80 megabytes. However, I have the impression that much more memory is needed during the calculation itself. For example, with data.shape = (8e6, 3) and a call to numpy.histogramdd(data, 280), I expect a histogram of size (280**3)*8 bytes = 176 megabytes, but while the calculation runs, the memory usage of pythonw.exe reported by the Windows Task Manager rises up to 687 megabytes above the level before the call. When the calculation is done, the memory usage drops back to the expected value. I assume this is due to the way numpy.histogramdd works internally, but when I need to calculate even bigger histograms, I cannot do it this way.
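What I have in mind as a possible workaround is something like the rough sketch below: fill an integer-typed histogram block by block, so that at any time only the histogram itself plus a small per-block index array has to live in memory. This is only a sketch, not tested code: it assumes equal-width bins, it treats samples sitting exactly on the upper edge slightly differently from histogramdd, the helper name chunked_histogramdd is made up, and it relies on numpy.add.at, which my numpy version may well not have yet.

import numpy as np

def chunked_histogramdd(data, nbins, ranges, chunk=500000, dtype=np.int32):
    """Fill a dense n-D histogram block by block (illustrative helper,
    not a numpy function). Assumes equal-width bins on every axis."""
    ndim = data.shape[1]
    edges = [np.linspace(lo, hi, nbins + 1) for lo, hi in ranges]
    hist = np.zeros((nbins,) * ndim, dtype=dtype)

    for start in range(0, data.shape[0], chunk):
        block = data[start:start + chunk]
        # bin index of every sample along every axis
        idx = np.array([np.searchsorted(edges[d], block[:, d], side='right') - 1
                        for d in range(ndim)])
        # drop samples outside the ranges (this also drops samples sitting
        # exactly on the upper edge, unlike histogramdd)
        keep = np.all((idx >= 0) & (idx < nbins), axis=0)
        idx = idx[:, keep]
        # unbuffered in-place increment: slower than histogramdd, but no
        # temporary of the size of the histogram is created
        np.add.at(hist, tuple(idx), 1)

    return hist, edges

# quick check with random data standing in for the real measurement
data = np.random.rand(1000000, 3)
hist, edges = chunked_histogramdd(data, 280, [(0.0, 1.0)] * 3, dtype=np.int16)
print(hist.sum(), hist.nbytes / 1e6)   # all samples counted; ~44 MB for int16

The extra memory per block is then only the index arrays (a few megabytes for chunk=500000), at the price of numpy.add.at being considerably slower than the current histogramdd.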
So I have the following questions:

1) How can I tell histogramdd to use a dtype other than float64? My bins will be only sparsely populated, so an int16 should be sufficient. Without normalization, an integer dtype makes more sense to me anyway.

2) Is there a way to use another algorithm (at the cost of performance) that needs less memory during the calculation, perhaps along the lines of the sketch above, so that I can generate bigger histograms?

My numpy version is '1.0.4.dev3937'.

Thanks,
Lars

--
Dipl.-Ing. Lars Friedrich

Photonic Measurement Technology
Department of Microsystems Engineering -- IMTEK
University of Freiburg
Georges-Köhler-Allee 102
D-79110 Freiburg
Germany

phone: +49-761-203-7531
fax:   +49-761-203-7537
room:  01 088
email: [EMAIL PROTECTED]