Thanks a lot for the replies!

I'm not sure I fully understand what you're saying, so please let me restate it in my own words (if I'm wrong, please let me know):

I transfer the array of numbers I want to histogram to the GPU. For each element of that array I overwrite the value with the index of the bin that element falls into, and I return the array (now containing integer bin indices) to the CPU, where I perform the reduction.
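If that is right, here is a minimal CPU-side sketch of the same pipeline in NumPy (the bin-index step would be the GPU kernel; data, n_bins, and bin_idx are hypothetical names, not from this thread):

```python
import numpy as np

# Hypothetical stand-ins for the real data and binning.
data = np.random.random(10000).astype(np.float32)
n_bins = 16

# Step 1 (the GPU kernel in the scheme above): map each value in [0, 1)
# to the index of the bin it falls into.
bin_idx = np.minimum((data * n_bins).astype(np.int32), n_bins - 1)

# Step 2 (back on the CPU): reduce the index array into per-bin counts.
hist = np.bincount(bin_idx, minlength=n_bins)
```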

Fran.


El 04/04/2012, a las 22:34, Pazzula, Dominic J escribió:

Exactly what I was about to propose. Doing the reduction would probably be faster on the CPU. NumPy + MKL would thread what is essentially a series of element-wise array additions.
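Concretely, summing the per-block histograms on the CPU is a single vectorized call; a sketch with made-up shapes (64 blocks, 16 bins are placeholders):

```python
import numpy as np

# Hypothetical per-block partial histograms, as they might come back
# from the device: one row per block, one column per bin.
rng = np.random.default_rng(0)
partial = rng.integers(0, 100, size=(64, 16))

# The CPU-side reduction: element-wise addition down the block axis,
# which NumPy (threaded via MKL when built against it) does in one call.
total = partial.sum(axis=0)
```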

From: pycuda-boun...@tiker.net [mailto:pycuda-boun...@tiker.net] On Behalf Of David Mertens
Sent: Wednesday, April 04, 2012 3:27 PM
To: Francisco Villaescusa Navarro
Cc: pycuda@tiker.net
Subject: Re: [PyCUDA] Histograms with PyCUDA

You could take a two-step approach: (1) Have each thread work on a subset of the data and create its own histogram, then (2) run the resulting collection of bins through a sum/reduction kernel. The time for the first step would be roughly N_data / N_simultaneous_blocks_per_device, and the running time for the second step would be roughly N_blocks * log(N_blocks).
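In NumPy pseudocode (data size, chunk count, and bin count are all placeholders), the two steps would look roughly like:

```python
import numpy as np

# Placeholder data and sizes; on the GPU each chunk would be one block's share.
data = np.random.random(100000).astype(np.float32)
n_bins, n_blocks = 32, 128

# Step 1: each "block" builds a private histogram over its slice of the data.
partial = np.stack([np.histogram(chunk, bins=n_bins, range=(0.0, 1.0))[0]
                    for chunk in np.array_split(data, n_blocks)])

# Step 2: sum/reduce the n_blocks partial histograms into the final one.
hist = partial.sum(axis=0)
```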

Not a very detailed answer, but I hope that helps. :-)

David

On Wed, Apr 4, 2012 at 2:07 PM, Francisco Villaescusa Navarro <villaescusa.franci...@gmail.com > wrote:
Thanks a lot for the quick reply!

I was wondering whether I could "count" the number of elements in a given interval by something such as:

moduleHistrogram = SourceModule("""
__global__ void H(float *pos, int size, float his, float lower_limit, float upper_limit)
{
   unsigned int idx = blockIdx.x*blockDim.x+threadIdx.x;
   unsigned int idy = blockIdx.y*blockDim.y+threadIdx.y;
   unsigned int id = idy*gridDim.x*blockDim.x+idx;

   if (id<size) {
       if (pos[id]<upper_limit && pos[id]>lower_limit){
               his=his+1.0;
       }
   }
}
""")

I have tried this but it doesn't work (because the value of the variable his is not "seen" across threads: each thread has its own local copy of his, and the host never sees the updates). I also tried with this kernel:

moduleHistrogram = SourceModule("""
__global__ void H(float *pos, int size, float his, float lower_limit, float upper_limit)
{
   unsigned int idx = blockIdx.x*blockDim.x+threadIdx.x;
   unsigned int idy = blockIdx.y*blockDim.y+threadIdx.y;
   unsigned int id = idy*gridDim.x*blockDim.x+idx;

   __shared__ float A;
   A=his;
  __syncthreads();

   if (id<size) {
       if (pos[id]<upper_limit && pos[id]>lower_limit){
               A=A+1.0;
               __syncthreads();
               his=A;
       }
   }
}
""")

but the problem isn't solved.

Probably I'm doing something very stupid and I would like to know what it is.
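One thing I have read about but not tried yet (so take it only as a sketch; it assumes compute capability 2.0 or later, which is needed for atomicAdd on floats) is passing his as a device pointer and incrementing it atomically:

```
__global__ void H(float *pos, int size, float *his,
                  float lower_limit, float upper_limit)
{
    unsigned int idx = blockIdx.x*blockDim.x + threadIdx.x;
    unsigned int idy = blockIdx.y*blockDim.y + threadIdx.y;
    unsigned int id  = idy*gridDim.x*blockDim.x + idx;

    if (id < size && pos[id] > lower_limit && pos[id] < upper_limit) {
        /* his now points to global device memory visible to all threads;
           atomicAdd serializes the conflicting increments. */
        atomicAdd(his, 1.0f);
    }
}
```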

Thanks a lot,

Fran.

El 04/04/2012, a las 20:32, Andreas Kloeckner escribió:


On Wed, 4 Apr 2012 19:47:08 +0200, Francisco Villaescusa Navarro <villaescusa.franci...@gmail.com > wrote:
Hi,

I have been writing some code for a project involving the management of
fairly large data sets. Since I was getting wrong results, I have tried
to simplify the problem as much as possible to see where it goes wrong.

The simplification of the problem is the following:

I have a pretty long array of data containing numbers in a given
interval (let's suppose between 0.0 and 1.0), for example

import numpy as np
total_numbers = 10000
data = np.random.random(total_numbers).astype(np.float32)

I would like to make a histogram of those data. I was wondering what
the best strategy would be to achieve this in PyCUDA.

http://lmgtfy.com/?q=cuda+histogram

:) (Nothing special about *Py*CUDA in this instance. In particular,
there's no canned functionality that will do this for you.)

HTH,
Andreas


_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda



--
 "Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan

