Hi Eat, I will surely try these routines tomorrow, but I still think that neither scipy function does the complete distance calculation of all possible pairs as done by my C code. For 2 arrays, X and Y, of nX and nY 2d coordinates respectively, I need to get nX times nY distances computed. >From the online documentation I understand that pdist calculates something like nX square distances for a single array X of nX coordinates, and cdist would calculate nX distances, where nX is required to equal nY.
Thanks for looking into this, Sebastian On Tue, Feb 15, 2011 at 9:11 PM, eat <e.antero.ta...@gmail.com> wrote: > Hi, > > On Tue, Feb 15, 2011 at 5:50 PM, Sebastian Haase <seb.ha...@gmail.com> > wrote: >> >> Hi, >> I assume that someone here could maybe help me, and I'm hoping it's >> not too much off topic. >> I have 2 arrays of 2d point coordinates and would like to calculate >> all pairwise distances as fast as possible. >> Going from Python/Numpy to a (Swigged) C extension already gave me a >> 55x speedup. >> (.9ms vs. 50ms for arrays of length 329 and 340). > > With my very modest machine (Intel Dual CPU E2200, 2.2 Ghz) utilizing > scipy.spatial.distance.pdist will take some 1.5 ms for such arrays. Even the > simple linear algebra based full matrix calculations can be done less than 5 > ms. > > My two cents, > eat >> >> I'm using gcc on Linux. >> Now I'm wondering if I could go even faster !? >> My hope that the compiler might automagically do some SSE2 >> optimization got disappointed. >> Since I have a 4 core CPU I thought OpenMP might be an option; >> I never used that, and after some playing around I managed to get >> (only) 50% slowdown(!) :-( >> >> My code in short is this: >> (My SWIG typemaps use obj_to_array_no_conversion() from numpy.i) >> -------<Ccode> ---------- >> void dists2d( >> double *a_ps, int nx1, int na, >> double *b_ps, int nx2, int nb, >> double *dist, int nx3, int ny3) throw (char*) >> { >> if(nx1 != 2) throw (char*) "a must be of shape (n,2)"; >> if(nx2 != 2) throw (char*) "b must be of shape (n,2)"; >> if(nx3 != nb || ny3 != na) throw (char*) "c must be of shape (na,nb)"; >> >> double *dist_; >> int i, j; >> >> #pragma omp parallel private(dist_, j, i) >> { >> #pragma omp for nowait >> for(i=0;i<na;i++) >> { >> //num_threads=omp_get_num_threads(); --> 4 >> dist_ = dist+i*nb; // dists_ is only >> introduced for OpenMP >> for(j=0;j<nb;j++) >> { >> *dist_++ = sqrt( sq(a_ps[i*nx1] - b_ps[j*nx2]) + >> sq(a_ps[i*nx1+1] >> - b_ps[j*nx2+1]) ); >> } >> } >> } >> } >> -------</Ccode> ---------- >> There is probably a simple mistake in this code - as I said I never >> used OpenMP before. >> It should be not too difficult to use OpenMP correctly here >> or - maybe better - >> is there a simple SSE(2,3,4) version that might be even better than >> OpenMP... !? >> >> I supposed, that I did not get the #pragma omp lines right - any idea ? >> Or is it in general not possible to speed this kind of code up using >> OpenMP !? >> >> Thanks, >> Sebastian Haase >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion