Aha, I've found the problem -- my values were int64 and my keys were uint64. Switching to the same data type immediately fixes the issue! It's not a memory cache issue at all.
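A quick sketch of the effect, for anyone hitting the same thing (array names and sizes here are invented for illustration; the actual magnitude of the slowdown will depend on your NumPy version and data):

```python
import numpy as np

# Sorted array with unsigned keys, as in my case.
haystack = np.arange(2**20, dtype=np.uint64)

# Keys of a *different* dtype (np.random.randint gives a signed
# integer dtype by default) -- searchsorted() has to go through a
# casting/comparison path that can be dramatically slower.
keys = np.random.randint(0, 2**20, size=10000)

slow_idx = haystack.searchsorted(keys)                    # dtypes differ
fast_idx = haystack.searchsorted(keys.astype(np.uint64))  # dtypes match

# The results are identical; only the speed differs.
assert np.array_equal(slow_idx, fast_idx)
```

So the fix is just to cast the keys to the array's dtype before searching.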
Perhaps searchsorted() should emit a warning if the keys require casting... I can't believe how bad the hit was.

-Andrew

Charles R Harris wrote:
> On Wed, May 14, 2008 at 2:00 PM, Andrew Straw <[EMAIL PROTECTED]> wrote:
>
> > Charles R Harris wrote:
> >
> > > On Wed, May 14, 2008 at 8:09 AM, Andrew Straw <[EMAIL PROTECTED]> wrote:
> > >
> > > > Quite a difference (a factor of about 3000)! At this point, I haven't
> > > > delved into the dataset to see what makes it so pathological --
> > > > performance is nowhere near this bad for the binary search algorithm
> > > > with other sets of keys.
> > >
> > > It can't be that bad, Andrew, something else is going on. And 191 MB
> > > isn't *that* big; I expect it should fit in memory with no problem.
> >
> > I agree the performance difference seems beyond what one would expect
> > due to cache misses alone. I'm at a loss to propose other explanations,
> > though. Ideas?
>
> I just searched for 2**25/10 keys in a 2**25 array of reals. It took
> less than a second when vectorized. In a python loop it took about 7.7
> seconds. The only thing I can think of is that the search isn't
> getting any cpu cycles for some reason. How much memory is it using?
> Do you have any nans and such in the data?

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion