On Nov 12, 11:32 am, "Michael Bacarella" <[EMAIL PROTECTED]> wrote: > See end for solution. > > > >> (3) Are you sure you need all eight-million-plus items in the cache > > >> all at once? > > > > Yes. > > > I remain skeptical, but what do I know, I don't even know what you're > > doing with the data once you have it :-) > > It's OK, I'd be skeptical too. ;) > > > > > > > $ cat /proc/cpuinfo > > processor : 0 > > vendor_id : AuthenticAMD > > cpu family : 15 > > model : 107 > > model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ > > stepping : 1 > > cpu MHz : 1000.000 > > cache size : 512 KB > > ... > > processor : 1 > > vendor_id : AuthenticAMD > > cpu family : 15 > > model : 107 > > model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ > > stepping : 1 > > cpu MHz : 1000.000 > > cache size : 512 KB > > # cat /proc/cpuinfo > processor : 0 > vendor_id : AuthenticAMD > cpu family : 15 > model : 5 > model name : AMD Opteron(tm) Processor 246 > stepping : 10 > cpu MHz : 2009.305 > cache size : 1024 KB > fpu : yes > fpu_exception : yes > cpuid level : 1 > wp : yes > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca > cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow > bogomips : 3981.31 > TLB size : 1088 4K pages > clflush size : 64 > cache_alignment : 64 > address sizes : 40 bits physical, 48 bits virtual > power management: ts fid vid ttp > > processor : 1 > vendor_id : AuthenticAMD > cpu family : 15 > model : 5 > model name : AMD Opteron(tm) Processor 246 > stepping : 10 > cpu MHz : 2009.305 > cache size : 1024 KB > fpu : yes > fpu_exception : yes > cpuid level : 1 > wp : yes > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca > cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow > bogomips : 4014.08 > TLB size : 1088 4K pages > clflush size : 64 > cache_alignment : 64 > address sizes : 40 bits physical, 48 bits virtual > power management: ts fid vid ttp > > As for the solution, after trying a half-dozen different integer hashing > functions > and hash table sizes (the brute force approach), on a total whim I switched > to a > model with two dictionary tiers and got whole orders of magnitude > better performance. > > The tiering is, for a given key of type long: > > id2name[key >> 40][key & 0x10000000000] = name > > Much, much better. A few minutes versus hours this way. > > I suspect it could be brought down to seconds with a third level of > tiers but this is no longer posing the biggest bottleneck... ;) > > Thanks!- Hide quoted text - > > - Show quoted text -
Shouldn't this be: id2name[key >> 40][key & 0xffffffffff] = name ? -- Paul -- http://mail.python.org/mailman/listinfo/python-list