On Dec 15, 2010, at 5:35 AM, Matthew Mondor wrote:

> On Tue, 14 Dec 2010 20:49:14 -0800
> Matt Thomas <m...@3am-software.com> wrote:
>
>> I have a fairly large but mostly simple patch which changes the stats
>> collected in uvmexp for faults, intrs, softs, syscalls, and traps from
>> 32 bit to 64 bits and puts them in cpu_data (in cpu_info). This makes
>> more accurate and a little cheaper to update on 64bit systems.
>
> I like the cleanliness of the changes;
>
> A potential issue I see is how heavy this becomes on some 32-bit CPUs
> i.e. m68k, where I see for instance 1 instruction being replaced by 9
> instructions (including registers save/restore) to increment a
> counter. I'm not sure if in practice this will really affect
> performance, or if it's worth benchmarking for those architectures,
> however.

Here's the original assembly:

00000000 <orig>:
   0:   52b9 0000 0000  addql #1,0 <orig>
   6:   53b9 0000 0000  subql #1,0 <orig>
   c:

If we put idepth in cpu_info, we can use the fact that
&cpu_info_store.ci_data.cpu_nintr is already in an address register and
use that to access ci_idepth; it's only 8 bytes longer.

00000000 <lea_for_cpuinfo_nintr_plus_4_and_idepth>:
   0:   41f9 0000 0000  lea 0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>,%a0
   6:   5290            addql #1,%a0@
   8:   4280            clrl %d0
   a:   2220            movel %a0@-,%d1
   c:   d380            addxl %d0,%d1
   e:   2081            movel %d1,%a0@
  10:   53a8 004c       subql #1,%a0@(76)
  14:

which saves two bytes over not doing that:

00000040 <lea_for_cpuinfo_nintr_plus_4>:
  40:   41f9 0000 0000  lea 0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>,%a0
  46:   5290            addql #1,%a0@
  48:   4280            clrl %d0
  4a:   2220            movel %a0@-,%d1
  4c:   d380            addxl %d0,%d1
  4e:   2081            movel %d1,%a0@
  50:   53b9 0000 0000  subql #1,0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>
  56:

Now if we have the address register point to cpu_info itself and keep
ci_idepth there, it's a bit longer (26 bytes).

00000080 <lea_for_cpuinfo>:
  80:   41f9 0000 0000  lea 0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>,%a0
  86:   52a8 00e4       addql #1,%a0@(228)
  8a:   4280            clrl %d0
  8c:   2228 00e0       movel %a0@(224),%d1
  90:   d380            addxl %d0,%d1
  92:   2141 00e0       movel %d1,%a0@(224)
  96:   53a8 012c       subql #1,%a0@(300)
  9a:

And if we don't use lea at all, it's 16 bytes more than the original:

000000c0 <nolea>:
  c0:   52b9 0000 0000  addql #1,0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>
  c6:   4280            clrl %d0
  c8:   2239 0000 0000  movel 0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>,%d1
  ce:   d380            addxl %d0,%d1
  d0:   23c1 0000 0000  movel %d1,0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>
  d6:   53b9 0000 0000  subql #1,0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>
  dc:
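
For anyone not fluent in m68k, the 64-bit increment in the listings above comes down to "bump the low word, then add the carry into the high word". Here is a rough C sketch of that sequence. The struct and function names are made up for illustration and are not from the patch; the layout assumes a big-endian 32-bit CPU, so the high word comes first.

```c
#include <stdint.h>

/* Hypothetical stand-in for one 64-bit counter as it sits in memory on a
 * big-endian 32-bit CPU: high word at the lower address, low word 4 bytes
 * above it. Not the actual cpu_data layout. */
struct counter64 {
    uint32_t hi;
    uint32_t lo;
};

/* Increment in two 32-bit steps, mirroring the addql/addxl pair. */
static void counter64_inc(struct counter64 *c)
{
    c->lo += 1;                      /* addql #1 on the low word */
    uint32_t carry = (c->lo == 0);   /* extend flag: set only on wraparound */
    c->hi += carry;                  /* clrl + movel + addxl + movel on the high word */
}

/* Reassemble the two halves into one 64-bit value. */
static uint64_t counter64_value(const struct counter64 *c)
{
    return ((uint64_t)c->hi << 32) | c->lo;
}
```

Note that the two halves are updated separately, so anything that reads the counter between the two stores can see a torn value; the sketch makes no attempt to deal with that.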