Fastpath on x32: Every single cycle

Moritz Kroll Mon, 23 Feb 2009 12:24:12 -0800

I don't understand why, on the receive side, MR_1 and MR_2 are loaded into registers (EBX and EBP). In many cases they may be unused, and even if they are used, they could easily be fetched by the user (especially as the UTCB pointer is already available in EDI).

You may argue that MR_0 is loaded from the UTCB anyway, so MR_1 and MR_2 are already in the cache. But why does MR_0 have to be loaded if it is initially in ESI?

Here's a suggestion for what can be done (without small spaces) to keep ESI untouched (l4ka-pistachio-38b2e96fc5a6\kernel\src\glue\v4-x86\x32\trap.S):

1. Use "test $0x3f, %esi" instead of "and $0x3f, %esi; test %esi, %esi" (one byte more, but doesn't change ESI)

2. Use EDX instead of ESI for "dest" starting from label 3

3. Move "popl %ecx" to label 7 (should be unaffected by the page table switch, makes ECX free to use)

4. Use "movl %cr3, %ecx; cmpl %eax, %ecx" instead of "movl %cr3, %edx; cmpl %eax, %edx" (don't use EDX anymore)

5. Remove all "load MRs" lines (9 bytes and three memory accesses fewer)
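
To illustrate step 1, here is a small C model of the two instruction sequences. It is only a sketch: the cpu_state struct and the function names are made up for illustration and do not appear in trap.S, and I am assuming that the low six bits of MR_0 hold the untyped word count, as I read the X.2 message tag layout. The point is that both variants leave the zero flag in the same state, but only the single "test" keeps the full MR_0 in ESI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of ESI plus the zero flag; not a kernel type. */
typedef struct {
    uint32_t esi;
    bool zf;
} cpu_state;

/* Models "and $0x3f, %esi; test %esi, %esi": ESI is overwritten
 * with its low six bits before the flags are evaluated. */
static cpu_state and_then_test(uint32_t esi)
{
    cpu_state s;
    s.esi = esi & 0x3fu;
    s.zf = (s.esi == 0);
    return s;
}

/* Models "test $0x3f, %esi": the AND result is only used to set
 * the flags, ESI itself keeps its original value. */
static cpu_state test_only(uint32_t esi)
{
    cpu_state s;
    s.esi = esi;
    s.zf = ((esi & 0x3fu) == 0);
    return s;
}

/* Checks the two properties that matter for the fastpath:
 * identical zero flag, and ESI preserved by the second variant. */
static void check_variants(uint32_t mr0)
{
    cpu_state a = and_then_test(mr0);
    cpu_state b = test_only(mr0);
    assert(a.zf == b.zf);
    assert(b.esi == mr0);
}
```

Calling check_variants() with any tag value passes, so the branch following the test behaves the same in both variants, while the upper bits of MR_0 survive in ESI for the receiver.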

If you can also spare the copy of MR_0 to the destination UTCB, I think this would also save you one cache line if no untyped words are transmitted. According to the L4 X.2 reference manual, MR_0 is not mapped to memory anyway.


What do you think, does this make sense?

It would require a change to the (not stable) specification, but I don't see any reason for MR_1 and MR_2 to be received in registers on x32.

Btw, why are MR_1 and MR_2 stored in the UTCB again in L4_Ipc when they have just been loaded from there?


Best regards

Moritz
