I don't understand why MR_1 and MR_2 are loaded into registers (EBX and EBP) on the receive side. In many cases they are unused, and even when they are used, the user could easily fetch them (especially as the UTCB pointer is already available in EDI). You may argue that MR_0 is loaded from the UTCB anyway, so MR_1 and MR_2 are already in the cache. But why does MR_0 have to be loaded at all if it is initially in ESI?

Here is a suggestion for what can be done (without small spaces) to keep ESI untouched (l4ka-pistachio-38b2e96fc5a6\kernel\src\glue\v4-x86\x32\trap.S):
1. Use "test $0x3f, %esi" instead of "and $0x3f, %esi; test %esi, %esi" (one byte more, but it doesn't change ESI)
2. Use EDX instead of ESI for "dest", starting from label 3
3. Move "popl %ecx" to label 7 (it should be unaffected by the page table switch, and this makes ECX free to use)
4. Use "movl %cr3, %ecx; cmpl %eax, %ecx" instead of "movl %cr3, %edx; cmpl %eax, %edx" (so EDX is no longer used here)
5. Remove all "load MRs" lines (9 bytes and three memory accesses fewer)
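
To make steps 1 and 4 concrete, here is a small before/after sketch in GAS syntax. It contains only the instruction sequences quoted above, not the surrounding code from trap.S, so labels and context are left out. The byte counts in the comments are my own reading of the standard IA-32 encodings.

```asm
        .text

        # Step 1, current: AND masks ESI in place, so the original
        # contents of ESI are lost.        (3 + 2 = 5 bytes)
        andl    $0x3f, %esi
        testl   %esi, %esi

        # Step 1, suggested: TEST sets ZF from (ESI & 0x3f) without
        # writing ESI back.                (6 bytes, imm32 form)
        testl   $0x3f, %esi

        # Step 4, current: the CR3 comparison uses EDX as scratch.
        movl    %cr3, %edx
        cmpl    %eax, %edx

        # Step 4, suggested: use ECX as scratch instead, which assumes
        # "popl %ecx" has been moved to label 7 as in step 3.
        movl    %cr3, %ecx
        cmpl    %eax, %ecx
```

Both variants of step 1 leave ZF set exactly when the low six bits of ESI are zero, so a following conditional jump should not need to change.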

If you can also skip copying MR_0 to the destination UTCB, I think this would save one more cache line when no untyped words are transmitted. According to the L4 X.2 reference manual, MR_0 is not mapped to memory anyway.

What do you think, does this make sense?
It would require a change to the (not stable) specification, but I don't see any reason for MR_1 and MR_2 to be received in registers on x32. By the way, why are MR_1 and MR_2 stored in the UTCB again in L4_Ipc when they have just been loaded from there?

Best regards
Moritz
