On Thu, Apr 18, 2024 at 8:31 PM Gernot Heiser via Devel <devel@sel4.systems> wrote: > > I’m not completely sure what you’re actually measuring here (actually I’m > totally unsure > what your setup really is, I can’t make sense of the above description in > terms of > what the seL4 system is doing). > But measuring performance of a kernel inside a virtual machine is somewhat > pointless at best. >
I don't think I've got any way to run both QNX and seL4 on the same hardware without virtualization. I wouldn't think it would make that much of a difference here. > > In any case it seems clear that you’re *not* measuring raw seL4 IPC. > Your quoted cost of a round-trip on seL4 is say 120ms for 10,000 round trips, > or 12µs for one round trip. You don’t specify the clock rate, but that would > be > somewhere between 12,000 and 50,000 cycles, when a round-trip IPC on x86 with > MCS is <900 cycles I'm running on a Core i7-960, which is 3.2GHz. > > Note that using Send()/Recv()/Reply() is the wrong way to use seL4 IPC, see > https://microkerneldude.org/2019/03/07/how-to-and-how-not-to-use-sel4-ipc/ > However, this will only be about a factor two slower than using the correct > Call()/ReplyWait() incantation. > Somehow I said Send()/Recv()/Reply() when I meant Call()/Recv()/Send() (MCS of course doesn't even have a distinct Reply() function). Here's the exact code for the benchmark loop I'm using on seL4: void c_test_client(seL4_CPtr endpoint) { struct seL4_MessageInfo info = seL4_MessageInfo_new(0, 0, 0, 2); int j; for (j = 0; j < 10000; j++){ seL4_Call(endpoint, info); } } void c_test_server(seL4_CPtr endpoint, seL4_CPtr reply) { struct seL4_MessageInfo info = seL4_MessageInfo_new(0, 0, 0, 2); while (1){ seL4_Recv(endpoint, (void *)0, reply); seL4_Send(reply, info); } } This is being called from a Rust wrapper that allocates the endpoint/reply and reads the TSC; the inner part of it looks like this: let start = x86_64::_rdtsc(); x86_64::__cpuid(0); c_test_client(endpoint.to_cap()); let end = x86_64::_rdtsc(); x86_64::__cpuid(0); info!("test finished: {} {} {}", start, end, end - start); I'm not quite sure what would be slowing it down. I would have thought this would be measuring only the cost of a round trip IPC and nothing more. Is there possibly some kind of scheduling issue adding overhead? The QNX/Linux equivalent is: int main() { char buf[100]; int f = open("/dev/null", O_WRONLY); int i; for (i = 0; i < 10; i++) { uint64_t start = __rdtsc(); uint64_t end; int j; for (j = 0; j < 10000; j++){ if (write(f, buf, 100) != 100){ printf("cannot write\n"); exit(1); } } end = __rdtsc(); printf("cycles: %u\n", end - start); } } _______________________________________________ Devel mailing list -- devel@sel4.systems To unsubscribe send an email to devel-leave@sel4.systems