I've been looking at some quite weird behaviour with mmapped files on ffs. I want to concentrate on something else for a while, so here's a brain dump of what I've been struggling with recently, in case it rings a bell for someone or they even know the solution.
Background:

The shmif rump driver provides a networking backend using the old mmap-a-file-to-get-a-handle trick.

Observations:

Most of the time the problem is that the first 16k of the bus file gets corrupted. The underlying fs blocksize is 32k. I have verified that:

  a) it does not get written to by the involved processes, per ktrace -i
  b) processes do not overwrite random memory, by having a PROT_NONE red zone in front

This problem does not happen on tmpfs.

I don't believe there is a timing issue, because I've run the test tens of thousands of times with varying background load.

Zero-filling the bus file with write() instead of creating a sparse file with truncate doesn't make much of a difference either. I was almost sure it was a problem with the genfs "sawhole" code, but nope.

Usually after the bus has seen one generation (i.e. the pages have been faulted in to all processes) there are no further problems. However, causing (read) faults from a 3rd party process not involved with the test may trigger the problem.

The really spooky stuff:

Seems like it's possible to get two "views" into the same file depending on read/write or mmap access (whatever happened to mr. ubc???). Can someone explain this:

  > ./dumpbus-mmio -h thank-you-driver-for-getting-me-here
  bus version 2, lock: 0, generation: 431, firstoff: 0x5a95a, lastoff: 0x5a8ea
  > ./dumpbus-read -h thank-you-driver-for-getting-me-here
  dumpbus-read: thank-you-driver-for-getting-me-here not a shmif bus

i.e. same file, but the "magic" number doesn't match when not using mmap. hexdump uses read() (per ktrace), so I get the "garbage" version of the file with it and can confirm it indeed has garbage in it.

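For anyone who wants to check for the same read()-vs-mmap disagreement on an arbitrary file, here is a minimal sketch of a comparison helper. It is not part of the dumpbus tools; the name viewcmp and its return convention are made up for illustration. It reads the first len bytes with read(), maps the same range MAP_SHARED, and reports the first offset where the two views differ.

```c
#include <sys/types.h>
#include <sys/mman.h>

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: compare the first len bytes of a file as seen
 * through read() and through a shared read-only mapping.
 *
 * Returns the offset of the first differing byte, -1 if both views
 * agree, or -2 on error (including a file shorter than len).
 */
long
viewcmp(const char *path, size_t len)
{
	unsigned char *rbuf = NULL, *mbuf = MAP_FAILED;
	size_t off = 0, i;
	ssize_t n;
	long rv = -2;
	int fd;

	assert(len > 0);

	if ((fd = open(path, O_RDONLY)) == -1)
		return -2;
	if ((rbuf = malloc(len)) == NULL)
		goto out;

	/* View 1: plain read(), looping over short reads. */
	while (off < len) {
		n = read(fd, rbuf + off, len - off);
		if (n <= 0)
			goto out;	/* error, or EOF before len bytes */
		off += (size_t)n;
	}

	/* View 2: shared mapping of the same range. */
	mbuf = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (mbuf == MAP_FAILED)
		goto out;

	rv = -1;
	for (i = 0; i < len; i++) {
		if (rbuf[i] != mbuf[i]) {
			rv = (long)i;
			break;
		}
	}
out:
	if (mbuf != MAP_FAILED)
		munmap(mbuf, len);
	free(rbuf);
	close(fd);
	return rv;
}
```

Calling viewcmp() on the bus file with a length of 32k (one fs block) from a small main() of your own should return -1 on a healthy system. Any other non-negative return value is the offset at which the read() view and the mmap view diverge.
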
The only difference between dumpbus-mmio and dumpbus-read is this:

  #if 1
          read(fd, buf, BUFSIZE);
          bmem = (void *)buf;
  #else
          busmem = mmap(NULL, sb.st_size, PROT_READ,
              MAP_FILE|MAP_SHARED, fd, 0);
          if (busmem == MAP_FAILED)
                  err(1, "mmap");
          bmem = busmem;
  #endif

However, I can restore the old version using cp (since it uses mmio):

  > ./dumpbus-read -h thank-you-driver-for-getting-me-here
  dumpbus-read: thank-you-driver-for-getting-me-here not a shmif bus
  > cp thank-you-driver-for-getting-me-here backup
  > ./dumpbus-read -h backup
  bus version 2, lock: 0, generation: 431, firstoff: 0x5a95a, lastoff: 0x5a8ea

How-to-repeat:

Get tests/net/icmp from -current and run "./t_ping floodping" in a loop from a directory on ffs. You should see the problem within a few thousand iterations. Most likely the shmif code will encounter an invariant failure, such as:

  panic: kernel diagnostic assertion "busmem->shm_magic == SHMIF_MAGIC" failed: file "if_shmem.c", line 391

I plan to update to latest -STABLE soon and see if the problem is still present there. Guess I'll reboot now...