On KernelTrap there is a story about a Linux kernel memory allocation debugging patch that detects reads from uninitialized memory (http://kerneltrap.org/Linux/Debugging_With_kmemcheck).
The patch takes half of the memory and slows down the system. I think Qemu could be used instead. A channel (IO/MMIO) is created between the memory allocator in the target kernel and Qemu running on the host. The memory allocator reports each allocated area to Qemu over the channel. Qemu changes the physical memory mapping for that area to special memory that reports any read that happens before a write back to the allocator. Writes change the memory back to standard RAM. Performance would be comparable to Qemu in general, and the host kernel plus Qemu take only a few MB of memory. The system would be directly usable for other OSes as well. A similar debugging tool could be used in user space too (by instrumenting libc malloc/free), but that's probably reinventing Valgrind or other malloc checkers. The special memory could also report unaligned accesses, even on targets where they are normally not detected but are less efficient.
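The read-before-write tracking described above can be illustrated with a toy model. This is only a sketch of the idea, not Qemu code: the class `TrackedMemory` and its methods are hypothetical names, and the per-byte shadow map is an assumption made for simplicity (a real implementation would more likely work at the granularity of the remapped memory regions).

```python
class TrackedMemory:
    """Toy model of guest RAM with a per-byte 'initialized' shadow map.

    Hypothetical illustration only; nothing here is a Qemu API.
    """

    def __init__(self, size):
        self.data = bytearray(size)
        # All memory starts out as ordinary RAM (treated as initialized).
        self.initialized = bytearray(b"\x01" * size)
        # Addresses of reads that hit uninitialized bytes.
        self.reports = []

    def notify_alloc(self, addr, length):
        """Stand-in for the allocator's message over the IO/MMIO channel."""
        if addr < 0 or length < 0 or addr + length > len(self.data):
            raise ValueError("allocation outside of memory")
        # Mark the freshly allocated area as 'special' (uninitialized).
        self.initialized[addr:addr + length] = bytes(length)

    def write(self, addr, value):
        self.data[addr] = value
        # A write turns the byte back into standard RAM.
        self.initialized[addr] = 1

    def read(self, addr):
        if not self.initialized[addr]:
            # In the proposal this would be reported back to the allocator.
            self.reports.append(addr)
        return self.data[addr]


mem = TrackedMemory(64)
mem.notify_alloc(16, 8)   # allocator announces a new 8-byte area
mem.write(16, 0xAB)
mem.read(16)              # fine: written before being read
mem.read(17)              # read before write: gets reported
print(mem.reports)        # [17]
```

In this model only the first access pattern matters: once a byte has been written it behaves like normal RAM again, which matches the idea that writes change the memory back to standard RAM.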