Re: [Qemu-devel] Kernel memory allocation debugging with Qemu
On Friday 08 February 2008, Blue Swirl wrote:
> On 2/8/08, Paul Brook <[EMAIL PROTECTED]> wrote:
> > > The patch takes half of the memory and slows down the system. I
> > > think Qemu could be used instead. A channel (IO/MMIO) is created
> > > between the memory allocator in the target kernel and Qemu running
> > > in the host. The memory allocator tells Qemu about the allocated
> > > area using the channel. Qemu changes the physical memory mapping for
> > > the area to special memory that will report any reads-before-writes
> > > back to the allocator. Writes change the memory back to standard
> > > RAM. The performance would be comparable to Qemu in general, and the
> > > host kernel + Qemu only take a few MB of memory. The system would be
> > > directly usable for other OSes as well.
> >
> > The qemu implementation isn't actually any more space efficient than
> > the in-kernel implementation. You still need the same amount of
> > bookkeeping ram. In both cases it should be possible to reduce the
> > overhead from 1/2 to 1/9 by using a bitmask rather than whole bytes.
>
> Qemu would not track all memory, only the regions that kmalloc() has
> given out to the rest of the kernel and that have not yet been written
> to.

Memory still has to be tracked after it has been written to. You can only
stop tracking after the whole page has been written to, and there's no
easy way to determine when that is. The kernel actually has better
information about this because it can replace the clear/copy_page
routines. If you're only trying to track things with page granularity,
then that's a much easier problem.

> > Performance is less clear. A qemu implementation probably causes less
> > relative slowdown than an in-kernel implementation. However it's still
> > going to be significantly slower than normal qemu. Remember that any
> > checked access is going to have to go through the slow case in the TLB
> > lookup. Any optimizations that are applicable to one implementation
> > can probably also be applied to the other.
>
> Again, we are not trapping all accesses. The fast case should be used
> for most kernel accesses and all of userland.

Ok. So all the accesses that the in-kernel implementation intercepts.
That's obviously a significant number; if it wasn't, then performance
wouldn't matter. The number of accesses intercepted and the amount of
bookkeeping required should be the same in both cases. The only
difference is the runtime overhead when an access is intercepted: qemu
goes through the slow-path softmmu routines, while the in-kernel
implementation takes a pagefault+singlestep.

Paul
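To make the page-granularity point concrete, here is a minimal, purely
illustrative sketch of one way a tracker could know when a page has been
fully written: a per-byte shadow bit filters out rewrites, and a per-page
counter of first-time writes tells the emulator when the page can be
remapped as ordinary RAM and leave the slow path. The names, types and
page size below are assumptions, not code from either implementation.

#include <stddef.h>
#include <stdint.h>

#define TRACKED_PAGE_SIZE 4096

/* Allocation of these arrays (one bit per byte of tracked RAM, one
 * counter per tracked page) is omitted from this fragment. */
static uint8_t  *shadow_bits;
static uint16_t *page_written_bytes;

/* Called from the checked-write path; assumes the access does not cross
 * a page boundary.  Returns 1 once every byte of the page has been
 * written at least once, i.e. checked accesses are no longer needed for
 * that page. */
static int page_note_write(uintptr_t paddr, size_t len)
{
    size_t page = paddr / TRACKED_PAGE_SIZE;

    for (size_t i = 0; i < len; i++) {
        uintptr_t a = paddr + i;
        /* Count a byte only on its first write; rewrites are filtered
         * out by the per-byte shadow bit. */
        if (!(shadow_bits[a >> 3] & (1u << (a & 7)))) {
            shadow_bits[a >> 3] |= 1u << (a & 7);
            page_written_bytes[page]++;
        }
    }
    return page_written_bytes[page] == TRACKED_PAGE_SIZE;
}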
Re: [Qemu-devel] Kernel memory allocation debugging with Qemu
On 2/8/08, Paul Brook <[EMAIL PROTECTED]> wrote:
> > The patch takes half of the memory and slows down the system. I think
> > Qemu could be used instead. A channel (IO/MMIO) is created between the
> > memory allocator in the target kernel and Qemu running in the host.
> > The memory allocator tells Qemu about the allocated area using the
> > channel. Qemu changes the physical memory mapping for the area to
> > special memory that will report any reads-before-writes back to the
> > allocator. Writes change the memory back to standard RAM. The
> > performance would be comparable to Qemu in general, and the host
> > kernel + Qemu only take a few MB of memory. The system would be
> > directly usable for other OSes as well.
>
> The qemu implementation isn't actually any more space efficient than the
> in-kernel implementation. You still need the same amount of bookkeeping
> ram. In both cases it should be possible to reduce the overhead from 1/2
> to 1/9 by using a bitmask rather than whole bytes.

Qemu would not track all memory, only the regions that kmalloc() has
given out to the rest of the kernel and that have not yet been written
to.

> Performance is less clear. A qemu implementation probably causes less
> relative slowdown than an in-kernel implementation. However it's still
> going to be significantly slower than normal qemu. Remember that any
> checked access is going to have to go through the slow case in the TLB
> lookup. Any optimizations that are applicable to one implementation can
> probably also be applied to the other.

Again, we are not trapping all accesses. The fast case should be used for
most kernel accesses and all of userland.

> Given qemu is significantly slower to start with, and depending on the
> overhead of taking the page fault, it might not end up much better
> overall. A KVM implementation would most likely be slower than the
> in-kernel one.
>
> That said, it may be an interesting thing to play with. In practice it's
> probably most useful to generate an interrupt and report back to the
> guest OS, rather than having qemu report faults directly.

The access could happen while interrupts are disabled, so a buffer would
be needed. The accesses could also be written to a block device seen by
both Qemu and the kernel, or appear to arrive from a fake network device.
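One way such buffering could be structured, sketched with an invented
record layout, names and sizes (nothing here is an existing Qemu or
kernel interface): the emulator appends a record for each detected
read-before-write into a ring shared with the guest, and the guest drains
it once it is safe to do so; the same records could instead be exposed
through the fake block or network device suggested above.

#include <stdint.h>

#define REPORT_RING_ENTRIES 256

/* One record per detected read-before-write. */
struct uninit_read_report {
    uint64_t guest_paddr;   /* physical address that was read */
    uint64_t guest_pc;      /* program counter of the faulting instruction */
    uint32_t access_size;   /* 1, 2, 4 or 8 bytes */
    uint32_t pad;
};

/* Single-producer (emulator) / single-consumer (guest) ring; memory
 * ordering and cache coherency concerns are ignored in this sketch. */
struct report_ring {
    volatile uint32_t head;   /* advanced by the emulator */
    volatile uint32_t tail;   /* advanced by the guest when it drains */
    struct uninit_read_report slot[REPORT_RING_ENTRIES];
};

/* Emulator side: queue a report, dropping it if the guest has not yet
 * drained the ring. */
static int ring_push(struct report_ring *r,
                     const struct uninit_read_report *rep)
{
    uint32_t next = (r->head + 1) % REPORT_RING_ENTRIES;

    if (next == r->tail) {
        return -1;            /* ring full, report lost */
    }
    r->slot[r->head] = *rep;
    r->head = next;
    return 0;
}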
Re: [Qemu-devel] Kernel memory allocation debugging with Qemu
> The patch takes half of the memory and slows down the system. I think
> Qemu could be used instead. A channel (IO/MMIO) is created between the
> memory allocator in the target kernel and Qemu running in the host. The
> memory allocator tells Qemu about the allocated area using the channel.
> Qemu changes the physical memory mapping for the area to special memory
> that will report any reads-before-writes back to the allocator. Writes
> change the memory back to standard RAM. The performance would be
> comparable to Qemu in general, and the host kernel + Qemu only take a
> few MB of memory. The system would be directly usable for other OSes as
> well.

The qemu implementation isn't actually any more space efficient than the
in-kernel implementation. You still need the same amount of bookkeeping
ram. In both cases it should be possible to reduce the overhead from 1/2
to 1/9 by using a bitmask rather than whole bytes.

Performance is less clear. A qemu implementation probably causes less
relative slowdown than an in-kernel implementation. However it's still
going to be significantly slower than normal qemu. Remember that any
checked access is going to have to go through the slow case in the TLB
lookup. Any optimizations that are applicable to one implementation can
probably also be applied to the other.

Given qemu is significantly slower to start with, and depending on the
overhead of taking the page fault, it might not end up much better
overall. A KVM implementation would most likely be slower than the
in-kernel one.

That said, it may be an interesting thing to play with. In practice it's
probably most useful to generate an interrupt and report back to the
guest OS, rather than having qemu report faults directly.

Paul
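To make the 1/2 versus 1/9 figure concrete: one shadow byte per byte of
tracked RAM doubles memory use, so the bookkeeping is half of the
combined total, while one shadow bit per byte needs only ram_size/8, i.e.
one ninth of the combined total. Below is a minimal sketch of the
bit-per-byte variant; the names are invented and this is not the code of
either implementation.

#include <stdint.h>
#include <stdlib.h>

static uint8_t *shadow_bits;   /* one "written" bit per byte of guest RAM */

static int shadow_init(size_t ram_size)
{
    /* calloc leaves every bit clear, i.e. every byte "uninitialized". */
    shadow_bits = calloc((ram_size + 7) / 8, 1);
    return shadow_bits ? 0 : -1;
}

/* A write makes the touched bytes ordinary, initialized RAM again. */
static void shadow_mark_written(uintptr_t paddr, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        shadow_bits[(paddr + i) >> 3] |= 1u << ((paddr + i) & 7);
    }
}

/* A read of any still-clear bit is a read-before-write; returns 0 when
 * the access should be reported. */
static int shadow_check_read(uintptr_t paddr, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (!(shadow_bits[(paddr + i) >> 3] & (1u << ((paddr + i) & 7)))) {
            return 0;
        }
    }
    return 1;
}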
[Qemu-devel] Kernel memory allocation debugging with Qemu
On KernelTrap there is a story about a Linux kernel memory allocation
debugging patch that allows detection of reads from uninitialized memory
(http://kerneltrap.org/Linux/Debugging_With_kmemcheck). The patch takes
half of the memory and slows down the system. I think Qemu could be used
instead. A channel (IO/MMIO) is created between the memory allocator in
the target kernel and Qemu running in the host. The memory allocator
tells Qemu about the allocated area using the channel. Qemu changes the
physical memory mapping for the area to special memory that will report
any reads-before-writes back to the allocator. Writes change the memory
back to standard RAM. The performance would be comparable to Qemu in
general, and the host kernel + Qemu only take a few MB of memory. The
system would be directly usable for other OSes as well.

A similar debugging tool could be used in user space too (instrumenting
libc malloc/free), but that's probably reinventing Valgrind or other
malloc checkers. The special memory could also report unaligned accesses,
even on targets where this is normally not detected, though not very
efficiently.
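A minimal sketch of what the guest-allocator side of such an IO/MMIO
channel could look like, assuming a 64-bit target. The base address,
register layout and function names are invented purely for illustration;
they are not part of any existing QEMU device or kernel interface.

#include <linux/init.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/types.h>

#define KMEMCHECK_MMIO_BASE  0xfe001000UL  /* made-up physical address */
#define KMEMCHECK_REG_ADDR   0x00          /* physical start of the region */
#define KMEMCHECK_REG_SIZE   0x08          /* length of the region in bytes */
#define KMEMCHECK_REG_CMD    0x10          /* 1 = start tracking, 0 = stop */

static void __iomem *kmemcheck_mmio;

static int __init kmemcheck_channel_init(void)
{
        kmemcheck_mmio = ioremap(KMEMCHECK_MMIO_BASE, 0x18);
        return kmemcheck_mmio ? 0 : -ENOMEM;
}

/* Called by the allocator after handing out (or freeing) a region:
 * writing the CMD register last commits the request to the emulated
 * device, which then remaps the range to "checked" memory or back to
 * ordinary RAM. */
static void kmemcheck_track(phys_addr_t start, size_t len, int track)
{
        writeq(start, kmemcheck_mmio + KMEMCHECK_REG_ADDR);
        writeq(len, kmemcheck_mmio + KMEMCHECK_REG_SIZE);
        writeq(track, kmemcheck_mmio + KMEMCHECK_REG_CMD);
}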