Found the issue: during a resolution change, Windows 7 sometimes switches to an intermediate resolution where server_stride % cmp_bytes != 0. The memory corruption happens where the guest fb is copied to the server fb, because the last block of a row then extends past the end of that row.

It can easily be fixed by truncating cmp_bytes in vnc_refresh_server_surface. However, looking at the code, it seems that none of the encoders called in vnc_send_framebuffer_update handles the case w > pixman_image_get_width(vd->server). I will send a patch that removes all DIV_ROUND_UPs for now to avoid the corruption.

There are almost no real resolutions out there where width % 16 != 0. If we find some, we might need to either decrease VNC_DIRTY_PIXELS_PER_BIT or make it dynamic depending on the resolution.
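To illustrate the overrun, here is a minimal sketch. It is not the actual vnc.c code: it assumes 16 pixels per dirty bit, a 32 bpp server fb, and a stride of exactly width * 4 bytes with no row padding. The function names and the example width are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DIRTY_PIXELS_PER_BIT 16  /* assumed; stands in for VNC_DIRTY_PIXELS_PER_BIT */
#define BYTES_PER_PIXEL 4        /* assumed 32 bpp server framebuffer */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Bytes one row would span if every dirty block used the full cmp_bytes. */
static size_t row_span_unclamped(size_t width)
{
    size_t cmp_bytes = DIRTY_PIXELS_PER_BIT * BYTES_PER_PIXEL;

    return DIV_ROUND_UP(width, DIRTY_PIXELS_PER_BIT) * cmp_bytes;
}

/*
 * Compare and copy one row block by block, clamping the last block to
 * the end of the row. Returns the number of bytes actually copied.
 */
static size_t copy_row_clamped(unsigned char *server,
                               const unsigned char *guest, size_t width)
{
    size_t stride = width * BYTES_PER_PIXEL; /* assumed: no row padding */
    size_t cmp_bytes = DIRTY_PIXELS_PER_BIT * BYTES_PER_PIXEL;
    size_t blocks = DIV_ROUND_UP(width, DIRTY_PIXELS_PER_BIT);
    size_t copied = 0;

    for (size_t x = 0; x < blocks; x++) {
        size_t off = x * cmp_bytes;
        size_t n = cmp_bytes;

        assert(off < stride);
        if (off + n > stride) {
            n = stride - off; /* truncate the final, partial block */
        }
        if (memcmp(server + off, guest + off, n) != 0) {
            memcpy(server + off, guest + off, n);
            copied += n;
        }
    }
    return copied;
}
```

For a hypothetical width of 1000 the stride is 4000 bytes, but 63 blocks of 64 bytes span 4032 bytes, so an unclamped memcmp/memcpy on the last block would run 32 bytes past the row. On the last row that is past the end of the allocation, which would match the "0 bytes after a block" reports from valgrind.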
Peter

On 26.06.2014 17:44, Peter Lieven wrote:
> Hi all,
>
> while playing around with the vmware vga driver I noticed that there seems
> to be a race condition when the resolution is changed. I was able to trigger
> this also with std vga. Attached valgrind produces always an output similar
> to this:
>
> ==3346== Thread 1:
> ==3346== Invalid read of size 8
> ==3346==    at 0x4C2D108: memcpy@@GLIBC_2.14 (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x12555180 is not stack'd, malloc'd or (recently) free'd
> ==3346==
> ==3346== Invalid write of size 8
> ==3346==    at 0x4C2D10D: memcpy@@GLIBC_2.14 (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x15731080 is not stack'd, malloc'd or (recently) free'd
> ==3346==
> ==3346== Invalid read of size 8
> ==3346==    at 0x4C2D11A: memcpy@@GLIBC_2.14 (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x12555170 is not stack'd, malloc'd or (recently) free'd
> ==3346==
> ==3346== Invalid read of size 1
> ==3346==    at 0x4C2DCC0: bcmp (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400D91: vnc_refresh_server_surface (vnc.c:2720)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x15731050 is 0 bytes after a block of size 196,560 alloc'd
> ==3346==    at 0x4C29DB4: calloc (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x70C8B1A: ??? (in
> /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.30.2)
> ==3346==    by 0x70C8BF4: ??? (in
> /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.30.2)
> ==3346==    by 0x3FAECC: vnc_dpy_switch (vnc.c:590)
> ==3346==    by 0x3DA87C: dpy_gfx_replace_surface (console.c:1404)
> ==3346==    by 0x3DBCF0: qemu_console_resize (console.c:1857)
> ==3346==    by 0x450A39: vga_draw_text (vga.c:1344)
> ==3346==    by 0x4521B0: vga_update_display (vga.c:1910)
> ==3346==    by 0x2A665B: vmsvga_update_display (vmware_vga.c:1071)
> ==3346==    by 0x3D7087: graphic_hw_update (console.c:256)
> ==3346==    by 0x400EE3: vnc_refresh (vnc.c:2746)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==
> ==3346== Invalid read of size 1
> ==3346==    at 0x4C2DCC6: bcmp (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400D91: vnc_refresh_server_surface (vnc.c:2720)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x12555150 is 0 bytes after a block of size 196,560 alloc'd
> ==3346==    at 0x4C29DB4: calloc (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x70C8B1A: ??? (in
> /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.30.2)
> ==3346==    by 0x70C8BF4: ??? (in
> /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.30.2)
> ==3346==    by 0x3D9F22: qemu_alloc_display (console.c:1224)
> ==3346==    by 0x3DA017: qemu_create_displaysurface (console.c:1241)
> ==3346==    by 0x3DBCD9: qemu_console_resize (console.c:1856)
> ==3346==    by 0x450A39: vga_draw_text (vga.c:1344)
> ==3346==    by 0x4521B0: vga_update_display (vga.c:1910)
> ==3346==    by 0x2A665B: vmsvga_update_display (vmware_vga.c:1071)
> ==3346==    by 0x3D7087: graphic_hw_update (console.c:256)
> ==3346==    by 0x400EE3: vnc_refresh (vnc.c:2746)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==
>
> valgrind: m_mallocfree.c:288 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi'
> failed.
> valgrind: Heap block lo/hi size mismatch: lo = 704145, hi = 0.
> This is probably caused by your program erroneously writing past the
> end of a heap block and corrupting heap metadata. If you fix any
> invalid writes reported by Memcheck, this assertion failure will
> probably go away. Please try that before reporting this as a bug.
>
> ==3346==    at 0x3804CA36: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
> ==3346==    by 0x3804CBDC: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
> ==3346==    by 0x38057FB0: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
> ==3346==    by 0x38058F6E: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
> ==3346==    by 0x3802144C: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
> ==3346==    by 0x38021A80: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
> ==3346==    by 0x38021C6A: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
> ==3346==    by 0x380902A7: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
> ==3346==    by 0x3809F7D5: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
>
> sched status:
> running_tid=1
>
> Thread 1: status = VgTs_Runnable
> ==3346==    at 0x4C2B6CD: malloc (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x409CE2: malloc_and_trace (vl.c:2845)
> ==3346==    by 0x54C1A38: g_malloc (in
> /lib/x86_64-linux-gnu/libglib-2.0.so.0.3200.4)
> ==3346==    by 0x444761: virtio_blk_alloc_request (virtio-blk.c:107)
> ==3346==    by 0x4447CF: virtio_blk_get_request (virtio-blk.c:116)
> ==3346==    by 0x4453BD: virtio_blk_handle_output (virtio-blk.c:412)
> ==3346==    by 0x49070F: virtio_queue_notify_vq (virtio.c:723)
> ==3346==    by 0x4921C9: virtio_queue_host_notifier_read (virtio.c:1119)
> ==3346==    by 0x3638FF: qemu_iohandler_poll (iohandler.c:143)
> ==3346==    by 0x3649B1: main_loop_wait (main-loop.c:485)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
>
> Thread 2: status = VgTs_WaitSys
> ==3346==    at 0x7B58C67: ioctl (syscall-template.S:82)
> ==3346==    by 0x497C5F: kvm_vcpu_ioctl (kvm-all.c:1790)
> ==3346==    by 0x497658: kvm_cpu_exec (kvm-all.c:1675)
> ==3346==    by 0x416EAE: qemu_kvm_cpu_thread_fn (cpus.c:873)
> ==3346==    by 0x7856E99: start_thread (pthread_create.c:308)
> ==3346==    by 0x7B603FC: clone (clone.S:112)
>
> Thread 3: status = VgTs_WaitSys
> ==3346==    at 0x7B58C67: ioctl (syscall-template.S:82)
> ==3346==    by 0x497C5F: kvm_vcpu_ioctl (kvm-all.c:1790)
> ==3346==    by 0x497658: kvm_cpu_exec (kvm-all.c:1675)
> ==3346==    by 0x416EAE: qemu_kvm_cpu_thread_fn (cpus.c:873)
> ==3346==    by 0x7856E99: start_thread (pthread_create.c:308)
> ==3346==    by 0x7B603FC: clone (clone.S:112)
>
> Thread 4: status = VgTs_WaitSys
> ==3346==    at 0x785AD84: pthread_cond_wait@@GLIBC_2.3.2
> (pthread_cond_wait.S:162)
> ==3346==    by 0x54887D: qemu_cond_wait (qemu-thread-posix.c:135)
> ==3346==    by 0x3F691D: vnc_worker_thread_loop (vnc-jobs.c:222)
> ==3346==    by 0x3F6E80: vnc_worker_thread (vnc-jobs.c:323)
> ==3346==    by 0x7856E99: start_thread (pthread_create.c:308)
> ==3346==    by 0x7B603FC: clone (clone.S:112)
>
>
> I tried if a lock around vnc_dpy_switch helps because I was thinking that
> vnc_refresh_server_surface
> was running while vnc_dpy_switch was triggered, but it seemed not to help.
>
> Any ideas?
>
> Peter