Re: [Qemu-devel] qemu 1.6.1
Am 27.10.2013 07:54, schrieb Paolo Bonzini: > > I think the problem is that 0x20(%esp) gets somehow corrupted at the > instruction I highlighted with **. > > The simplest fix then would be to add a barrier() before and after > SwitchToFiber. > > Paolo I added some debugging output (see code at http://qemu.weilnetz.de/test/coroutine-win32/). The result with pointer values replaced by function names is shown below. There are two threads (WinMain, qemu_tcg_cpu_thread_fn) and one coroutine (bdrv_rw_co_entry). Stefan qemu_coroutine_self:WinMain qemu_in_coroutine:WinMain,null qemu_coroutine_new:bdrv_rw_co_entry qemu_coroutine_create,co=bdrv_rw_co_entry qemu_coroutine_enter(bdrv_rw_co_entry,...),self=WinMain coroutine_swap(WinMain,bdrv_rw_co_entry) qemu_coroutine_yield,self=bdrv_rw_co_entry,to=WinMain coroutine_swap(bdrv_rw_co_entry,WinMain) qemu_coroutine_enter(bdrv_rw_co_entry,...),self=WinMain coroutine_swap(WinMain,bdrv_rw_co_entry) qemu_in_coroutine:bdrv_rw_co_entry,WinMain # active coroutine bdrv_rw_co_entry is deleted coroutine_delete(bdrv_rw_co_entry) qemu_coroutine_self:qemu_tcg_cpu_thread_fn qemu_in_coroutine:qemu_tcg_cpu_thread_fn,null qemu_coroutine_create,co=bdrv_rw_co_entry qemu_coroutine_enter(bdrv_rw_co_entry,...),self=qemu_tcg_cpu_thread_fn coroutine_swap(qemu_tcg_cpu_thread_fn,bdrv_rw_co_entry) qemu_coroutine_yield,self=bdrv_rw_co_entry,to=qemu_tcg_cpu_thread_fn coroutine_swap(bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn) qemu_coroutine_enter(bdrv_rw_co_entry,...),self=qemu_tcg_cpu_thread_fn coroutine_swap(qemu_tcg_cpu_thread_fn,bdrv_rw_co_entry) qemu_in_coroutine:bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn qemu_in_coroutine:bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn # active coroutine bdrv_rw_co_entry is deleted coroutine_delete(bdrv_rw_co_entry) # now we are still in deleted coroutine bdrv_rw_co_entry qemu_in_coroutine:bdrv_rw_co_entry,null qemu_coroutine_create,co=bdrv_rw_co_entry qemu_coroutine_enter(bdrv_rw_co_entry,...),self=bdrv_rw_co_entry coroutine_swap(bdrv_rw_co_entry,bdrv_rw_co_entry) qemu_coroutine_yield,self=bdrv_rw_co_entry,to=bdrv_rw_co_entry coroutine_swap(bdrv_rw_co_entry,bdrv_rw_co_entry) qemu_in_coroutine:bdrv_rw_co_entry,null
Re: [Qemu-devel] qemu 1.6.1
Am 27.10.2013 07:54, schrieb Paolo Bonzini: > Here is the code with annotations > > broken works > - > push %ebx > sub$0x18,%espsub$0x1c,%esp > mov%ebx,0x14(%esp) > mov%esi,0x18(%esp) > > movl $0x6d62a8,(%esp) movl $0x6d62a8,(%esp) > mov0x24(%esp),%ebx mov0x24(%esp),%ebx > ebx = to; > call ___emutls_get_address call ___emutls_get_address > eax = ¤t; > > mov(%eax),%esi > esi = current; > > mov%ebx,(%eax) mov%ebx,(%eax) > current = to; > > mov0x28(%esp),%eax mov0x28(%esp),%eax > eax = action > mov%eax,0x24(%ebx) mov%eax,0x24(%ebx) > to->action = action > mov0x20(%ebx),%eax mov0x20(%ebx),%eax > eax = to->fiber > mov%eax,(%esp) mov%eax,(%esp) > "push" to->fiber > call *0x835fc0 call *0x835fc0 > SwitchToFiber(to->fiber) > sub$0x4,%esp sub$0x4,%esp > undo PASCAL calling convention > > ** mov0x20(%esp),%eax > eax = from > mov0x24(%eax),%eax mov0x24(%esi),%eax > eax = from->action > > mov0x14(%esp),%ebx > mov0x18(%esp),%esi > add$0x18,%espadd$0x1c,%esp > pop%ebx > ret ret > > > I think the problem is that 0x20(%esp) gets somehow corrupted at the > instruction I highlighted with **. > > The simplest fix then would be to add a barrier() before and after > SwitchToFiber. > > Paolo I tried adding two barrier() statements around SwitchToFiber(). That change did not result in different assembler code (=> unchanged behaviour, QEMU still raises an assertion). Stefan
Re: [Qemu-devel] qemu 1.6.1
Il 26/10/2013 11:51, Stefan Weil ha scritto: > Am 24.10.2013 23:47, schrieb Paolo Bonzini: >> Il 24/10/2013 17:37, Stefan Weil ha scritto: >>> Yes, that works, too. It also fixes the problem with the assertion >>> (tested with Wine). >>> >>> No, we cannot remove from_, because the same interface is also used >>> for Linux and other hosts which don't have a 'current' variable. >>> Or we would have to call qemu_coroutine_self() to get the current >>> coroutine. >> Yes, I was thinking of using qemu_coroutine_self(). >> >> By the way, can you post the two assembly language outputs for just >> >> - CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_); >> + CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); >> >> which AIUI works and is enough to fix the bug? >> >> Paolo > > See disassembled code below. I removed compiler option -fstack-protector-all > to simplify the assembler code and tested that the result was not affected > by this removal. > > The C and assembler code from the test is also available at > http://qemu.weilnetz.de/test/coroutine-win32/. Here is the code with annotations broken works - push %ebx sub$0x18,%espsub$0x1c,%esp mov%ebx,0x14(%esp) mov%esi,0x18(%esp) movl $0x6d62a8,(%esp) movl $0x6d62a8,(%esp) mov0x24(%esp),%ebx mov0x24(%esp),%ebx ebx = to; call ___emutls_get_address call ___emutls_get_address eax = ¤t; mov(%eax),%esi esi = current; mov%ebx,(%eax) mov%ebx,(%eax) current = to; mov0x28(%esp),%eax mov0x28(%esp),%eax eax = action mov%eax,0x24(%ebx) mov%eax,0x24(%ebx) to->action = action mov0x20(%ebx),%eax mov0x20(%ebx),%eax eax = to->fiber mov%eax,(%esp) mov%eax,(%esp) "push" to->fiber call *0x835fc0 call *0x835fc0 SwitchToFiber(to->fiber) sub$0x4,%esp sub$0x4,%esp undo PASCAL calling convention ** mov0x20(%esp),%eax eax = from mov0x24(%eax),%eax mov0x24(%esi),%eax eax = from->action mov0x14(%esp),%ebx mov0x18(%esp),%esi add$0x18,%espadd$0x1c,%esp pop%ebx ret ret I think the problem is that 0x20(%esp) gets somehow corrupted at the instruction I highlighted with **. The simplest fix then would be to add a barrier() before and after SwitchToFiber. Paolo
Re: [Qemu-devel] qemu 1.6.1
Am 24.10.2013 23:47, schrieb Paolo Bonzini: > Il 24/10/2013 17:37, Stefan Weil ha scritto: >> Yes, that works, too. It also fixes the problem with the assertion >> (tested with Wine). >> >> No, we cannot remove from_, because the same interface is also used >> for Linux and other hosts which don't have a 'current' variable. >> Or we would have to call qemu_coroutine_self() to get the current >> coroutine. > Yes, I was thinking of using qemu_coroutine_self(). > > By the way, can you post the two assembly language outputs for just > > - CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_); > + CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); > > which AIUI works and is enough to fix the bug? > > Paolo See disassembled code below. I removed compiler option -fstack-protector-all to simplify the assembler code and tested that the result was not affected by this removal. The C and assembler code from the test is also available at http://qemu.weilnetz.de/test/coroutine-win32/. Stefan unpatched QEMU, crash with assertion 00448670 <_qemu_coroutine_switch>: 448670: 53 push %ebx 448671: 83 ec 18sub$0x18,%esp 448674: c7 04 24 a8 62 6d 00movl $0x6d62a8,(%esp) 44867b: 8b 5c 24 24 mov0x24(%esp),%ebx 44867f: e8 ec 9e 27 00 call 6c2570 <___emutls_get_address> 448684: 89 18 mov%ebx,(%eax) 448686: 8b 44 24 28 mov0x28(%esp),%eax 44868a: 89 43 24mov%eax,0x24(%ebx) 44868d: 8b 43 20mov0x20(%ebx),%eax 448690: 89 04 24mov%eax,(%esp) 448693: ff 15 c0 5f 83 00 call *0x835fc0 448699: 83 ec 04sub$0x4,%esp 44869c: 8b 44 24 20 mov0x20(%esp),%eax 4486a0: 8b 40 24mov0x24(%eax),%eax 4486a3: 83 c4 18add$0x18,%esp 4486a6: 5b pop%ebx 4486a7: c3 ret patched, works 00448620 <_qemu_coroutine_switch>: 448620: 83 ec 1csub$0x1c,%esp 448623: c7 04 24 a8 62 6d 00movl $0x6d62a8,(%esp) 44862a: 89 5c 24 14 mov%ebx,0x14(%esp) 44862e: 8b 5c 24 24 mov0x24(%esp),%ebx 448632: 89 74 24 18 mov%esi,0x18(%esp) 448636: e8 25 9f 27 00 call 6c2560 <___emutls_get_address> 44863b: 8b 30 mov(%eax),%esi 44863d: 89 18 mov%ebx,(%eax) 44863f: 8b 44 24 28 mov0x28(%esp),%eax 448643: 89 43 24mov%eax,0x24(%ebx) 448646: 8b 43 20mov0x20(%ebx),%eax 448649: 89 04 24mov%eax,(%esp) 44864c: ff 15 c0 5f 83 00 call *0x835fc0 448652: 8b 46 24mov0x24(%esi),%eax 448655: 83 ec 04sub$0x4,%esp 448658: 8b 5c 24 14 mov0x14(%esp),%ebx 44865c: 8b 74 24 18 mov0x18(%esp),%esi 448660: 83 c4 1cadd$0x1c,%esp 448663: c3 ret
Re: [Qemu-devel] qemu 1.6.1
Il 24/10/2013 17:37, Stefan Weil ha scritto: > Yes, that works, too. It also fixes the problem with the assertion > (tested with Wine). > > No, we cannot remove from_, because the same interface is also used > for Linux and other hosts which don't have a 'current' variable. > Or we would have to call qemu_coroutine_self() to get the current > coroutine. Yes, I was thinking of using qemu_coroutine_self(). By the way, can you post the two assembly language outputs for just - CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_); + CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); which AIUI works and is enough to fix the bug? Paolo
Re: [Qemu-devel] qemu 1.6.1
Am 24.10.2013 12:38, schrieb Paolo Bonzini: > Il 23/10/2013 21:26, Stefan Weil ha scritto: >> Am 23.10.2013 11:00, schrieb Paolo Bonzini: >>> Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto: Hi, My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when booting from an install CD. C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly. No further info at this stage. Any ideas? >>> It's a known problem that not everyone can reproduce. Please compile >>> with --disable-coroutine-pool on the configure command line. >>> >>> Paolo >> This patch also helps (at least for me, tested native and on Linux / Wine): >> http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47 >> >> It looks like a compiler problem related to thread local storage >> (variable "current"). > Ugh. > >> I recently got several bug reports from a Windows user and included >> patches to fix them in >> my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on >> qemu.weilnetz.de >> are based on that tree. > Does something like > > CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); > > also work? Then we can just remove from_. > > Paolo Yes, that works, too. It also fixes the problem with the assertion (tested with Wine). No, we cannot remove from_, because the same interface is also used for Linux and other hosts which don't have a 'current' variable. Or we would have to call qemu_coroutine_self() to get the current coroutine. Stefan
Re: [Qemu-devel] qemu 1.6.1
Il 23/10/2013 21:26, Stefan Weil ha scritto: > Am 23.10.2013 11:00, schrieb Paolo Bonzini: >> Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto: >>> Hi, >>> >>> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and >>> qemu-system-x86_64 when >>> booting from an install CD. >>> >>> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom >>> NetBSD-6.1.2-amd64.iso >>> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line >>> 99 >>> >>> This application has requested the Runtime to terminate it in an >>> unusual way. >>> Please contact the application's support team for more information. >>> >>> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD >>> correctly. >>> No further info at this stage. >>> Any ideas? >> It's a known problem that not everyone can reproduce. Please compile >> with --disable-coroutine-pool on the configure command line. >> >> Paolo > > This patch also helps (at least for me, tested native and on Linux / Wine): > http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47 > > It looks like a compiler problem related to thread local storage > (variable "current"). Ugh. > I recently got several bug reports from a Windows user and included > patches to fix them in > my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on > qemu.weilnetz.de > are based on that tree. Does something like CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); also work? Then we can just remove from_. Paolo
Re: [Qemu-devel] qemu 1.6.1
Am 23.10.2013 11:00, schrieb Paolo Bonzini: > Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto: >> Hi, >> >> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and >> qemu-system-x86_64 when >> booting from an install CD. >> >> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom >> NetBSD-6.1.2-amd64.iso >> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line >> 99 >> >> This application has requested the Runtime to terminate it in an >> unusual way. >> Please contact the application's support team for more information. >> >> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD >> correctly. >> No further info at this stage. >> Any ideas? > It's a known problem that not everyone can reproduce. Please compile > with --disable-coroutine-pool on the configure command line. > > Paolo This patch also helps (at least for me, tested native and on Linux / Wine): http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47 It looks like a compiler problem related to thread local storage (variable "current"). I recently got several bug reports from a Windows user and included patches to fix them in my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on qemu.weilnetz.de are based on that tree. Stefan
Re: [Qemu-devel] qemu 1.6.1
Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto: > Hi, > > My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and > qemu-system-x86_64 when > booting from an install CD. > > C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom > NetBSD-6.1.2-amd64.iso > Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line > 99 > > This application has requested the Runtime to terminate it in an > unusual way. > Please contact the application's support team for more information. > > I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD > correctly. > No further info at this stage. > Any ideas? It's a known problem that not everyone can reproduce. Please compile with --disable-coroutine-pool on the configure command line. Paolo