I recently fixed a bug on FreeBSD related to DTrace's emulation of the leave instruction. A full explanation can be found at [1], but the gist of the problem was that there is a small window in which critical state is held above the current stack pointer. If an interrupt was taken in that window, the trap frame would corrupt this state, leading to a crash.
I checked the Illumos source([2]), and Illumos' version of the emulation code was identical(no surprise, as the FreeBSD implementation was straight port from OpenSolaris). To be honest, I'd be rather surprised if Solaris and Illumos has been subject to the same bug for so long -- I first reproduced the crash on FreeBSD by accident, and the D script that I had written trying to debug a different problem was triggering a crash within minutes under my workload. The only things that I can figure are either few people were running Solaris/OpenSolaris/Illumos on i386 and using DTrace(dubious), the Solaris/OpenSolaris/Illumos compiler rarely generates leave instructions, or the invalid opcode fault handler in Solaris/OpenSolaris/Illumos runs with interrupts disabled. I don't see a cli instruction in the Illumos fault handler, but I'm really not familiar with the low-level details of i386 fault handling and it's possible that the task descriptor for the fault handler can specify that it runs with interrupts disabled. I'm especially worried about the latter case, as there may well be many other places in DTrace that are implicitly assuming that interrupts are disabled. For example, I know that DTrace makes extensive use of per-CPU buffers for the data that it gathers. That code could easily be assuming that is being run in a critical section, and a miss-timed interrupt might lead to the data being corrupted. Any advice that could be offered on this would be most welcome. If, by chance, Illumos is running the invalid opcode handler with interrupts enabled, I believe that the following patch against uts/intel/ia32/ml/exception.s will fix the problem(Disclaimer: I haven't even tried to compile this, let alone test this. But essentially the same patch fixed the FreeBSD crash that I was seeing). --- a/exception.s 2011-11-12 13:21:27.000000000 -0500 +++ b/exception.s 2011-11-12 13:23:13.000000000 -0500 @@ -587,11 +587,11 @@ movl 8(%esp), %eax /* load calling EIP */ incl %eax /* increment over LOCK prefix */ movl %eax, -8(%ebx) /* store calling EIP */ - movl %ebx, -4(%esp) /* temporarily store new %esp */ + subl $8, %ebx /* adjust for three pushes, one pop */ + movl %ebx, 8(%esp) /* temporarily store new %esp */ popl %ebx /* pop off temp */ popl %eax /* pop off temp */ - movl -12(%esp), %esp /* set stack pointer */ - subl $8, %esp /* adjust for three pushes, one pop */ + movl (%esp), %esp /* set stack pointer */ jmp _emul_done 4: /* [1] - http://lists.freebsd.org/pipermail/freebsd-current/2011-November/029007.html [2] - http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/intel/ia32/ml/exception.s#572 ------------------------------------------- illumos-discuss Archives: https://www.listbox.com/member/archive/182180/=now RSS Feed: https://www.listbox.com/member/archive/rss/182180/21175430-2e6923be Modify Your Subscription: https://www.listbox.com/member/?member_id=21175430&id_secret=21175430-6a77cda4 Powered by Listbox: http://www.listbox.com