Where has GETSIGINFO gone?

Shachar Shemesh Sat, 11 Jul 2009 01:08:15 -0700

Hi all,

I was somewhat surprised to find in LWN a message, posted to this list[1], suggesting that a project of mine, fakeroot-ng, is a potential beneficiary of utrace. Truth be told, all utrace has offered me so far is pain.

In particular, when working with ptrace to perform generic virtualization, one runs up against an interesting problem. The core ptrace interface notifies the debugger about events delivered to the debuggee. Whenever "interesting" events are reported (such as a single step or a system call), this appears to the debugger as a SIGTRAP delivered to the debuggee process. Particularly for system call tracing, the debugger needs to keep track of how many times it was notified, as it will get two notifications for each system call: one upon entry and one upon exit. I'm fairly sure that I'm not saying anything that is news to almost anyone on this list.

The problem is that, as a debugger, I need to be able to differentiate between a SIGTRAP supposedly delivered to the debuggee because I asked to trace the system calls, and a SIGTRAP actually delivered to the debuggee. If I don't, my count is going to be off, and I will totally misinterpret the debuggee's state.
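To make the miscount concrete, here is a minimal sketch of the entry/exit bookkeeping a debugger might keep. The struct and function names are hypothetical, made up for this illustration, and are not taken from fakeroot-ng or any other tracer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-debuggee bookkeeping; names are illustrative only. */
struct tracee_state {
    bool in_syscall;   /* true between the entry and the exit notification */
};

/* Call once for every stop the debugger believes is a syscall stop.
 * Returns true if the stop is counted as an entry, false if as an exit. */
bool note_syscall_stop(struct tracee_state *st)
{
    assert(st != NULL);
    st->in_syscall = !st->in_syscall;
    return st->in_syscall;
}
```

If a SIGTRAP that was really sent to the debuggee is fed into note_syscall_stop(), the flag flips once too often, and every later entry is taken for an exit and vice versa.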

The best way, as far as I can tell, to do that on Linux is to use the PTRACE_GETSIGINFO command. This provides me with a field, si_code, that can distinguish between a signal and a system call. This is important to make sure that I don't get confused over which is which.
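A minimal sketch of such a check follows. It assumes the si_code convention described in the ptrace(2) manual page, where a syscall stop reports si_code as SIGTRAP (or SIGTRAP|0x80), while a signal sent from user space reports si_code <= 0 and one raised by the kernel reports SI_KERNEL. The function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Classify an already fetched siginfo_t. The si_code values tested here
 * are an assumption taken from the ptrace(2) manual page. */
bool siginfo_is_syscall_stop(const siginfo_t *si)
{
    assert(si != NULL);
    return si->si_signo == SIGTRAP &&
           (si->si_code == SIGTRAP || si->si_code == (SIGTRAP | 0x80));
}

/* Ask the kernel about the current stop of a traced, stopped process.
 * Returns 1 for a syscall stop, 0 for a genuine signal,
 * and -1 if the request failed (errno is set by ptrace). */
int stop_is_syscall(pid_t pid)
{
    siginfo_t si;

    if (ptrace(PTRACE_GETSIGINFO, pid, NULL, &si) == -1)
        return -1;
    return siginfo_is_syscall_stop(&si) ? 1 : 0;
}
```

A debugger would call stop_is_syscall() after waitpid() reports a SIGTRAP stop, and only advance its entry/exit count when the result is 1.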

Unfortunately, utrace (at least the version integrated into the Fedora Core 9 and Fedora 10 kernels) totally eliminated this request. When calling ptrace with PTRACE_GETSIGINFO I get back "Invalid argument".

I've tried to figure out how other programs handle the situation. Looking at the strace sources, it seems to use a heuristic to try to detect this state. It relies on the fact that, on most Linux platforms, the kernel sets the return code register to -ENOSYS before calling the syscall enter ptrace hook, and treats the SIGTRAP as spurious if the value is not set. This solution has numerous deficiencies:


   * It is platform specific. On PowerPC, for example, the kernel does
     not, and strace has no way of telling the two cases apart.
   * It is unreliable. The check can only be made on the syscall
     enter hook, not the exit hook.
   * It relies on internal kernel behavior.
   * It is easy to fool by a malicious programmer. For example, send
     the signal from another process, have the first process do a tight
     loop where EAX (or whatever) is set to -ENOSYS, and strace will
     think you have entered a random system call, probably the last one
     again. Do that right after a fork or an exec, and all sorts of fun
     stuff will happen.
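The weakness of that heuristic can be sketched in a few lines. This is an illustration of the idea as described above, not strace's actual code. The function names are hypothetical, and ret_reg stands for whatever return-value register the debugger has read from the stopped debuggee (EAX on i386).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Heuristic: at a presumed syscall entry the kernel is expected to have
 * preloaded the return-value register with -ENOSYS. */
bool looks_like_syscall_entry(long ret_reg)
{
    return ret_reg == -ENOSYS;
}

/* The check sees only a register value, so a debuggee that loads -ENOSYS
 * into that register itself and then receives a real SIGTRAP cannot be
 * told apart from a genuine syscall entry. */
void demo_spoof(void)
{
    long set_by_kernel   = -ENOSYS;  /* genuine syscall entry */
    long set_by_debuggee = -ENOSYS;  /* tight loop in a hostile program */

    assert(looks_like_syscall_entry(set_by_kernel));
    assert(looks_like_syscall_entry(set_by_debuggee));
}
```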

Since I'm aiming to use the fakeroot-ng technology for security-related stuff (not in fakeroot-ng - I intend to split the project), these drawbacks are fatal.

Don't get me wrong. I think cleaning up the debugger interfaces inside the kernel is an excellent idea. I just don't think breaking user space compatibility over the old interface, broken though you might think it is, is justified. This is directed not so much against the utrace project as it is against Red Hat including it in production kernels.


Shachar

[1] - http://www.redhat.com/archives/utrace-devel/2009-March/msg00112.html

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com
