So now we have four possibles: extending ptrace, a new
ptrace-replacement syscall, a systemtap-based thing, and, now, ntrace.
Things are diverging rather than converging.
A couple of comments concerning the /proc entry fd method utracer uses
and that some people have expressed a preference for: The first is that
it's clumsy. When the utracer module loads, it creates a /proc
pseudo-directory /proc/utrace and then a pseudo-file
/proc/utrace/control. When an app wants access to utracer services it
opens that control file and ioctl()s [1] a register request to it.
This causes the module to create another pseudo-directory
/proc/utrace/app-pid and two new pseudo-files
/proc/utrace/app-pid/cmd and /proc/utrace/app-pid/resp. The app
then has to open those two pseudo files, using asprintf() or some such
to build the file name strings. This strikes me as a ridiculous amount
of arm-waving.
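For concreteness, here is roughly what that sequence looks like from the app side. This is a sketch, not utracer's actual client code: the register request code, the ioctl argument, the open modes on cmd and resp, and the helper names are all placeholders I made up for illustration. Only the path layout comes from the description above. I've used snprintf() where asprintf() would do just as well.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

#define UTRACER_BASE "/proc/utrace"
/* Placeholder: the real register request code is not given here. */
#define UTRACER_REGISTER_HYPOTHETICAL 0UL

struct utracer_fds { int ctl, cmd, resp; };

/* Build /proc/utrace/<pid>/cmd and /proc/utrace/<pid>/resp.
 * Returns 0 on success, -1 if either buffer (each len bytes) is too small. */
int utracer_paths(pid_t pid, char *cmd, char *resp, size_t len)
{
    int n = snprintf(cmd, len, "%s/%ld/cmd", UTRACER_BASE, (long)pid);
    if (n < 0 || (size_t)n >= len)
        return -1;
    n = snprintf(resp, len, "%s/%ld/resp", UTRACER_BASE, (long)pid);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}

/* Open control, register, then open the two per-app files.
 * Returns 0 on success, -1 on any failure (all fds closed again). */
int utracer_attach(struct utracer_fds *u)
{
    char cmd[64], resp[64];
    pid_t pid = getpid();

    u->ctl = u->cmd = u->resp = -1;
    if (utracer_paths(pid, cmd, resp, sizeof cmd) < 0)
        return -1;

    u->ctl = open(UTRACER_BASE "/control", O_RDWR);
    if (u->ctl < 0)
        return -1;

    /* Assumed argument shape: a pointer to the registering pid. */
    if (ioctl(u->ctl, UTRACER_REGISTER_HYPOTHETICAL, &pid) < 0)
        goto fail;

    /* Assumed modes: commands are written, responses are read. */
    u->cmd = open(cmd, O_WRONLY);
    u->resp = open(resp, O_RDONLY);
    if (u->cmd < 0 || u->resp < 0)
        goto fail;
    return 0;

fail:
    if (u->resp >= 0) close(u->resp);
    if (u->cmd >= 0) close(u->cmd);
    close(u->ctl);
    u->ctl = u->cmd = u->resp = -1;
    return -1;
}
```

That's three open()s, one ioctl(), and two string builds before the app has done any tracing at all.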
Further, /proc entries are meant to be publicly accessible ways of
accessing kernel data and, to a lesser extent (possibly even including
zero--I can't think of an example at the moment), services. utracer
violates this implicit paradigm. The utracer entries under /proc/utrace
are intended to be exclusively a means of communication between the
registering app and the module--I even set the permissions to make sure
of that--and might even violate the data vs. services thing.
Further, using /proc entries requires that the module keep appropriate
structs for each of those entries, close the entries, and clean up
the structs when the requesting app de-registers. All cool except when
the requesting app either crashes or the app writer just exits without
bothering to de-register first--you can build up a lot of dead structs
that way. utracer solves this problem by utrace-attaching an engine to
every registering app just to get notification of its unexpected demise
and then run around cleaning up after it.
All of the foregoing is by way of saying that the /proc entry fd method,
even though it looks fairly cool, has its problems and is fairly
inefficient--not even including the unknown (to me, at least)
inefficiencies of overhead in the read()/write()/ioctl() mechanism.
This isn't to say, however, that the fd (hence select()/poll()/whatever)
approach can't be made to work. I haven't even done five minutes of
research on it, but it seems possible to me that a ptrace() extension
request could be something like fd = ptrace(UTRACE_GET_FD, ...) (or
equivalent new utrace syscall request) which returns an fd suitable for
use by poll() and select(), and possibly readable, writable, and
ioctl-able as well, without having an actual underlying pseudo-file and
the attendant overhead and problems described above.
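To show what the app side of that might look like: UTRACE_GET_FD does not exist, so in the sketch below an ordinary pipe stands in for the fd such a request would return. The function names are mine. The only point is that once the app holds an fd, waiting for an event is plain poll() with no path names involved.

```c
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error or hangup without data. */
int wait_for_event(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int n = poll(&p, 1, timeout_ms);

    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    return (p.revents & POLLIN) ? 1 : -1;
}

/* Stand-in demo: the read end of a pipe plays the role of the fd that a
 * hypothetical fd = ptrace(UTRACE_GET_FD, ...) would hand back.
 * Returns 0 if the fd was idle before the write and readable after it. */
int demo_with_pipe(void)
{
    int p[2], before, after = -1;

    if (pipe(p) < 0)
        return -1;
    before = wait_for_event(p[0], 0);      /* nothing written yet */
    if (write(p[1], "x", 1) == 1)          /* pretend an event arrived */
        after = wait_for_event(p[0], 100);
    close(p[0]);
    close(p[1]);
    return (before == 0 && after == 1) ? 0 : -1;
}
```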
[1] Most control operations in utracer are based on ioctl() rather than
read() and write(). utracer supports a fair number of requests for
information [2] but read() has no way of specifying which information
the app is after and write() has no way of retrieving the information.
ioctl() is open-ended in that regard and can specify in arbitrary detail
not only what information the app wants but provide a pointer to a place
to store complex results as well. The down side is that while simple
mechanisms are documented by which to register read() and write()
methods I couldn't find any documented mechanism by which to register an
ioctl() method. I found a way that works: There's a struct associated
with top level /proc entries that contains a pointer to another struct
that contains a pointer, invariably null so far as I can tell, to an
ioctl method. In other than top-level /proc entries, the pointer to the
secondary struct is null so I make a copy of the top level struct, stuff
in a pointer to my own ioctl() method, then point the original
subordinate struct at the modified copy. This is so appallingly clumsy
that either I'm very badly missing something or no one has ever needed
to ioctl() to a /proc entry before and the mechanism to do so hasn't
been developed. The scary part of my hack, of course, is that you never
know when someone's going to change something and break it.
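In case the copy-and-patch description in [1] is hard to follow, here is a user-space model of it. None of these struct or function names are kernel identifiers. They are stand-ins I invented so the idea compiles and runs outside the kernel, and the real structs have many more members.

```c
#include <stddef.h>

/* Model of the "secondary struct": a table of methods. */
struct model_ops {
    long (*read)(void *buf, size_t len);
    long (*ioctl)(unsigned int req, void *arg);  /* null in the stock table */
};

/* Model of the per-entry struct that points at the method table. */
struct model_entry {
    const struct model_ops *ops;  /* null for non-top-level entries */
};

/* Stand-in methods so the model has something to call. */
long stock_read(void *buf, size_t len) { (void)buf; (void)len; return 0; }
long my_ioctl(unsigned int req, void *arg) { (void)arg; return (long)req + 1; }

/* Copy the top-level entry's method table, install an ioctl handler in the
 * copy, and point the subordinate entry at the copy. The stock table is
 * left untouched. The storage behind `copy` must outlive `sub`. */
void patch_ioctl(const struct model_entry *top, struct model_entry *sub,
                 struct model_ops *copy,
                 long (*handler)(unsigned int, void *))
{
    *copy = *top->ops;
    copy->ioctl = handler;
    sub->ops = copy;
}
```

The fragility is visible even in the model: the copy silently inherits whatever else is in the stock table, so any change to that table's layout or meaning changes what the patched entry does.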
[2] Information like, e.g. a task's mapped memory regions. This is
available just by reading the right pseudo-file in /proc/pid, but when
an app reads that file, the kernel waves its arms a while extracting
data from the relevant struct mm_struct and struct vm_area_struct,
formatting it into ascii strings, and sending it to the requesting app.
The app then has to wave /its/ arms for a while parsing the ascii
strings to get back exactly the same information the kernel had in the
first place. This rather badly offends my engineer's sense of
efficiency, so utracer provides a means of accessing such information
directly, in binary, in a struct the app doesn't have to parse. [3]
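To illustrate the app-side arm-waving in [2], this is about the least code that turns one line of /proc/pid/maps back into numbers. The struct and function names are my own. The line layout (start-end, perms, offset, dev as major:minor, inode, optional path) is the usual maps format.

```c
#include <stdio.h>

/* One mapped region, recovered from its ascii form. */
struct map_region {
    unsigned long start, end, offset, inode;
    unsigned int dev_major, dev_minor;
    char perms[5];
    char path[256];   /* empty for anonymous mappings */
};

/* Parse one maps line. Returns 0 on success, -1 if the line is malformed. */
int parse_maps_line(const char *line, struct map_region *r)
{
    int n;

    r->path[0] = '\0';
    n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %255[^\n]",
               &r->start, &r->end, r->perms, &r->offset,
               &r->dev_major, &r->dev_minor, &r->inode, r->path);
    /* The path is optional, so seven converted fields is enough. */
    return (n >= 7) ? 0 : -1;
}
```

Every field recovered here is one the kernel already held in binary before it formatted the line.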
[3] Yes, I know [2] was a subordinate footnote to [1] and considered
Bad Form by some.