Aloha -

I've been trying to debug a problem in rpctrace which causes rpctrace to
crash when I use it to wrap /hurd/ext2fs.

The bug is triggered by a memory_object_lock_request /
memory_object_lock_completed sequence.  Specifically, ext2fs sends a lock
request to the kernel with a send-once reply_to port.  Once the lock is
complete, the kernel sends a memory_object_lock_completed message to the
send-once right, including a send right to the memory control port (for
identification purposes) and that's where the trouble starts in rpctrace.

rpctrace is designed to trace multiple tasks simultaneously, and to
identify which task is doing an RPC, it allocates separate ports for each
task (even if they wrap the same port).  So, if you pass out a send right
to three tasks, three different receive rights will be allocated in
rpctrace.  Which receive right a message comes in on indicates which task
is sending the message.
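
To make that concrete, here's a toy model of the bookkeeping in plain C.
It makes no Mach calls; integers stand in for ports and tasks, and every
name in it (wrap_for_task, sender_of, struct wrapper) is made up for
illustration rather than taken from rpctrace's source.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: plain integers stand in for ports and tasks. */
#define MAX_WRAPPERS 16

struct wrapper {
    int real_port;  /* the port being wrapped */
    int task;       /* traced task holding a send right to this wrapper */
    int recv_name;  /* receive right allocated for this (port, task) pair */
};

static struct wrapper wrappers[MAX_WRAPPERS];
static size_t n_wrappers;
static int next_recv_name = 100;  /* arbitrary starting value */

/* Return the wrapper receive right for (real_port, task), allocating a
   new one the first time the pair is seen.  Returns -1 if the table is
   full. */
static int wrap_for_task(int real_port, int task)
{
    for (size_t i = 0; i < n_wrappers; i++)
        if (wrappers[i].real_port == real_port && wrappers[i].task == task)
            return wrappers[i].recv_name;

    if (n_wrappers == MAX_WRAPPERS)
        return -1;

    wrappers[n_wrappers].real_port = real_port;
    wrappers[n_wrappers].task = task;
    wrappers[n_wrappers].recv_name = next_recv_name++;
    return wrappers[n_wrappers++].recv_name;
}

/* A message arrived on recv_name: which task sent it?  Returns -1 if
   the receive right is not one of ours. */
static int sender_of(int recv_name)
{
    for (size_t i = 0; i < n_wrappers; i++)
        if (wrappers[i].recv_name == recv_name)
            return wrappers[i].task;
    return -1;
}
```

Wrapping the same port for tasks 1, 2 and 3 yields three distinct
receive rights in this model, and sender_of() maps each one back to its
task.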

For this to work right, we need to identify which task is on the receiving
end of a send right.  Not which task the send right came from, mind you,
but the ultimate destination.  In the previous example, if we're transferring
a copy of the original send right, we need to pick which of the three new
send rights should be transferred, based on the ultimate destination of the
message, to ensure that the right task gets the right version of the port.
There's also a fourth case - we're transferring to a task that we're not
tracing.

All of this complexity is already built into rpctrace.  It plays games like
looking at a task's port space, extracting a send right from each remote
receive right, and checking to see if it matches a local send right, in
order to determine that the local send right's final destination is the
remote receive right on the task in question.  See discover_receive_right().
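
Roughly, the matching step looks like the sketch below.  This is a toy
model, not the real discover_receive_right(): it makes no Mach calls, a
"port_id" integer stands in for the kernel's notion of port identity,
and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only.  A "name" is what the remote task calls its right;
   a "port_id" stands in for the identity of the underlying port. */
struct port_entry {
    int name;        /* name of the right in the remote task's port space */
    int is_receive;  /* nonzero if the entry is a receive right */
    int port_id;     /* which underlying port the entry refers to */
};

/* Walk a (modelled) remote port space looking for the receive right
   that a local send right leads to.  Comparing port_id values models
   "extract a send right from the remote receive right and check whether
   it matches the local send right".  Returns the remote name, or -1 if
   no receive right in this task matches. */
static int find_receive_right_toy(const struct port_entry *space, size_t n,
                                  int local_send_port_id)
{
    for (size_t i = 0; i < n; i++) {
        if (!space[i].is_receive)
            continue;  /* only receive rights can be a destination */
        if (space[i].port_id == local_send_port_id)
            return space[i].name;
    }
    return -1;
}
```

A return value of -1 for every traced task corresponds to the case where
the destination can't be identified among the tasks being traced.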

In my case, problems arise because a send-once right is used to return the
lock completed message, and there's no way to know which task the send-once
right ultimately goes to.  There's a bad pointer dereference involved, but
even once that's fixed, how do you know which send right to transfer?  It's
important to get it right, since the send right is used by the memory
manager to identify different clients.

I've patched it up by assuming that the task sending the send-once right is
the ultimate destination, which works in my case, but obviously it isn't
right in general.

The more I think about it, the more I'm thinking that it's a design flaw in
rpctrace.  We need to identify ultimate destinations, but can't do that
reliably.

I read on the website's hurd/debugging/rpctrace page that somebody (zhenga)
had come up with a new version of rpctrace.  Do we have a copy of it around
somewhere?

I could submit the patches that I've got, but they're not right, and I
don't see any way to make them right.  I'm thinking now that the way to fix
it is to redesign rpctrace so that each wrapped task gets a separate
rpctrace task wrapping it.  That way, we should be able to determine which
task makes which RPC without the problems I've described above.

I'm also thinking that I don't want to undertake rewriting rpctrace right
now.  I was just trying to fix it so that I could understand what ext2fs
was doing.

Comments?

    agape
    brent