Aloha -

I've run into a real kernel-level problem with 'netmsg'.

It's related to the libpager issue.  The problem arises when a process gets
an unworkable memory object and tries to vm_map it.  This causes the
mach_msg() that sent the vm_map to block indefinitely, even though I've
specified MACH_SEND_TIMEOUT with a zero timeout.

More specifically, the process in question is the exec server.  It gets a
memory object from the file server to read a file, then uses it to map the
file into a remote task.  This causes a vm_map to go across the network
connection.  The kernel, upon receiving the vm_map, sends a
memory_object_init message, and then blocks waiting for the reply.

The block occurs in vm_object_copy_strategically(), which is labeled in its
comments "[t]his operation may block".  Almost the first thing it does is
to wait for the memory object to become ready.

In our case, libpager already has a different client, so the memory object
never becomes ready.

The big problem, as I see it, is that mach_msg() is blocking, and that
hangs my entire thread.  It seems to me that low-level RPC operations
like vm_map shouldn't block; otherwise they defeat the purpose of
MACH_SEND_TIMEOUT.  So vm_map() should record the mapping and then return,
putting the copy operation on some kind of queue.  I guess.
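
To make the "record it and return" idea concrete, here is a rough sketch in
plain C.  It is a toy model, not Mach code: map_queue, map_request() and
drain_ready() are names I made up for illustration, and a bool array stands
in for "the memory object has become ready".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; none of these names exist in the Mach kernel. */
enum { MAX_PENDING = 16 };

struct pending_map {
    size_t object_id;  /* stands in for the memory object port */
    bool   copied;     /* set once the deferred copy has run */
};

struct map_queue {
    struct pending_map items[MAX_PENDING];
    size_t count;
};

/* Stand-in for vm_map(): record the request and return immediately.
 * Returns false only if the queue is full; it never waits on the pager. */
static bool map_request(struct map_queue *q, size_t object_id)
{
    if (q->count == MAX_PENDING)
        return false;
    q->items[q->count].object_id = object_id;
    q->items[q->count].copied = false;
    q->count++;
    return true;
}

/* Stand-in for a worker pass: finish the copy for every queued request
 * whose object is marked ready, and skip the rest instead of blocking.
 * Returns the number of copies completed on this pass. */
static size_t drain_ready(struct map_queue *q, const bool *ready, size_t nready)
{
    size_t done = 0;
    for (size_t i = 0; i < q->count; i++) {
        struct pending_map *p = &q->items[i];
        if (!p->copied && p->object_id < nready && ready[p->object_id]) {
            p->copied = true;
            done++;
        }
    }
    return done;
}
```

The point of the split is that a request against an object that never
becomes ready just sits in the queue; the caller of map_request() has
already returned, so nothing hangs behind it.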

Any thoughts on how to resolve this?

    agape
    brent
