Hi -

I'm making good progress on the multi-client libpager.  I've been running
it on my root filesystem for about a month now, with few problems recently.

However, there are still some bugs.  One seems to be in libports.  It
manifests like this:

/hurd/ext2fs.static: ../../libports/../libshouldbeinlibc/refcount.h:171:
refcounts_ref: Assertion '! (r.hard == 1 && r.weak == 0) || !"refcount
detected use-after-free!"' failed.
/hurd/ext2fs.static: ../../libports/complete-deallocate.c:41:
_ports_complete_deallocate: Assertion '! "reacquired reference w/o send
rights"' failed.

gdb indicates that the port in question was generated by
libfshelp/get-identity.c.  That file's a short read; basically, we're
storing ports in an inode-to-port hash, looking them up when io_identity()
gets called, and removing them from the hash when the class's clean routine
gets called.

I think what's happening is that we have a port that loses its last send
right, and after its refcount is decremented but before its clean routine
gets called, another call to io_identity() pulls it out of the hash.  Then
you've got ports_get_right complaining (that's the first assertion) that
it's incrementing a zero refcount, and ports_port_deref complaining (that's
the second assertion) that it's deallocating a port that now has send rights.
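
To make the suspected interleaving concrete, here is a toy model in C that
replays the steps in order on a single thread.  It is only a sketch: toy_port,
toy_slot and the toy_* functions are made-up names for illustration, not
libports or libfshelp code, and the one-entry "hash" stands in for the
inode-to-port hash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a port object; not the libports struct. */
struct toy_port {
    int refs;                       /* reference count */
};

/* Hypothetical one-entry stand-in for the inode-to-port hash. */
static struct toy_port *toy_slot;

/* What an io_identity()-style lookup would do: return whatever is hashed. */
static struct toy_port *toy_lookup(void)
{
    return toy_slot;
}

/* Teardown step 1: the last reference is dropped. */
static void toy_drop_ref(struct toy_port *p)
{
    p->refs--;
}

/* Teardown step 2: the clean routine removes the port from the hash. */
static void toy_clean(struct toy_port *p)
{
    if (toy_slot == p)
        toy_slot = NULL;
}

/* Replays the suspected ordering.  Returns 1 if the lookup handed out a
   port whose reference count had already reached zero. */
int toy_race_demo(void)
{
    struct toy_port port = { 1 };
    struct toy_port *found;
    int stale;

    toy_slot = &port;

    toy_drop_ref(&port);            /* no-senders path: refs goes 1 -> 0 */
    found = toy_lookup();           /* second caller, inside the window */
    stale = (found != NULL && found->refs == 0);
    toy_clean(&port);               /* clean routine runs too late */

    return stale;
}
```

Calling toy_race_demo() returns 1: the lookup succeeds between the two
teardown steps and hands back an object that is already on its way out,
which is the situation both assertions would be objecting to.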

Looking at the tail end of libports/no-senders.c, you'll see that
ports_port_deref gets called after we've dropped the mutex on _ports_lock.
I'm thinking that we need to hold that mutex all the way until the class's
clean routine has returned in order to ensure that the refcount gets
decremented and the port gets removed from the hash atomically.
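
As a rough sketch of that idea in C (again with made-up names, a pthread
mutex standing in for _ports_lock, and a one-entry slot standing in for the
hash; none of this is the real libports code), the decrement and the hash
removal happen under one lock, and the lookup takes the same lock:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a port object. */
struct toy_port {
    int refs;
};

/* Hypothetical stand-ins for the global lock and the identity hash. */
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_port *toy_slot;

/* Publish a port in the one-entry "hash". */
void toy_insert(struct toy_port *p)
{
    pthread_mutex_lock(&toy_lock);
    toy_slot = p;
    pthread_mutex_unlock(&toy_lock);
}

/* Look the port up and take a reference, all under the lock.
   Returns NULL once the port has been torn down. */
struct toy_port *toy_lookup_ref(void)
{
    struct toy_port *p;

    pthread_mutex_lock(&toy_lock);
    p = toy_slot;
    if (p != NULL) {
        assert(p->refs > 0);        /* a hashed port is never at zero */
        p->refs++;
    }
    pthread_mutex_unlock(&toy_lock);
    return p;
}

/* Drop a reference.  If it was the last one, remove the port from the
   hash before releasing the lock, so no lookup can see a zero refcount.
   Returns 1 if the port was torn down. */
int toy_deref(struct toy_port *p)
{
    int last;

    pthread_mutex_lock(&toy_lock);
    last = (--p->refs == 0);
    if (last && toy_slot == p)
        toy_slot = NULL;            /* stands in for the clean routine */
    pthread_mutex_unlock(&toy_lock);
    return last;
}
```

With this shape a lookup either finds the port with a nonzero count or
finds nothing at all.  The cost is the one raised next: the cleanup work
runs with the shared lock held.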

Of course, that requires holding a global lock while the clean routine
runs.  It seems to me that only the port in question needs to be locked,
but the individual ports don't seem to have mutexes associated with them.

Any ideas what to do?

    agape
    brent
