Re: Reauthentication implementation flaw due to EINTR
On Tue, Dec 29, 2009 at 11:46:09PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Tue 29 Dec 2009 23:40:21 +0100, a écrit : On Tue, Dec 29, 2009 at 11:10:00PM +0100, Samuel Thibault wrote: That's the same. To make it clearer, see attached patch. I meant in the other branch of the if (though for some reason I didn't consider the server part's if). Err, what for? In that case, dead name notification isn't requested, is it? Or is it always requested by default and ports_interrupt_self_on_port_death() is just to change the behavior? (I'm a complete newbie for such parts of the Hurd...) Nothing magical is happening. The notification has been requested by auth_server_authenticate which is waiting for auth_user_authenticate to arive. In the then part of the if statement, the user has found that the server is waiting for the condition to be signaled, which is right time to cancel the notification. If canceled by the server after the notification has been signaled there is a window between receiving the condition signal and the cancelation where it can still be interrupted. It's a very, very narrow window, but I think it could still affect the reply. Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
On Tue, Dec 29, 2009 at 11:34:08PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Tue 29 Dec 2009 23:29:35 +0100, a écrit : On Tue, Dec 29, 2009 at 10:55:41PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Tue 29 Dec 2009 22:48:26 +0100, a écrit : It's still a mystery to me why ext2fs gets an EINTR without auth explicitly returning EINTR... AIUI, the RPC gets aborted and ext2fs receives MACH_RCV_PORT_DIED, which _hurd_intr_rpc_mach_msg turns into EINTR. Wait, isn't the receive port in this case the auth port? This could be bad... Oops, I meant MACH_RCV_TIMED_OUT above. Phew! :-) This might make sense if auth's reply is interrupted. It is still a bit surprising though, considering that auth_server_authenticate does not set a timeout. BTW, reproducing the bug is really easy: while sudo date ; do : ; done OK, I can use that for the client side. I don't really have a setup for testing the auth server though. Do you use a sub-Hurd or something? I've never managed to find time to set up a sub-hurd, so I just debug with writes on the console :) Ah, that's a classic technique. But how do you replace the auth server? Do you install a modified one and just reboot? Seems kinda risky. Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Wed 30 Dec 2009 12:40:02 +0100, a écrit : On Tue, Dec 29, 2009 at 11:46:09PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Tue 29 Dec 2009 23:40:21 +0100, a écrit : On Tue, Dec 29, 2009 at 11:10:00PM +0100, Samuel Thibault wrote: That's the same. To make it clearer, see attached patch. I meant in the other branch of the if (though for some reason I didn't consider the server part's if). Err, what for? In that case, dead name notification isn't requested, is it? Or is it always requested by default and ports_interrupt_self_on_port_death() is just to change the behavior? (I'm a complete newbie for such parts of the Hurd...) Nothing magical is happening. I don't mean magical, but automatic. The notification has been requested by auth_server_authenticate which is waiting for auth_user_authenticate to arive. Only when the user isn't arrived already. In the then part of the if statement, the user has found that the server is waiting for the condition to be signaled, which is right time to cancel the notification. Err, isn't the notification bound to an RPC? (really, I don't know anything about these and the Mach documentation doesn't help me so much). Samuel
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Wed 30 Dec 2009 13:08:38 +0100, a écrit : But how do you replace the auth server? Do you install a modified one and just reboot? Yes. Seems kinda risky. For just additional prints it's not. I'm running it in a virtual machine so I can easily replace with the original exec binary. Samuel
Re: Reauthentication implementation flaw due to EINTR
On Wed, Dec 30, 2009 at 02:01:12PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Wed 30 Dec 2009 12:40:02 +0100, a écrit : The notification has been requested by auth_server_authenticate which is waiting for auth_user_authenticate to arive. Only when the user isn't arrived already. Right, but we know it hasn't since the server has posted a pending operation, otherwise we'd be in the else part of the if. In the then part of the if statement, the user has found that the server is waiting for the condition to be signaled, which is right time to cancel the notification. Err, isn't the notification bound to an RPC? (really, I don't know anything about these and the Mach documentation doesn't help me so much). The low-level interface is bound to the port we want notifications about. The rendezvous port in this case, which is the same for auth_user and auth_server. libports associates it with the RPC and thread too though, so to clean it up properly I guess we'd need to pass such information through the pending structure. Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Wed 30 Dec 2009 16:38:35 +0100, a écrit : On Wed, Dec 30, 2009 at 02:01:12PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Wed 30 Dec 2009 12:40:02 +0100, a écrit : The notification has been requested by auth_server_authenticate which is waiting for auth_user_authenticate to arive. Only when the user isn't arrived already. Right, but we know it hasn't since the server has posted a pending operation, otherwise we'd be in the else part of the if. Right. In the then part of the if statement, the user has found that the server is waiting for the condition to be signaled, which is right time to cancel the notification. Err, isn't the notification bound to an RPC? (really, I don't know anything about these and the Mach documentation doesn't help me so much). The low-level interface is bound to the port we want notifications about. Ok. I've tried the attached patch (against the current unpatched git head), still no luck: no auth EINTR printed, but auth_server_authenticate returned EINTR in ext2fs. Samuel --- auth.c.orig 2009-12-30 17:18:13.0 +0100 +++ auth.c 2009-12-30 17:24:25.0 +0100 @@ -306,6 +306,8 @@ s-user = userauth; ports_port_ref (userauth); + mach_port_t old; + error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); condition_signal (s-wakeup); mutex_unlock (pending_lock); @@ -333,7 +335,10 @@ { hurd_ihash_locp_remove (pending_users, u.locp); err = EINTR; + mylog(auth EINTR\n); } + mach_port_t old; + error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); } /* The server side has already removed U from the ihash table. */ mutex_unlock (pending_lock); @@ -393,6 +398,8 @@ /* Give the user the new port and wake the RPC up. */ u-passthrough = newport; + mach_port_t old; + error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); condition_signal (u-wakeup); mutex_unlock (pending_lock); } @@ -415,7 +422,10 @@ { hurd_ihash_locp_remove (pending_servers, s.locp); err = EINTR; + mylog(auth EINTR\n); } + mach_port_t old; + error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); } /* The user side has already removed S from the ihash table. */ mutex_unlock (pending_lock);
Re: Reauthentication implementation flaw due to EINTR
Hi, On Mon, Dec 28, 2009 at 03:43:47AM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Sun 27 Dec 2009 22:24:24 +0100, a écrit : OK, I think I have a vague picture of what is going on: ports_interrupt_self_on_port_death ports_interrupt_self_on_notification ports_interrupt_rpc_on_notification, which requests notification (to the same port as auth_server_authenticate). But in another task (auth vs ext2fs). That's where I've still not found what makes ext2fs return EINTR. When rendezvous port dies we get the notification: Here we seemingly see ext2fs get the notification, but why? ports_notify_server ports_do_mach_notify_dead_name ports_dead_name ports_interrupt_notified_rpcs hurd_thread_cancel (in glibc) _hurdsig_abort_rpcs, which does some funky stuff that seems to hijack any pending RPCs to make them return EINTR. I was hoping _hurdsig_abort_rpcs made auth_server_authenticate return EINTR somehow by interrupting auth_server_authenticate_reply (using thread_abort). But studying the code further it seems only interrupt the sending and not resend an EINTR message. (The only problem I can think of happening here is that the send isn't retried in which case ext2fs would hang.) So I looked further at libports which does a lot of bookkeeping of the pending RPCs, and so should be able to hijack the them to return EINTR. But none of the relevant code has any use of EINTR, so I don't think that's it either. It's still a mystery to me why ext2fs gets an EINTR without auth explicitly returning EINTR... Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Tue 29 Dec 2009 22:48:26 +0100, a écrit : It's still a mystery to me why ext2fs gets an EINTR without auth explicitly returning EINTR... AIUI, the RPC gets aborted and ext2fs receives MACH_RCV_PORT_DIED, which _hurd_intr_rpc_mach_msg turns into EINTR. BTW, reproducing the bug is really easy: while sudo date ; do : ; done Samuel
Re: Reauthentication implementation flaw due to EINTR
On Mon, Dec 28, 2009 at 03:24:52AM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Sun 27 Dec 2009 20:56:21 +0100, a écrit : If the notification request is canceled before auth_user_authenticate returns there should be no problem. I tried a crude error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); after the complete if (hurd_condition_wait (s.wakeup, pending_lock) statement, with no luck: ext2fs got an EINTR anyway. Err, shouldn't be in the user part after it finds the pending server if we want to do it before auth_user_authenticate returns? (I know you said something later about only testing this in the unpatched version, but it should work in the unpatched version if the problem is indeed the dead-name notification.) Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Tue 29 Dec 2009 23:01:06 +0100, a écrit : On Mon, Dec 28, 2009 at 03:24:52AM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Sun 27 Dec 2009 20:56:21 +0100, a écrit : If the notification request is canceled before auth_user_authenticate returns there should be no problem. I tried a crude error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); after the complete if (hurd_condition_wait (s.wakeup, pending_lock) statement, with no luck: ext2fs got an EINTR anyway. Err, shouldn't be in the user part after it finds the pending server if we want to do it before auth_user_authenticate returns? That's the same. To make it clearer, see attached patch. Samuel --- auth.c.orig 2009-12-29 23:07:54.0 +0100 +++ auth.c 2009-12-29 23:08:41.0 +0100 @@ -328,18 +328,19 @@ condition_init (u.wakeup); ports_interrupt_self_on_port_death (userauth, rendezvous); if (hurd_condition_wait (u.wakeup, pending_lock) hurd_ihash_find (pending_users, rendezvous)) /* We were interrupted; remove our record. */ { hurd_ihash_locp_remove (pending_users, u.locp); err = EINTR; } + error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); } /* The server side has already removed U from the ihash table. */ mutex_unlock (pending_lock); if (! err) { /* The server RPC has set the port and signalled U.wakeup. */ *newport = u.passthrough; *newporttype = MACH_MSG_TYPE_MOVE_SEND; @@ -410,18 +411,19 @@ condition_init (s.wakeup); ports_interrupt_self_on_port_death (serverauth, rendezvous); if (hurd_condition_wait (s.wakeup, pending_lock) hurd_ihash_find (pending_servers, rendezvous)) /* We were interrupted; remove our record. */ { hurd_ihash_locp_remove (pending_servers, s.locp); err = EINTR; } + error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); } /* The user side has already removed S from the ihash table. */ mutex_unlock (pending_lock); if (err) return err; /* The user RPC has set the port (with a ref) and signalled S.wakeup. */ user = s.user;
Re: Reauthentication implementation flaw due to EINTR
On Tue, Dec 29, 2009 at 10:55:41PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Tue 29 Dec 2009 22:48:26 +0100, a écrit : It's still a mystery to me why ext2fs gets an EINTR without auth explicitly returning EINTR... AIUI, the RPC gets aborted and ext2fs receives MACH_RCV_PORT_DIED, which _hurd_intr_rpc_mach_msg turns into EINTR. Wait, isn't the receive port in this case the auth port? This could be bad... BTW, reproducing the bug is really easy: while sudo date ; do : ; done OK, I can use that for the client side. I don't really have a setup for testing the auth server though. Do you use a sub-Hurd or something? Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Tue 29 Dec 2009 23:29:35 +0100, a écrit : On Tue, Dec 29, 2009 at 10:55:41PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Tue 29 Dec 2009 22:48:26 +0100, a écrit : It's still a mystery to me why ext2fs gets an EINTR without auth explicitly returning EINTR... AIUI, the RPC gets aborted and ext2fs receives MACH_RCV_PORT_DIED, which _hurd_intr_rpc_mach_msg turns into EINTR. Wait, isn't the receive port in this case the auth port? This could be bad... Oops, I meant MACH_RCV_TIMED_OUT above. BTW, reproducing the bug is really easy: while sudo date ; do : ; done OK, I can use that for the client side. I don't really have a setup for testing the auth server though. Do you use a sub-Hurd or something? I've never managed to find time to set up a sub-hurd, so I just debug with writes on the console :) Samuel
Re: Reauthentication implementation flaw due to EINTR
Samuel Thibault, le Tue 29 Dec 2009 23:34:08 +0100, a écrit : OK, I can use that for the client side. I don't really have a setup for testing the auth server though. Do you use a sub-Hurd or something? I've never managed to find time to set up a sub-hurd, so I just debug with writes on the console :) More precisely, I use #include stdio.h #include device/device.h static void mylog(const char *fmt, ...) { static mach_port_t device = MACH_PORT_NULL; if (device == MACH_PORT_NULL) { mach_port_t priv, dev; if (! get_privileged_ports(NULL, priv) ! device_open(priv, D_WRITE, console, dev)) device = dev; } if (device != MACH_PORT_NULL) { char s[128]; int n; va_list vap; va_start(vap, fmt); n = vsnprintf(s, sizeof(s), fmt, vap); va_end(vap); device_write(device, 0, 0, s, n, n); } } Samuel
Re: Reauthentication implementation flaw due to EINTR
On Tue, Dec 29, 2009 at 11:10:00PM +0100, Samuel Thibault wrote: That's the same. To make it clearer, see attached patch. I meant in the other branch of the if (though for some reason I didn't consider the server part's if). Patch attatched for clarity. Regards, Fredrik diff --git a/auth/auth.c b/auth/auth.c index 3c5fa86..737fd98 100644 --- a/auth/auth.c +++ b/auth/auth.c @@ -294,20 +294,21 @@ S_auth_user_authenticate (struct authhandle *userauth, s = hurd_ihash_find (pending_servers, rendezvous); if (s) { /* Found it! Extract the port. */ *newport = s-passthrough; *newporttype = MACH_MSG_TYPE_MOVE_SEND; /* Remove it from the pending list. */ hurd_ihash_locp_remove (pending_servers, s-locp); + error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); /* Give the server the auth port and wake the RPC up. We need to add a ref in case the port dies. */ s-user = userauth; ports_port_ref (userauth); condition_signal (s-wakeup); mutex_unlock (pending_lock); mach_port_deallocate (mach_task_self (), rendezvous); return 0;
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Tue 29 Dec 2009 23:40:21 +0100, a écrit : On Tue, Dec 29, 2009 at 11:10:00PM +0100, Samuel Thibault wrote: That's the same. To make it clearer, see attached patch. I meant in the other branch of the if (though for some reason I didn't consider the server part's if). Err, what for? In that case, dead name notification isn't requested, is it? Or is it always requested by default and ports_interrupt_self_on_port_death() is just to change the behavior? (I'm a complete newbie for such parts of the Hurd...) Samuel
Re: Reauthentication implementation flaw due to EINTR
Hi, On 09-12-22 上午3:43, Samuel Thibault wrote: Hello, I had been noticing odd issues with sudo when it calls setresuid such, it took me some time to understand that there was a flaw in the reauthentication implementation: sudo calls setresuid(), which calls setauth(), which (for each FD such) allocates a rendez-vous port, calls io_reauthenticate() (RPC in the underlying FS which calls the auth_server_authenticate() RPC) and calls the auth_user_authenticate() RPC. These last two RPCs end up in auth, which uses the rendez-vous port passed along to make the match. Whichever arrives first leaves information and a condition variable for the other ; when the latter arrives, it fills its information and wakes the former. The issue is that currently, once the user part gets its passthrough port from the server part, it returns immediately, and setauth() drops the rendez-vous port, which actually interrupts the server RPCs because the rendez-vous sender right becomes dead. Quite often scheduling makes it so that the user is not so fast and the server has time to finish its duties, but due to the high usage of setresuid in sudo, one every few tens of sudo calls fail. The fix I'll use at least on the Debian buildds for now is to make the auth_user_authenticate() RPC always wait for auth_server_authenticate() to have called auth_server_authenticate_reply() before returning. I've been running that in a tight loop the whole afternoon with no issue, so at least it seems to work much better. However, I'd prefer to make sure that it works _always_ :) I wonder if this bug causes my modified subhurd to hang in the place where `login` is executed. In the modified subhurd, all RPCs are forwarded to `boot`. Maybe it magnifies the problem here. Samuel, could you send me your fix? I want to see if your patch can make it work. Thanks, Zheng Da
Re: Reauthentication implementation flaw due to EINTR
On 09-12-27 上午1:47, Carl Fredrik Hammar wrote: Is it the code below from S_auth_server_authenticate the problem? /* Store the new port and wait for the user RPC to wake us up. */ s.passthrough = newport; condition_init (s.wakeup); ports_interrupt_self_on_port_death (serverauth, rendezvous); The reason we need this is to avoid this thread hangs forever if the client somehow dies? Zheng Da
Re: Reauthentication implementation flaw due to EINTR
Hi, On 09-12-28 上午3:56, Carl Fredrik Hammar wrote: OK, I think I have a vague picture of what is going on: ports_interrupt_self_on_port_death ports_interrupt_self_on_notification ports_interrupt_rpc_on_notification, which requests notification (to the same port as auth_server_authenticate). When rendezvous port dies we get the notification: ports_notify_server ports_do_mach_notify_dead_name ports_dead_name ports_interrupt_notified_rpcs hurd_thread_cancel (in glibc) _hurdsig_abort_rpcs, which does some funky stuff that seems to hijack any pending RPCs to make them return EINTR. So the registered deadname notification can cancel the RPC (i.e, auth_server_authenticate RPC is this case)? and the thread will somehow jump to some place to return EINTR and doesn't execute hurd_condition_wait() and the code below it? If the notification request is canceled before auth_user_authenticate returns there should be no problem. I don't know how to do that, as libports doesn't seem to have a function for it. The primitive function would be mach_port_request_notification with a null port, but I assume there is other stuff to clean up. Is it OK to extend libports to have some function to cancel the notification request? Zheng Da
Re: Reauthentication implementation flaw due to EINTR
Da Zheng, le Mon 28 Dec 2009 17:58:47 +0800, a écrit : On 09-12-27 上午1:47, Carl Fredrik Hammar wrote: Is it the code below from S_auth_server_authenticate the problem? /* Store the new port and wait for the user RPC to wake us up. */ s.passthrough = newport; condition_init (s.wakeup); ports_interrupt_self_on_port_death (serverauth, rendezvous); The reason we need this is to avoid this thread hangs forever if the client somehow dies? Yes. Samuel
Re: Reauthentication implementation flaw due to EINTR
Da Zheng, le Mon 28 Dec 2009 17:59:55 +0800, a écrit : On 09-12-28 上午3:56, Carl Fredrik Hammar wrote: OK, I think I have a vague picture of what is going on: ports_interrupt_self_on_port_death ports_interrupt_self_on_notification ports_interrupt_rpc_on_notification, which requests notification (to the same port as auth_server_authenticate). When rendezvous port dies we get the notification: ports_notify_server ports_do_mach_notify_dead_name ports_dead_name ports_interrupt_notified_rpcs hurd_thread_cancel (in glibc) _hurdsig_abort_rpcs, which does some funky stuff that seems to hijack any pending RPCs to make them return EINTR. So the registered deadname notification can cancel the RPC (i.e, auth_server_authenticate RPC is this case)? and the thread will somehow jump to some place to return EINTR and doesn't execute hurd_condition_wait() and the code below it? It does, but that's not a problem in the case of auth now that I have patched it. Where this is a problem is in ext2fs which was calling auth_server_authenticate(). That function gets an EINTR from mach_msg and returns it. If the notification request is canceled before auth_user_authenticate returns there should be no problem. I don't know how to do that, as libports doesn't seem to have a function for it. The primitive function would be mach_port_request_notification with a null port, but I assume there is other stuff to clean up. Is it OK to extend libports to have some function to cancel the notification request? I actually believe that's already the case: libports remembers the notification request and disables it on RPC termination, which should be enough for itself. What I still don't understand is the link between auth requesting a notification and ext2fs also getting the EINTR. Samuel
Re: Reauthentication implementation flaw due to EINTR
Da Zheng, le Mon 28 Dec 2009 17:58:26 +0800, a écrit : Samuel, could you send me your fix? Here it is. Samuel diff --git a/auth/auth.c b/auth/auth.c index 2afeaf7..11db0f8 100644 --- a/auth/auth.c +++ b/auth/auth.c @@ -251,11 +251,22 @@ S_auth_makeauth (struct authhandle *auth, /* Transaction handling. */ -/* A pending transaction. */ -struct pending +/* Since the user is responsible for freeing the rendezvous port, it has to + * wait for the server to have finished transmitting uids. + * + * The server thus waits for the user to give it uids (unless it was already + * there), transmits them and provides the passthrough port. + * + * The user gives the uids and waits for the passthrough port from the server. + * + * If the user is early, it has to tell the server it arrived. + */ + +/* A pending user. */ +struct pending_user { -hurd_ihash_locp_t locp;/* Position in one of the ihash tables. */ -struct condition wakeup;/* The waiter is blocked on this condition. */ +hurd_ihash_locp_t locp;/* Position in the pending_users ihash table. */ +struct condition wakeup;/* The reader is blocked on this condition. */ /* The user's auth handle. */ struct authhandle *user; @@ -264,11 +275,18 @@ struct pending mach_port_t passthrough; }; +/* A pending server. */ +struct pending_server + { +hurd_ihash_locp_t locp;/* Position in the pending_servers ihash table. */ +struct condition wakeup;/* The server is blocked on this condition. */ + }; + /* Table of pending transactions keyed on RENDEZVOUS. */ struct hurd_ihash pending_users - = HURD_IHASH_INITIALIZER (offsetof (struct pending, locp)); + = HURD_IHASH_INITIALIZER (offsetof (struct pending_user, locp)); struct hurd_ihash pending_servers - = HURD_IHASH_INITIALIZER (offsetof (struct pending, locp)); + = HURD_IHASH_INITIALIZER (offsetof (struct pending_server, locp)); struct mutex pending_lock = MUTEX_INITIALIZER; /* Implement auth_user_authenticate as described in hurd/auth.defs. */ @@ -280,7 +298,9 @@ S_auth_user_authenticate (struct authhandle *userauth, mach_port_t *newport, mach_msg_type_name_t *newporttype) { - struct pending *s; + struct pending_server *s; + struct pending_user u; + error_t err; if (! userauth) return EOPNOTSUPP; @@ -288,65 +308,54 @@ S_auth_user_authenticate (struct authhandle *userauth, if (rendezvous == MACH_PORT_DEAD) /* Port died in transit. */ return EINVAL; + u.user = userauth; + condition_init (u.wakeup); + mutex_lock (pending_lock); - /* Look for this port in the server list. */ - s = hurd_ihash_find (pending_servers, rendezvous); - if (s) -{ - /* Found it! Extract the port. */ - *newport = s-passthrough; - *newporttype = MACH_MSG_TYPE_MOVE_SEND; + err = hurd_ihash_add (pending_users, rendezvous, u); + if (err) { +mutex_unlock (pending_lock); +return err; + } - /* Remove it from the pending list. */ - hurd_ihash_locp_remove (pending_servers, s-locp); + /* Give the server the auth port. + We need to add a ref in case the port dies. */ + ports_port_ref (userauth); - /* Give the server the auth port and wake the RPC up. -We need to add a ref in case the port dies. */ - s-user = userauth; - ports_port_ref (userauth); + /* Look for this rendezvous in the server list. */ + s = hurd_ihash_find (pending_servers, rendezvous); + if (s) { +/* Found it! */ - condition_signal (s-wakeup); - mutex_unlock (pending_lock); +/* Remove it from the pending list. */ +hurd_ihash_locp_remove (pending_servers, s-locp); - mach_port_deallocate (mach_task_self (), rendezvous); - return 0; -} - else +/* Tell it we eventually arrived. */ +condition_signal (s-wakeup); + } + + ports_interrupt_self_on_port_death (userauth, rendezvous); + /* Wait for server answer. */ + if (hurd_condition_wait (u.wakeup, pending_lock) + hurd_ihash_find (pending_users, rendezvous)) +/* We were interrupted; remove our record. */ { - /* No pending server RPC for this port. -Create a pending user RPC record. */ - struct pending u; - error_t err; + hurd_ihash_locp_remove (pending_users, u.locp); + err = EINTR; +} - err = hurd_ihash_add (pending_users, rendezvous, u); - if (! err) - { - /* Store the user auth port and wait for the server RPC to wake - us up. */ - u.user = userauth; - condition_init (u.wakeup); - ports_interrupt_self_on_port_death (userauth, rendezvous); - if (hurd_condition_wait (u.wakeup, pending_lock) - hurd_ihash_find (pending_users, rendezvous)) - /* We were interrupted; remove our record. */ - { - hurd_ihash_locp_remove (pending_users, u.locp); - err = EINTR; -
Re: Reauthentication implementation flaw due to EINTR
Samuel Thibault, le Mon 28 Dec 2009 11:14:04 +0100, a écrit : Da Zheng, le Mon 28 Dec 2009 17:59:55 +0800, a écrit : On 09-12-28 上午3:56, Carl Fredrik Hammar wrote: OK, I think I have a vague picture of what is going on: ports_interrupt_self_on_port_death ports_interrupt_self_on_notification ports_interrupt_rpc_on_notification, which requests notification (to the same port as auth_server_authenticate). When rendezvous port dies we get the notification: ports_notify_server ports_do_mach_notify_dead_name ports_dead_name ports_interrupt_notified_rpcs hurd_thread_cancel (in glibc) _hurdsig_abort_rpcs, which does some funky stuff that seems to hijack any pending RPCs to make them return EINTR. So the registered deadname notification can cancel the RPC (i.e, auth_server_authenticate RPC is this case)? and the thread will somehow jump to some place to return EINTR and doesn't execute hurd_condition_wait() and the code below it? It does, but that's not a problem in the case of auth now that I have patched it. Err, no, I mean the RPC is interrupted and hurd_condition_wait returns 1 instead of 0, but that's not a problem etc. Samuel
Re: Reauthentication implementation flaw due to EINTR
On 09-12-28 下午6:14, Samuel Thibault wrote: Da Zheng, le Mon 28 Dec 2009 17:59:55 +0800, a écrit : On 09-12-28 上午3:56, Carl Fredrik Hammar wrote: OK, I think I have a vague picture of what is going on: ports_interrupt_self_on_port_death ports_interrupt_self_on_notification ports_interrupt_rpc_on_notification, which requests notification (to the same port as auth_server_authenticate). When rendezvous port dies we get the notification: ports_notify_server ports_do_mach_notify_dead_name ports_dead_name ports_interrupt_notified_rpcs hurd_thread_cancel (in glibc) _hurdsig_abort_rpcs, which does some funky stuff that seems to hijack any pending RPCs to make them return EINTR. So the registered deadname notification can cancel the RPC (i.e, auth_server_authenticate RPC is this case)? and the thread will somehow jump to some place to return EINTR and doesn't execute hurd_condition_wait() and the code below it? It does, but that's not a problem in the case of auth now that I have patched it. My question was where the thread jumps to when it is interrupted by the deadname notification. I thought the thread directly jumps to a place to return EINTR instead of executing the code below hurd_condition_wait(). That is, the deadname notification doesn't interrupt hurd_condition_wait(), but interrupt the whole auth_server_authenticate RPC. My understand may be wrong. I just see the comment in ports_interrupt_self_on_notification: Arrange for hurd_cancel to be called on the current thread, which should be an rpc on OBJECT, if PORT gets notified with the condition WHAT. OBJECT is userauth and PORT is the rendezvous port in our case. Doesn't it mean the RPC on userauth (namely, auth_user_authenticate) is to be canceled? Where this is a problem is in ext2fs which was calling auth_server_authenticate(). That function gets an EINTR from mach_msg and returns it. If the notification request is canceled before auth_user_authenticate returns there should be no problem. I don't know how to do that, as libports doesn't seem to have a function for it. The primitive function would be mach_port_request_notification with a null port, but I assume there is other stuff to clean up. Is it OK to extend libports to have some function to cancel the notification request? I actually believe that's already the case: libports remembers the notification request and disables it on RPC termination, which should be enough for itself. What I still don't understand is the link between auth requesting a notification and ext2fs also getting the EINTR. As Fredrik said and as I understand, while the server side of auth_server_authenticate RPC is waiting for the signal from the auth_user_authenticate, auth_server_authenticate RPC can be canceled by the deadname notification if the client gets scheduled first and destroys the rendezvous port and thus the auth_server_authenticate RPC returns EINTR. The right thing to do is that before the server side of auth_user_authenticate RPC returns, it cancels the deadname notification. In this case, auth_server_authenticate RPC can never be interrupted by the deadname notification any more even if the rendezvous port is destroyed before auth_server_authenticate RPC gets its chance to run. Zheng Da
Re: Reauthentication implementation flaw due to EINTR
Da Zheng, le Mon 28 Dec 2009 19:03:26 +0800, a écrit : Doesn't it mean the RPC on userauth (namely, auth_user_authenticate) is to be canceled? No, here the auth_user_authenticate() RPC is over (else the initiator wouldn't have destroyed the rendezvous port). The RPC that could get canceled is auth_server_authenticate(). Mmm, that makes me realize that this could be the link with ext2fs indeed. I actually believe that's already the case: libports remembers the notification request and disables it on RPC termination, which should be enough for itself. What I still don't understand is the link between auth requesting a notification and ext2fs also getting the EINTR. As Fredrik said and as I understand, while the server side of auth_server_authenticate RPC is waiting for the signal from the auth_user_authenticate, auth_server_authenticate RPC can be canceled by the deadname notification if the client gets scheduled first and destroys the rendezvous port and thus the auth_server_authenticate RPC returns EINTR. Right. The right thing to do is that before the server side of auth_user_authenticate RPC returns, it cancels the deadname notification. Mmm, thinking again about it, while I said in a previous mail that I tried and it failed, that was with the original version, not with my patched version. In the original version, we do let the auth_user_authenticate RPC return before hurd_condition_wait wakes in the auth_server_authenticate RPC server, which will bring the interrupt. In this case, auth_server_authenticate RPC can never be interrupted by the deadname notification any more even if the rendezvous port is destroyed before auth_server_authenticate RPC gets its chance to run. Right. So it should work, good! Samuel
Re: Reauthentication implementation flaw due to EINTR
On 09-12-28 下午7:21, Samuel Thibault wrote: Da Zheng, le Mon 28 Dec 2009 19:03:26 +0800, a écrit : Doesn't it mean the RPC on userauth (namely, auth_user_authenticate) is to be canceled? No, here the auth_user_authenticate() RPC is over (else the initiator wouldn't have destroyed the rendezvous port). What do you mean by that? I was talking about which RPC will be canceled by the deadname notification requested in the server side of auth_server_authenticate. The RPC that could get canceled is auth_server_authenticate(). Mmm, that makes me realize that this could be the link with ext2fs indeed. I actually believe that's already the case: libports remembers the notification request and disables it on RPC termination, which should be enough for itself. What I still don't understand is the link between auth requesting a notification and ext2fs also getting the EINTR. As Fredrik said and as I understand, while the server side of auth_server_authenticate RPC is waiting for the signal from the auth_user_authenticate, auth_server_authenticate RPC can be canceled by the deadname notification if the client gets scheduled first and destroys the rendezvous port and thus the auth_server_authenticate RPC returns EINTR. Right. The right thing to do is that before the server side of auth_user_authenticate RPC returns, it cancels the deadname notification. Mmm, thinking again about it, while I said in a previous mail that I tried and it failed, that was with the original version, not with my patched version. In the original version, we do let the auth_user_authenticate RPC return before hurd_condition_wait wakes in the auth_server_authenticate RPC server, which will bring the interrupt. In this case, auth_server_authenticate RPC can never be interrupted by the deadname notification any more even if the rendezvous port is destroyed before auth_server_authenticate RPC gets its chance to run. Right. So it should work, good! Now the problem is how to cancel the notification request explicitly. Zheng Da
Re: Reauthentication implementation flaw due to EINTR
Da Zheng, le Mon 28 Dec 2009 19:43:11 +0800, a écrit : On 09-12-28 下午7:21, Samuel Thibault wrote: Da Zheng, le Mon 28 Dec 2009 19:03:26 +0800, a écrit : Doesn't it mean the RPC on userauth (namely, auth_user_authenticate) is to be canceled? No, here the auth_user_authenticate() RPC is over (else the initiator wouldn't have destroyed the rendezvous port). What do you mean by that? I was talking about which RPC will be canceled by the deadname notification requested in the server side of auth_server_authenticate. Right, but auth_user_authenticate can't be canceled since it's finished by the time the rendezvous port is destroyed. Mmm, thinking again about it, while I said in a previous mail that I tried and it failed, that was with the original version, not with my patched version. In the original version, we do let the auth_user_authenticate RPC return before hurd_condition_wait wakes in the auth_server_authenticate RPC server, which will bring the interrupt. In this case, auth_server_authenticate RPC can never be interrupted by the deadname notification any more even if the rendezvous port is destroyed before auth_server_authenticate RPC gets its chance to run. Right. So it should work, good! Now the problem is how to cancel the notification request explicitly. It should just be a matter of adding the support in libports that requests it from the kernel but also cleans it in the lists, as Frederik said. Samuel
Re: Reauthentication implementation flaw due to EINTR
On Sat, Dec 26, 2009 at 09:12:08PM +0100, Samuel Thibault wrote: I've checked again the result Carl Fredrik Hammar, le Sat 26 Dec 2009 19:58:12 +0100, a écrit : There is this issue as well, which I have fixed already in commit 041baa80 (and indeed seen cases where it helped), but that's not enough, because not only auth gets EINTR here and can fix things, but ext2fs also gets an EINTR but can't able to restart the call in iohelp_reauth since the rendez-vous port is dead and thus gets EINVAL. Why does this happen? I don't know, but I know for sure that auth_server_authenticate returns EINTR in ext2fs (and next call returns EINVAL), even if auth never returned EINTR. OK, I think I have a vague picture of what is going on: ports_interrupt_self_on_port_death ports_interrupt_self_on_notification ports_interrupt_rpc_on_notification, which requests notification (to the same port as auth_server_authenticate). When rendezvous port dies we get the notification: ports_notify_server ports_do_mach_notify_dead_name ports_dead_name ports_interrupt_notified_rpcs hurd_thread_cancel (in glibc) _hurdsig_abort_rpcs, which does some funky stuff that seems to hijack any pending RPCs to make them return EINTR. If the notification request is canceled before auth_user_authenticate returns there should be no problem. I don't know how to do that, as libports doesn't seem to have a function for it. The primitive function would be mach_port_request_notification with a null port, but I assume there is other stuff to clean up. Alternatively, the notification can be handled separately by the auth server, perhaps by overriding libports's notify server. Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
On Sun, Dec 27, 2009 at 08:56:21PM +0100, Carl Fredrik Hammar wrote: On Sat, Dec 26, 2009 at 09:12:08PM +0100, Samuel Thibault wrote: I've checked again the result Carl Fredrik Hammar, le Sat 26 Dec 2009 19:58:12 +0100, a écrit : There is this issue as well, which I have fixed already in commit 041baa80 (and indeed seen cases where it helped), but that's not enough, because not only auth gets EINTR here and can fix things, but ext2fs also gets an EINTR but can't able to restart the call in iohelp_reauth since the rendez-vous port is dead and thus gets EINVAL. Why does this happen? I don't know, but I know for sure that auth_server_authenticate returns EINTR in ext2fs (and next call returns EINVAL), even if auth never returned EINTR. OK, I think I have a vague picture of what is going on: ports_interrupt_self_on_port_death ports_interrupt_self_on_notification ports_interrupt_rpc_on_notification, which requests notification (to the same port as auth_server_authenticate). When rendezvous port dies we get the notification: ports_notify_server ports_do_mach_notify_dead_name ports_dead_name ports_interrupt_notified_rpcs hurd_thread_cancel (in glibc) _hurdsig_abort_rpcs, which does some funky stuff that seems to hijack any pending RPCs to make them return EINTR. If the notification request is canceled before auth_user_authenticate returns there should be no problem. I don't know how to do that, as libports doesn't seem to have a function for it. The primitive function would be mach_port_request_notification with a null port, but I assume there is other stuff to clean up. Alternatively, the notification can be handled separately by the auth server, perhaps by overriding libports's notify server. Oh sorry, I did not think through how this affects your original solution. AFAICT, after auth_server_authenticate_reply returns _hurdsig_abort_rpcs can not hijack it, so your solution should work. But all this is a bit over my head so I can't tell for sure. Canceling the notification would be safer in this respect, if you can figure out how to do it. Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Sun 27 Dec 2009 22:24:24 +0100, a écrit : AFAICT, after auth_server_authenticate_reply returns _hurdsig_abort_rpcs can not hijack it, so your solution should work. I've checked more in the kernel internals, maybe there is a race window, even if very unlikely. ext2fs is in auth_server_authenticate()'s receive part, let's assume it's already queued on the receive mqueue-imq_threads from ipc_mqueue_receive(). In that case, auth_server_authenticate_reply() finds it and sets ith_state to MACH_MSG_SUCCESS. When ipc_mqueue_receive wakes up, it finds that first, before checking whether ith_wait_result is THREAD_INTERRUPTED, so we win. However, can it be that ext2fs may have sent its message (triggering everything in auth etc.) but not yet blocked in ipc_mqueue_receive() when the auth_server_authenticate_reply comes? In that case, it may not have the time to receive the message before auth_user_authenticate() returns and the reply port gets dead, triggering an interrupt. I can see locks on spaces and ports, but in mach_msg_trap, it seems that at label slow_get_rcv_port we don't have any lock left on send part. Samuel
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Sun 27 Dec 2009 20:56:21 +0100, a écrit : If the notification request is canceled before auth_user_authenticate returns there should be no problem. Ok, I see. Samuel
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Sun 27 Dec 2009 20:56:21 +0100, a écrit : If the notification request is canceled before auth_user_authenticate returns there should be no problem. I tried a crude error_t err2 = mach_port_request_notification(mach_task_self(), rendezvous, MACH_NOTIFY_DEAD_NAME, 0, MACH_PORT_NULL, MACH_MSG_TYPE_MAKE_SEND_ONCE, old); after the complete if (hurd_condition_wait (s.wakeup, pending_lock) statement, with no luck: ext2fs got an EINTR anyway. Samuel
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Sun 27 Dec 2009 22:24:24 +0100, a écrit : OK, I think I have a vague picture of what is going on: ports_interrupt_self_on_port_death ports_interrupt_self_on_notification ports_interrupt_rpc_on_notification, which requests notification (to the same port as auth_server_authenticate). But in another task (auth vs ext2fs). That's where I've still not found what makes ext2fs return EINTR. When rendezvous port dies we get the notification: Here we seemingly see ext2fs get the notification, but why? ports_notify_server ports_do_mach_notify_dead_name ports_dead_name ports_interrupt_notified_rpcs hurd_thread_cancel (in glibc) _hurdsig_abort_rpcs, which does some funky stuff that seems to hijack any pending RPCs to make them return EINTR. Samuel
Re: Reauthentication implementation flaw due to EINTR
Hi, On Mon, Dec 21, 2009 at 08:43:12PM +0100, Samuel Thibault wrote: I had been noticing odd issues with sudo when it calls setresuid such, it took me some time to understand that there was a flaw in the reauthentication implementation: sudo calls setresuid(), which calls setauth(), which (for each FD such) allocates a rendez-vous port, calls io_reauthenticate() (RPC in the underlying FS which calls the auth_server_authenticate() RPC) and calls the auth_user_authenticate() RPC. These last two RPCs end up in auth, which uses the rendez-vous port passed along to make the match. Whichever arrives first leaves information and a condition variable for the other ; when the latter arrives, it fills its information and wakes the former. The issue is that currently, once the user part gets its passthrough port from the server part, it returns immediately, and setauth() drops the rendez-vous port, which actually interrupts the server RPCs because the rendez-vous sender right becomes dead. Quite often scheduling makes it so that the user is not so fast and the server has time to finish its duties, but due to the high usage of setresuid in sudo, one every few tens of sudo calls fail. Is it the code below from S_auth_server_authenticate the problem? /* Store the new port and wait for the user RPC to wake us up. */ s.passthrough = newport; condition_init (s.wakeup); ports_interrupt_self_on_port_death (serverauth, rendezvous); if (hurd_condition_wait (s.wakeup, pending_lock)) /* We were interrupted; remove our record. */ { hurd_ihash_locp_remove (pending_servers, s.locp); err = EINTR; } That is, does hurd_condition_wait get canceled even though the condition was signaled before it was canceled? The fix I'll use at least on the Debian buildds for now is to make the auth_user_authenticate() RPC always wait for auth_server_authenticate() to have called auth_server_authenticate_reply() before returning. I've been running that in a tight loop the whole afternoon with no issue, so at least it seems to work much better. However, I'd prefer to make sure that it works _always_ :) So my question is: is it sufficient to make the user part wait for auth_server_authenticate_reply() call completion before freeing the rendez-vous port, to make sure that auth_server_authenticate() will never return EINTR because of the death of the rendez-vous port? Of course, the rendez-vous port can become dead in the io_reauthenticate() RPC, but that shouldn't be a problem. If what I said above is correct then this would indeed fix it. The question is if there are better ways to fix it. Perhaps trying to determine if the condition was signaled before the interrupt? Though I don't know if can be interrupted for other reasons and should always return EINTR in those cases. Perhaps the mistake is using interrupts as a signal for port deaths in the first place? And bonus question: are there other places where we have such rendez-vous port which might become dead too early and that would need the same fix? Don't know. Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
Carl Fredrik Hammar, le Sat 26 Dec 2009 18:47:51 +0100, a écrit : Is it the code below from S_auth_server_authenticate the problem? /* Store the new port and wait for the user RPC to wake us up. */ s.passthrough = newport; condition_init (s.wakeup); ports_interrupt_self_on_port_death (serverauth, rendezvous); if (hurd_condition_wait (s.wakeup, pending_lock)) /* We were interrupted; remove our record. */ { hurd_ihash_locp_remove (pending_servers, s.locp); err = EINTR; } That is, does hurd_condition_wait get canceled even though the condition was signaled before it was canceled? There is this issue as well, which I have fixed already in commit 041baa80 (and indeed seen cases where it helped), but that's not enough, because not only auth gets EINTR here and can fix things, but ext2fs also gets an EINTR but can't able to restart the call in iohelp_reauth since the rendez-vous port is dead and thus gets EINVAL. Samuel
Re: Reauthentication implementation flaw due to EINTR
On Sat, Dec 26, 2009 at 06:56:48PM +0100, Samuel Thibault wrote: Carl Fredrik Hammar, le Sat 26 Dec 2009 18:47:51 +0100, a écrit : Is it the code below from S_auth_server_authenticate the problem? /* Store the new port and wait for the user RPC to wake us up. */ s.passthrough = newport; condition_init (s.wakeup); ports_interrupt_self_on_port_death (serverauth, rendezvous); if (hurd_condition_wait (s.wakeup, pending_lock)) /* We were interrupted; remove our record. */ { hurd_ihash_locp_remove (pending_servers, s.locp); err = EINTR; } That is, does hurd_condition_wait get canceled even though the condition was signaled before it was canceled? There is this issue as well, which I have fixed already in commit 041baa80 (and indeed seen cases where it helped), but that's not enough, because not only auth gets EINTR here and can fix things, but ext2fs also gets an EINTR but can't able to restart the call in iohelp_reauth since the rendez-vous port is dead and thus gets EINVAL. Why does this happen? I can't find any setup code for getting interrupts like in auth. Is it perhaps a feature of MIG or Mach? Can't it be turned off? Does the translator get interrupted even if the rendezvous port dies between the time auth sends its reply, and the translator receives it? Your fix wouldn't handle such a case, I think. If this is a problem, I think the only way to fix it is at the protocol level. Say if io_reauthenticate actually waits for a reply, which would signal that the translator is done using the rendezvous. That is: Client: Server: Auth: auth_user_auth - ... io_reauth -- auth_server_auth - ... ... io_reauth_reply - auth_server_auth_reply ... auth_user_auth_reply This would be a more robust protocol IMHO. It could even handle the case where client and server uses different auth servers for some reason (which is actually what made me think of it originally). The implementation of auth could also be simplified since it can assume that the client contacts it before the server. Of course, changing the auth protocol would lead to painful upgrades for users, so it might not be worth it. Regards, Fredrik
Re: Reauthentication implementation flaw due to EINTR
I've checked again the result Carl Fredrik Hammar, le Sat 26 Dec 2009 19:58:12 +0100, a écrit : There is this issue as well, which I have fixed already in commit 041baa80 (and indeed seen cases where it helped), but that's not enough, because not only auth gets EINTR here and can fix things, but ext2fs also gets an EINTR but can't able to restart the call in iohelp_reauth since the rendez-vous port is dead and thus gets EINVAL. Why does this happen? I don't know, but I know for sure that auth_server_authenticate returns EINTR in ext2fs (and next call returns EINVAL), even if auth never returned EINTR. I can't find any setup code for getting interrupts like in auth. Is it perhaps a feature of MIG or Mach? Possibly. Samuel