Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Sun, 24 Feb 2008 11:29:06 +0100 [EMAIL PROTECTED] (Lars Munch) babbled:

> On Sun, Feb 24, 2008 at 07:48:03PM +1100, Carsten Haitzler wrote:
> > On Sun, 24 Feb 2008 15:43:18 +1100 Carsten Haitzler (The Rasterman) [EMAIL PROTECTED] babbled:
> >
> > actually - found a problem. breaks entrance it seems and enlightenment when init is enabled! :) back!
>
> Thanks for testing, too bad it didn't work out as expected. I do not use entrance and have init disabled in enlightenment, so everything was working flawlessly here :/
>
> Anyway, I just did some more testing. It seems that using pselect we have a bigger chance of losing signals. If we get the same signal twice, while not waiting in the pselect call, then only one signal will be handled at the time pselect is called. I guess this could cause the breakage. Do you think that's the issue (I don't know the entrance nor the init code) ? I have no idea how to solve this, except for going back to the pipe solution.

entrance and e both use signals. entranced waits for SIGUSR1 from x to know x is ready. e waits for something similar with a pause() from the init splash process. no signal ever arrives. pause() is never interrupted :)

--
- Codito, ergo sum - I code, therefore I am --
The Rasterman (Carsten Haitzler) [EMAIL PROTECTED]

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Sun, 24 Feb 2008 15:43:18 +1100 Carsten Haitzler (The Rasterman) [EMAIL PROTECTED] babbled:

actually - found a problem. breaks entrance it seems and enlightenment when init is enabled! :) back!

> On Tue, 29 Jan 2008 15:03:45 +0100 [EMAIL PROTECTED] (Lars Munch) babbled:
>
> > On Sat, Jan 26, 2008 at 01:16:24PM -0600, Nathan Ingersoll wrote:
> > > I checked the man page for Mac OS X as well. Looks like pselect() comes from FreeBSD in that case.
> >
> > Ok, you got me convinced. Attached is a pselect version of the race fix.
> >
> > Two questions remain:
> >
> > 1. Do we want to keep all signals blocked except in the pselect call or do we want to unblock signals after the pselect call?
> >
> > 2. pselect breaks the win32 port. what is the best way to handle this? implement our own pselect for win32 using select or use #ifdef's ?
>
> well i've applied this locally and am testing - it seems to work just fine. i'm probably going to commit this to cvs today and let others then play - we can revert/fix if we find something, but overall it seems to work.
>
> --
> - Codito, ergo sum - I code, therefore I am --
> The Rasterman (Carsten Haitzler) [EMAIL PROTECTED]

--
- Codito, ergo sum - I code, therefore I am --
The Rasterman (Carsten Haitzler) [EMAIL PROTECTED]
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Sun, Feb 24, 2008 at 07:48:03PM +1100, Carsten Haitzler wrote:
> On Sun, 24 Feb 2008 15:43:18 +1100 Carsten Haitzler (The Rasterman) [EMAIL PROTECTED] babbled:
>
> actually - found a problem. breaks entrance it seems and enlightenment when init is enabled! :) back!

Thanks for testing, too bad it didn't work out as expected. I do not use entrance and have init disabled in enlightenment, so everything was working flawlessly here :/

Anyway, I just did some more testing. It seems that using pselect we have a bigger chance of losing signals. If we get the same signal twice, while not waiting in the pselect call, then only one signal will be handled at the time pselect is called. I guess this could cause the breakage. Do you think that's the issue (I don't know the entrance nor the init code) ? I have no idea how to solve this, except for going back to the pipe solution.

--
Lars Munch
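The signal loss Lars describes can be reproduced outside ecore. The following is a minimal sketch, assuming Linux semantics for standard (non-realtime) signals, where a signal that is already pending is not queued a second time. The names `coalesced_deliveries` and `on_usr1` are made up for this illustration and do not exist in ecore.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counts handler invocations; sig_atomic_t is safe to touch in a handler. */
static volatile sig_atomic_t handled = 0;

static void on_usr1(int signo)
{
    (void)signo;
    handled++;
}

/* Hypothetical helper: sends SIGUSR1 twice while it is blocked and
 * returns how many times the handler ran once it was unblocked.
 * Returns -1 if the setup calls fail. */
int coalesced_deliveries(void)
{
    struct sigaction sa;
    sigset_t block, old;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0) return -1;

    handled = 0;
    raise(SIGUSR1);   /* blocked, so it becomes pending */
    raise(SIGUSR1);   /* already pending: merged with the first one */

    /* Restoring the old mask delivers whatever is pending. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return (int)handled;
}
```

In a single-threaded Linux process that starts with SIGUSR1 unblocked, the handler runs once rather than twice, which is why a counter incremented inside the handler undercounts when signals stay blocked for most of the main loop.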
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Tue, 29 Jan 2008 15:03:45 +0100 [EMAIL PROTECTED] (Lars Munch) babbled:

> On Sat, Jan 26, 2008 at 01:16:24PM -0600, Nathan Ingersoll wrote:
> > I checked the man page for Mac OS X as well. Looks like pselect() comes from FreeBSD in that case.
>
> Ok, you got me convinced. Attached is a pselect version of the race fix.
>
> Two questions remain:
>
> 1. Do we want to keep all signals blocked except in the pselect call or do we want to unblock signals after the pselect call?
>
> 2. pselect breaks the win32 port. what is the best way to handle this? implement our own pselect for win32 using select or use #ifdef's ?

well i've applied this locally and am testing - it seems to work just fine. i'm probably going to commit this to cvs today and let others then play - we can revert/fix if we find something, but overall it seems to work.

--
- Codito, ergo sum - I code, therefore I am --
The Rasterman (Carsten Haitzler) [EMAIL PROTECTED]
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Jan 29, 2008 8:03 AM, Lars Munch [EMAIL PROTECTED] wrote:
> On Sat, Jan 26, 2008 at 01:16:24PM -0600, Nathan Ingersoll wrote:
> > I checked the man page for Mac OS X as well. Looks like pselect() comes from FreeBSD in that case.
>
> Ok, you got me convinced. Attached is a pselect version of the race fix.
>
> Two questions remain:
>
> 1. Do we want to keep all signals blocked except in the pselect call or do we want to unblock signals after the pselect call?

You don't want to maintain the block through the pselect call because signals should cause it to return before its expiration time is reached, so I would block except when inside pselect.

> 2. pselect breaks the win32 port. what is the best way to handle this? implement our own pselect for win32 using select or use #ifdef's ?

I believe Vincent is working on a library of win32 work-arounds, maybe he can propose an alternative call for that case. If not, then the pipe method described previously may need to be implemented for win32, or we live with a race condition by emulating pselect like glibc does.

Thanks,
Nathan
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
I don't see why not, as it should be limited to a single package. Do you know of anything else that needs to handle this race condition between select and signals?

On Jan 29, 2008 10:26 AM, Vincent Torri [EMAIL PROTECTED] wrote:
> On Tue, 29 Jan 2008, Nathan Ingersoll wrote:
> > Oh, even better. I would probably just add a check to the configure.in for pselect, and then ifdef to select without the signal mask.
>
> will Mike accept all those #defines ? :p
>
> Vincent
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Tue, 29 Jan 2008, Nathan Ingersoll wrote:
> I don't see why not, as it should be limited to a single package. Do you know of anything else that needs to handle this race condition between select and signals?

well, as we discussed on irc, according to Bart Massey, there might be a race condition with xcb_poll_for_event in ecore_xcb, but I don't know if it's the same kind of race or not.

Vincent

> On Jan 29, 2008 10:26 AM, Vincent Torri [EMAIL PROTECTED] wrote:
> > On Tue, 29 Jan 2008, Nathan Ingersoll wrote:
> > > Oh, even better. I would probably just add a check to the configure.in for pselect, and then ifdef to select without the signal mask.
> >
> > will Mike accept all those #defines ? :p
> >
> > Vincent
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Tue, Jan 29, 2008 at 09:28:11AM -0600, Nathan Ingersoll wrote:
> On Jan 29, 2008 9:14 AM, Lars Munch [EMAIL PROTECTED] wrote:
> > Ok, I might have asked the question wrongly. My current patch already calls pselect with an empty set of signals, so signals will only be handled when pselect is called. As an alternative we could do:
> >
> > _ecore_main_select()
> >   1. block signals
> >   2. check signals
> >   3. call pselect with an empty set
> >   4. unblock signals
> >
> > that way we can receive/handle signals at (almost) any time (like we do now).
> >
> > The first method has the benefit that we always handle signals at the same place and that we can remove a lot of the signal checking stuff from the main loop. The second method has the benefit that the signal handler is called almost immediately.
>
> Ah, I understand your question now. When we receive the signals now, don't we defer the delivery of those signals anyways? I lean towards simplifying the code if we can, since signals are asynchronous and shouldn't be counted on to be delivered in a timely manner, just in-order.

Ok, I will stick with the simple solution and clean up all the signaling checks in the main loop.

> > > > 2. pselect breaks the win32 port. what is the best way to handle this? implement our own pselect for win32 using select or use #ifdef's ?
> > >
> > > I believe Vincent is working on a library of win32 work-arounds, maybe he can propose an alternative call for that case. If not, then the pipe method described previously may need to be implemented for win32, or we live with a race condition by emulating pselect like glibc does.
> >
> > Signals do not exist on windows, so there is no race. The question is only if we could use #ifdefs or make a pselect windows function without the signal stuff using select.
>
> Oh, even better. I would probably just add a check to the configure.in for pselect, and then ifdef to select without the signal mask.

Ok, I will try that.

Thanks
Lars Munch
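The configure-check idea discussed here could look roughly like the sketch below. It is an illustration under stated assumptions, not the code that was committed: `HAVE_PSELECT` stands for whatever macro a configure test would define, and `main_loop_wait` is a hypothetical wrapper name. The fallback branch still assumes a POSIX-style `select()` and `<sys/select.h>`, so a real win32 build would need its own headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical wrapper: waits on the fd sets with pselect() when the
 * build system found it, otherwise falls back to plain select().
 * A NULL timeout means "block until something happens" in both cases. */
int main_loop_wait(int nfds, fd_set *rfds, fd_set *wfds, fd_set *exfds,
                   const struct timespec *ts)
{
#ifdef HAVE_PSELECT
    sigset_t emptyset;

    /* Empty mask: no signal stays blocked while waiting. */
    sigemptyset(&emptyset);
    return pselect(nfds, rfds, wfds, exfds, ts, &emptyset);
#else
    struct timeval tv, *tvp = NULL;

    if (ts)
      {
         /* select() takes microseconds, so nanoseconds are truncated. */
         tv.tv_sec = ts->tv_sec;
         tv.tv_usec = ts->tv_nsec / 1000;
         tvp = &tv;
      }
    return select(nfds, rfds, wfds, exfds, tvp);
#endif
}
```

Keeping the `#ifdef` inside one wrapper means the rest of the main loop can build its timeout as a `struct timespec` on every platform.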
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Jan 29, 2008 11:15 AM, Vincent Torri [EMAIL PROTECTED] wrote:
> well, as we discussed on irc, according to Bart Massey, there might be a race condition with xcb_poll_for_event in ecore_xcb, but I don't know if it's the same kind of race or not.
>
> Vincent

It's difficult to know because his response was so vague. I would doubt that it would be the same race with signals, as I don't see a good reason for XCB to be managing signal handling.
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Sat, Jan 26, 2008 at 01:16:24PM -0600, Nathan Ingersoll wrote:
> I checked the man page for Mac OS X as well. Looks like pselect() comes from FreeBSD in that case.

Ok, you got me convinced. Attached is a pselect version of the race fix.

Two questions remain:

1. Do we want to keep all signals blocked except in the pselect call or do we want to unblock signals after the pselect call?

2. pselect breaks the win32 port. what is the best way to handle this? implement our own pselect for win32 using select or use #ifdef's ?

Thanks
Lars Munch

Index: src/lib/ecore/ecore_main.c
===================================================================
RCS file: /var/cvs/e/e17/libs/ecore/src/lib/ecore/ecore_main.c,v
retrieving revision 1.33
diff -u -r1.33 ecore_main.c
--- src/lib/ecore/ecore_main.c  26 Jan 2008 10:11:48 -0000  1.33
+++ src/lib/ecore/ecore_main.c  29 Jan 2008 13:39:27 -0000
@@ -284,34 +284,35 @@
 static int
 _ecore_main_select(double timeout)
 {
-   struct timeval tv, *t;
-   fd_set         rfds, wfds, exfds;
-   int            max_fd;
-   int            ret;
-   Ecore_List2    *l;
+   sigset_t        emptyset;
+   struct timespec ts, *t;
+   fd_set          rfds, wfds, exfds;
+   int             max_fd;
+   int             ret;
+   Ecore_List2    *l;

    t = NULL;
    if ((!finite(timeout)) || (timeout == 0.0))  /* finite() tests for NaN, too big, too small, and infinity.  */
      {
-        tv.tv_sec = 0;
-        tv.tv_usec = 0;
-        t = &tv;
+        ts.tv_sec = 0;
+        ts.tv_nsec = 0;
+        t = &ts;
      }
    else if (timeout > 0.0)
      {
-        int sec, usec;
+        int sec, nsec;

 #ifdef FIX_HZ
         timeout += (0.5 / HZ);
         sec = (int)timeout;
-        usec = (int)((timeout - (double)sec) * 1000000);
+        nsec = (int)((timeout - (double)sec) * 1000000000);
 #else
         sec = (int)timeout;
-        usec = (int)((timeout - (double)sec) * 1000000);
+        nsec = (int)((timeout - (double)sec) * 1000000000);
 #endif
-        tv.tv_sec = sec;
-        tv.tv_usec = usec;
-        t = &tv;
+        ts.tv_sec = sec;
+        ts.tv_nsec = nsec;
+        t = &ts;
      }
    max_fd = 0;
    FD_ZERO(&rfds);
@@ -350,7 +351,9 @@
           }
      }
    if (_ecore_signal_count_get()) return -1;
-   ret = select(max_fd + 1, &rfds, &wfds, &exfds, t);
+   sigemptyset(&emptyset);
+   ret = pselect(max_fd + 1, &rfds, &wfds, &exfds, t, &emptyset);
+
    if (ret < 0)
      {
         if (errno == EINTR) return -1;
Index: src/lib/ecore/ecore_signal.c
===================================================================
RCS file: /var/cvs/e/e17/libs/ecore/src/lib/ecore/ecore_signal.c,v
retrieving revision 1.35
diff -u -r1.35 ecore_signal.c
--- src/lib/ecore/ecore_signal.c  26 Aug 2007 11:17:21 -0000  1.35
+++ src/lib/ecore/ecore_signal.c  29 Jan 2008 13:39:27 -0000
@@ -113,10 +113,37 @@
 void
 _ecore_signal_init(void)
 {
+   sigset_t blockset;
+   int ret;
 #ifdef SIGRTMIN
    int i, num = SIGRTMAX - SIGRTMIN;
 #endif

+   sigemptyset(&blockset);
+   sigaddset(&blockset, SIGPIPE);
+   sigaddset(&blockset, SIGALRM);
+   sigaddset(&blockset, SIGCHLD);
+   sigaddset(&blockset, SIGUSR1);
+   sigaddset(&blockset, SIGUSR2);
+   sigaddset(&blockset, SIGHUP);
+   sigaddset(&blockset, SIGQUIT);
+   sigaddset(&blockset, SIGINT);
+   sigaddset(&blockset, SIGTERM);
+#ifdef SIGPWR
+   sigaddset(&blockset, SIGPWR);
+#endif
+
+#ifdef SIGRTMIN
+   for (i = 0; i < num; i++)
+     sigaddset(&blockset, SIGRTMIN + i);
+#endif
+
+   sigprocmask(SIG_BLOCK, &blockset, NULL);
+
    _ecore_signal_callback_set(SIGPIPE, _ecore_signal_callback_ignore);
    _ecore_signal_callback_set(SIGALRM, _ecore_signal_callback_ignore);
    _ecore_signal_callback_set(SIGCHLD, _ecore_signal_callback_sigchld);
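Part of the patch above is the switch from `struct timeval` (microseconds) to `struct timespec` (nanoseconds) for the timeout. A small standalone sketch of that conversion follows, assuming a finite, non-negative timeout in seconds; `seconds_to_timespec` is a hypothetical helper name and the clamping is an addition that the patch itself does not contain.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: splits a timeout given in seconds into the
 * whole-second and nanosecond fields that pselect() expects.
 * The caller is assumed to have rejected NaN, infinity and negatives. */
struct timespec seconds_to_timespec(double timeout)
{
    struct timespec ts;

    ts.tv_sec = (time_t)timeout;   /* truncates the fractional part */
    ts.tv_nsec = (long)((timeout - (double)ts.tv_sec) * 1000000000.0);

    /* Guard against rounding pushing tv_nsec out of [0, 1e9). */
    if (ts.tv_nsec >= 1000000000L)
      {
         ts.tv_sec += 1;
         ts.tv_nsec -= 1000000000L;
      }
    if (ts.tv_nsec < 0) ts.tv_nsec = 0;
    return ts;
}
```

A timeout of 1.5 seconds, for example, maps to `tv_sec = 1` and `tv_nsec = 500000000`.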
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Tue, Jan 29, 2008 at 08:48:42AM -0600, Nathan Ingersoll wrote:
> On Jan 29, 2008 8:03 AM, Lars Munch [EMAIL PROTECTED] wrote:
> > On Sat, Jan 26, 2008 at 01:16:24PM -0600, Nathan Ingersoll wrote:
> > > I checked the man page for Mac OS X as well. Looks like pselect() comes from FreeBSD in that case.
> >
> > Ok, you got me convinced. Attached is a pselect version of the race fix.
> >
> > Two questions remain:
> >
> > 1. Do we want to keep all signals blocked except in the pselect call or do we want to unblock signals after the pselect call?
>
> You don't want to maintain the block through the pselect call because signals should cause it to return before its expiration time is reached, so I would block except when inside pselect.

Ok, I might have asked the question wrongly. My current patch already calls pselect with an empty set of signals, so signals will only be handled when pselect is called. As an alternative we could do:

_ecore_main_select()
  1. block signals
  2. check signals
  3. call pselect with an empty set
  4. unblock signals

that way we can receive/handle signals at (almost) any time (like we do now).

The first method has the benefit that we always handle signals at the same place and that we can remove a lot of the signal checking stuff from the main loop. The second method has the benefit that the signal handler is called almost immediately.

> > 2. pselect breaks the win32 port. what is the best way to handle this? implement our own pselect for win32 using select or use #ifdef's ?
>
> I believe Vincent is working on a library of win32 work-arounds, maybe he can propose an alternative call for that case. If not, then the pipe method described previously may need to be implemented for win32, or we live with a race condition by emulating pselect like glibc does.

Signals do not exist on windows, so there is no race. The question is only if we could use #ifdefs or make a pselect windows function without the signal stuff using select.

Thanks for your fast feedback
Lars Munch
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Jan 29, 2008 9:14 AM, Lars Munch [EMAIL PROTECTED] wrote:
> Ok, I might have asked the question wrongly. My current patch already calls pselect with an empty set of signals, so signals will only be handled when pselect is called. As an alternative we could do:
>
> _ecore_main_select()
>   1. block signals
>   2. check signals
>   3. call pselect with an empty set
>   4. unblock signals
>
> that way we can receive/handle signals at (almost) any time (like we do now).
>
> The first method has the benefit that we always handle signals at the same place and that we can remove a lot of the signal checking stuff from the main loop. The second method has the benefit that the signal handler is called almost immediately.

Ah, I understand your question now. When we receive the signals now, don't we defer the delivery of those signals anyways? I lean towards simplifying the code if we can, since signals are asynchronous and shouldn't be counted on to be delivered in a timely manner, just in-order.

> > > 2. pselect breaks the win32 port. what is the best way to handle this? implement our own pselect for win32 using select or use #ifdef's ?
> >
> > I believe Vincent is working on a library of win32 work-arounds, maybe he can propose an alternative call for that case. If not, then the pipe method described previously may need to be implemented for win32, or we live with a race condition by emulating pselect like glibc does.
>
> Signals do not exist on windows, so there is no race. The question is only if we could use #ifdefs or make a pselect windows function without the signal stuff using select.

Oh, even better. I would probably just add a check to the configure.in for pselect, and then ifdef to select without the signal mask.
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Fri, Jan 25, 2008 at 07:43:47AM -0300, Gustavo Sverzut Barbieri wrote:
> On Jan 25, 2008 12:40 AM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:
> > On Fri, 25 Jan 2008 00:11:41 -0300 Gustavo Sverzut Barbieri [EMAIL PROTECTED] babbled:
> >
> > > On Jan 24, 2008 11:42 PM, Nathan Ingersoll [EMAIL PROTECTED] wrote:
> > > > On Jan 24, 2008 7:18 PM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:
> > > > > now i think this is a bit more generic a solution - but it adds overhead. so what about the pselect() method? anyone got input on that?
> > > >
> > > > Basically, pselect() is designed for exactly this situation. You block all of the signals you're going to handle during init or some other very early point, then you pass a mask of the signals you're going to unblock to pselect(). At this point, pselect() will atomically unblock the specified signals and call select() with the specified fd's, it also re-instates the original signal blocks after select() returns. Since this sequence is atomic, it prevents the race condition we currently have.
> > > >
> > > > Now the problem, this is a good solution on BSD and Solaris, but unfortunately Linux only fakes support for pselect() (unless this was fixed recently). On Linux pselect() is actually a wrapper exactly around the sequence sigprocmask(), select(), sigprocmask(). So we still end up with a race condition between the first sigprocmask() call and the select() call.
> > >
> > > man page says:
> > >
> > > BUGS: Since version 2.1, glibc has provided an emulation of pselect() that is implemented using sigprocmask(2) and select(). This implementation remains vulnerable to the very race condition that pselect() was designed to prevent. On systems that lack pselect() reliable (and more portable) signal trapping can be achieved using the self-pipe trick (where a signal handler writes a byte to a pipe whose other end is monitored by select() in the main program.)
> > >
> > > however a bit earlier it says Linux has pselect(), and at least 2.6.23 implements it... so maybe this wrapper is just used as a fallback?
> >
> > that is the question - is it implemented kernel-wise widely enough to use it? or do we just stick to the old-fashioned self-pipe trick?
>
> I have not audited the Linux or GlibC code to check if that's the case (well implemented), but I truly believe that we can rely on this call for newer (2.6) kernels.

Thanks for your comments and suggestions. I think I'll go with the pipe solution as I think there are too many unknowns with pselect: will it work if we build against uclibc, newlib or klibc? or build on BSD or Solaris systems etc?

Regards
--
Lars Munch
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Jan 26, 2008 6:13 AM, Lars Munch [EMAIL PROTECTED] wrote:
> On Fri, Jan 25, 2008 at 07:43:47AM -0300, Gustavo Sverzut Barbieri wrote:
> > On Jan 25, 2008 12:40 AM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:
> > > On Fri, 25 Jan 2008 00:11:41 -0300 Gustavo Sverzut Barbieri [EMAIL PROTECTED] babbled:
> > >
> > > > On Jan 24, 2008 11:42 PM, Nathan Ingersoll [EMAIL PROTECTED] wrote:
> > > > > On Jan 24, 2008 7:18 PM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:
> > > > > > now i think this is a bit more generic a solution - but it adds overhead. so what about the pselect() method? anyone got input on that?
> > > > >
> > > > > Basically, pselect() is designed for exactly this situation. You block all of the signals you're going to handle during init or some other very early point, then you pass a mask of the signals you're going to unblock to pselect(). At this point, pselect() will atomically unblock the specified signals and call select() with the specified fd's, it also re-instates the original signal blocks after select() returns. Since this sequence is atomic, it prevents the race condition we currently have.
> > > > >
> > > > > Now the problem, this is a good solution on BSD and Solaris, but unfortunately Linux only fakes support for pselect() (unless this was fixed recently). On Linux pselect() is actually a wrapper exactly around the sequence sigprocmask(), select(), sigprocmask(). So we still end up with a race condition between the first sigprocmask() call and the select() call.
> > > >
> > > > man page says:
> > > >
> > > > BUGS: Since version 2.1, glibc has provided an emulation of pselect() that is implemented using sigprocmask(2) and select(). This implementation remains vulnerable to the very race condition that pselect() was designed to prevent. On systems that lack pselect() reliable (and more portable) signal trapping can be achieved using the self-pipe trick (where a signal handler writes a byte to a pipe whose other end is monitored by select() in the main program.)
> > > >
> > > > however a bit earlier it says Linux has pselect(), and at least 2.6.23 implements it... so maybe this wrapper is just used as a fallback?
> > >
> > > that is the question - is it implemented kernel-wise widely enough to use it? or do we just stick to the old-fashioned self-pipe trick?
> >
> > I have not audited the Linux or GlibC code to check if that's the case (well implemented), but I truly believe that we can rely on this call for newer (2.6) kernels.
>
> Thanks for your comments and suggestions. I think I'll go with the pipe solution as I think there are too many unknowns with pselect: will it work if we build against uclibc, newlib or klibc? or build on BSD or Solaris systems etc?

Solaris was already said to work by Nathan, Linux kernel implements it, so uclibc and newlib or klibc should support it, if not let's bug report and have it done ASAP. BSD I'm not sure, any users out there to reply about this (checking the man page should be enough)?

--
Gustavo Sverzut Barbieri
--
Jabber: [EMAIL PROTECTED]
MSN: [EMAIL PROTECTED]
ICQ#: 17249123
Skype: gsbarbieri
Mobile: +55 (81) 9927 0010
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
I checked the man page for Mac OS X as well. Looks like pselect() comes from FreeBSD in that case.

On Jan 26, 2008 9:09 AM, Gustavo Sverzut Barbieri [EMAIL PROTECTED] wrote:
> On Jan 26, 2008 6:13 AM, Lars Munch [EMAIL PROTECTED] wrote:
> > On Fri, Jan 25, 2008 at 07:43:47AM -0300, Gustavo Sverzut Barbieri wrote:
> > > On Jan 25, 2008 12:40 AM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:
> > > > On Fri, 25 Jan 2008 00:11:41 -0300 Gustavo Sverzut Barbieri [EMAIL PROTECTED] babbled:
> > > >
> > > > > On Jan 24, 2008 11:42 PM, Nathan Ingersoll [EMAIL PROTECTED] wrote:
> > > > > > On Jan 24, 2008 7:18 PM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:
> > > > > > > now i think this is a bit more generic a solution - but it adds overhead. so what about the pselect() method? anyone got input on that?
> > > > > >
> > > > > > Basically, pselect() is designed for exactly this situation. You block all of the signals you're going to handle during init or some other very early point, then you pass a mask of the signals you're going to unblock to pselect(). At this point, pselect() will atomically unblock the specified signals and call select() with the specified fd's, it also re-instates the original signal blocks after select() returns. Since this sequence is atomic, it prevents the race condition we currently have.
> > > > > >
> > > > > > Now the problem, this is a good solution on BSD and Solaris, but unfortunately Linux only fakes support for pselect() (unless this was fixed recently). On Linux pselect() is actually a wrapper exactly around the sequence sigprocmask(), select(), sigprocmask(). So we still end up with a race condition between the first sigprocmask() call and the select() call.
> > > > >
> > > > > man page says:
> > > > >
> > > > > BUGS: Since version 2.1, glibc has provided an emulation of pselect() that is implemented using sigprocmask(2) and select(). This implementation remains vulnerable to the very race condition that pselect() was designed to prevent. On systems that lack pselect() reliable (and more portable) signal trapping can be achieved using the self-pipe trick (where a signal handler writes a byte to a pipe whose other end is monitored by select() in the main program.)
> > > > >
> > > > > however a bit earlier it says Linux has pselect(), and at least 2.6.23 implements it... so maybe this wrapper is just used as a fallback?
> > > >
> > > > that is the question - is it implemented kernel-wise widely enough to use it? or do we just stick to the old-fashioned self-pipe trick?
> > >
> > > I have not audited the Linux or GlibC code to check if that's the case (well implemented), but I truly believe that we can rely on this call for newer (2.6) kernels.
> >
> > Thanks for your comments and suggestions. I think I'll go with the pipe solution as I think there are too many unknowns with pselect: will it work if we build against uclibc, newlib or klibc? or build on BSD or Solaris systems etc?
>
> Solaris was already said to work by Nathan, Linux kernel implements it, so uclibc and newlib or klibc should support it, if not let's bug report and have it done ASAP. BSD I'm not sure, any users out there to reply about this (checking the man page should be enough)?
>
> --
> Gustavo Sverzut Barbieri
> --
> Jabber: [EMAIL PROTECTED]
> MSN: [EMAIL PROTECTED]
> ICQ#: 17249123
> Skype: gsbarbieri
> Mobile: +55 (81) 9927 0010
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Jan 25, 2008 12:40 AM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:
> On Fri, 25 Jan 2008 00:11:41 -0300 Gustavo Sverzut Barbieri [EMAIL PROTECTED] babbled:
>
> > On Jan 24, 2008 11:42 PM, Nathan Ingersoll [EMAIL PROTECTED] wrote:
> > > On Jan 24, 2008 7:18 PM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:
> > > > now i think this is a bit more generic a solution - but it adds overhead. so what about the pselect() method? anyone got input on that?
> > >
> > > Basically, pselect() is designed for exactly this situation. You block all of the signals you're going to handle during init or some other very early point, then you pass a mask of the signals you're going to unblock to pselect(). At this point, pselect() will atomically unblock the specified signals and call select() with the specified fd's, it also re-instates the original signal blocks after select() returns. Since this sequence is atomic, it prevents the race condition we currently have.
> > >
> > > Now the problem, this is a good solution on BSD and Solaris, but unfortunately Linux only fakes support for pselect() (unless this was fixed recently). On Linux pselect() is actually a wrapper exactly around the sequence sigprocmask(), select(), sigprocmask(). So we still end up with a race condition between the first sigprocmask() call and the select() call.
> >
> > man page says:
> >
> > BUGS: Since version 2.1, glibc has provided an emulation of pselect() that is implemented using sigprocmask(2) and select(). This implementation remains vulnerable to the very race condition that pselect() was designed to prevent. On systems that lack pselect() reliable (and more portable) signal trapping can be achieved using the self-pipe trick (where a signal handler writes a byte to a pipe whose other end is monitored by select() in the main program.)
> >
> > however a bit earlier it says Linux has pselect(), and at least 2.6.23 implements it... so maybe this wrapper is just used as a fallback?
>
> that is the question - is it implemented kernel-wise widely enough to use it? or do we just stick to the old-fashioned self-pipe trick?

I have not audited the Linux or GlibC code to check if that's the case (well implemented), but I truly believe that we can rely on this call for newer (2.6) kernels.

--
Gustavo Sverzut Barbieri
--
Jabber: [EMAIL PROTECTED]
MSN: [EMAIL PROTECTED]
ICQ#: 17249123
Skype: gsbarbieri
Mobile: +55 (81) 9927 0010
[E-devel] [PATCH][RFC] signal/select race problem in ecore_main
Hi

While working on suspend/resume on my embedded system I ran into the race problem between select and signals. See this link for a description of the problem:

http://www.xs4all.nl/~evbergen/unix-signals.html

The problem is that the following code in ecore_main_select is not an atomic operation and could end up in select waiting forever even though there is a signal to be served and put in the event queue:

   if (_ecore_signal_count_get()) return -1;
   ret = select(max_fd + 1, rfds, wfds, exfds, t);

My proposed solution (see attached patch) is similar to the one described in the above link, namely to create a pipe to flag a signal arrival to select.

NOTE: the attached patch currently only handles SIGUSR1 and SIGUSR2 and has almost no error checking.

As far as I can tell, the attached patch still has a race around sig_count and sigXXX_count. I thought about adding the signal number to the signal pipe to avoid this race, but that solution could result in pipe buffer overflow, and then signals would get lost.

Before I continue working on this patch, I would really like your comments and suggestions.
Thanks
--
Lars Munch

Index: src/lib/ecore/ecore_signal.c
===
RCS file: /var/cvs/e/e17/libs/ecore/src/lib/ecore/ecore_signal.c,v
retrieving revision 1.35
diff -u -r1.35 ecore_signal.c
--- src/lib/ecore/ecore_signal.c 26 Aug 2007 11:17:21 - 1.35
+++ src/lib/ecore/ecore_signal.c 23 Jan 2008 15:53:28 -
@@ -60,6 +60,9 @@
 static volatile siginfo_t *sigrt_info = NULL;
 #endif
 
+static int pipe_fd[2];
+static Ecore_Fd_Handler *pipe_handler;
+
 void
 _ecore_signal_shutdown(void)
 {
@@ -110,13 +113,52 @@
 #endif
 }
 
+static void
+_ecore_signal_pipe_fd_flag()
+{
+   int count;
+   char f = 1;
+
+   /* Empty signal pipe completely */
+   for(count = 0; read(pipe_fd[0], &f, sizeof(f)) > 0; count++) ;
+
+   /* Put one flag into signal pipe */
+   write(pipe_fd[1], &f, sizeof(f));
+}
+
+static int
+_ecore_signal_pipe_fb_callback(void *data, Ecore_Fd_Handler *fdh)
+{
+   int count;
+   char f;
+
+   /* Empty signal pipe completely */
+   for(count = 0; read(fdh->fd, &f, sizeof(f)) > 0; count++) ;
+
+   if(count)
+     _ecore_signal_call();
+
+   return 1;
+}
+
 void
 _ecore_signal_init(void)
 {
+   int ret;
 #ifdef SIGRTMIN
    int i, num = SIGRTMAX - SIGRTMIN;
 #endif
 
+   ret = pipe(pipe_fd);
+   assert(!ret);
+
+   fcntl(pipe_fd[0], F_SETFL, O_NONBLOCK);
+   fcntl(pipe_fd[1], F_SETFL, O_NONBLOCK);
+
+   pipe_handler = ecore_main_fd_handler_add(pipe_fd[0], ECORE_FD_READ,
+                                            _ecore_signal_pipe_fb_callback,
+                                            NULL, NULL, NULL);
+
    _ecore_signal_callback_set(SIGPIPE, _ecore_signal_callback_ignore);
    _ecore_signal_callback_set(SIGALRM, _ecore_signal_callback_ignore);
    _ecore_signal_callback_set(SIGCHLD, _ecore_signal_callback_sigchld);
@@ -401,6 +443,8 @@
    else
      sigusr1_info.si_signo = 0;
 
+   _ecore_signal_pipe_fd_flag();
+
    sigusr1_count++;
    sig_count++;
 }
@@ -413,6 +457,8 @@
    else
      sigusr2_info.si_signo = 0;
 
+   _ecore_signal_pipe_fd_flag();
+
    sigusr2_count++;
    sig_count++;
 }
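For readers following the thread, here is a minimal, standalone sketch of the self-pipe idea the patch is based on. It is not the ecore code: the names wake_fd, usr1_seen, on_sigusr1, self_pipe_init and wait_for_signal are hypothetical, only SIGUSR1 is handled, and the wait uses plain select() rather than the ecore main loop. The point it illustrates is that a signal arriving before select() is entered still makes select() return at once, because the handler has left a byte in the pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical names: none of these come from the ecore patch. */
static int wake_fd[2] = { -1, -1 };         /* [0] = read end, [1] = write end */
static volatile sig_atomic_t usr1_seen = 0; /* set by the handler */

/* Handler: record the signal, then poke the pipe so select() wakes up.
 * Only async-signal-safe calls are used. */
static void on_sigusr1(int signo)
{
   int saved_errno = errno;
   char flag = 1;

   (void)signo;
   usr1_seen = 1;
   /* Non-blocking write; failure means the pipe is full, i.e. already readable. */
   if (write(wake_fd[1], &flag, sizeof(flag)) < 0) { /* ignored on purpose */ }
   errno = saved_errno;
}

/* Create the pipe, make both ends non-blocking, install the handler. */
static int self_pipe_init(void)
{
   struct sigaction sa;
   int i, fl;

   if (pipe(wake_fd) != 0) return -1;
   for (i = 0; i < 2; i++)
     {
        fl = fcntl(wake_fd[i], F_GETFL);
        if (fl < 0 || fcntl(wake_fd[i], F_SETFL, fl | O_NONBLOCK) < 0) return -1;
     }
   sa.sa_handler = on_sigusr1;
   sigemptyset(&sa.sa_mask);
   sa.sa_flags = 0;
   return sigaction(SIGUSR1, &sa, NULL);
}

/* Wait up to timeout_ms on the pipe. Returns 1 if a signal poked it
 * (and drains it), 0 on timeout, -1 on error. */
static int wait_for_signal(long timeout_ms)
{
   fd_set rfds;
   struct timeval tv;
   char buf[16];
   int ret;

   do
     {
        FD_ZERO(&rfds);
        FD_SET(wake_fd[0], &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        ret = select(wake_fd[0] + 1, &rfds, NULL, NULL, &tv);
     }
   while (ret < 0 && errno == EINTR); /* interrupted: the byte is in the pipe now */

   if (ret <= 0) return ret;
   while (read(wake_fd[0], buf, sizeof(buf)) > 0) ; /* empty the pipe completely */
   return 1;
}
```

Calling raise(SIGUSR1) and only afterwards wait_for_signal() reproduces the problematic ordering from the original report; with the pipe in place the wait returns immediately instead of sleeping for the full timeout.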
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
I think the proper fix is to use sigprocmask() to block the signals and then, rather than using select(), call pselect(), which takes a signal mask and atomically applies that mask for the duration of the select. This should fix the race condition and let signals be processed alongside file descriptors.

On Jan 24, 2008 5:03 AM, Lars Munch [EMAIL PROTECTED] wrote:

> The problem is that the following code in ecore_main_select is not an atomic operation and could end up in select waiting forever even though there is a signal to be served and put in the event queue:
>
>    if (_ecore_signal_count_get()) return -1;
>    ret = select(max_fd + 1, rfds, wfds, exfds, t);
>
> My proposed solution (see attached patch) is something similar to that described in above link, namely to create a pipe to flag a signal arrival to select.
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Thu, 24 Jan 2008 12:03:26 +0100 [EMAIL PROTECTED] (Lars Munch) babbled:

> The problem is that the following code in ecore_main_select is not an atomic operation and could end up in select waiting forever even though there is a signal to be served and put in the event queue:
>
>    if (_ecore_signal_count_get()) return -1;
>    ret = select(max_fd + 1, rfds, wfds, exfds, t);
>
> My proposed solution (see attached patch) is something similar to that described in above link, namely to create a pipe to flag a signal arrival to select.

oh yeah. very good point. and yes - a race condition - a very small one, but there nevertheless. now nathan's solution - sigmask and pselect. that could work. problem is - pselect. how will that affect us? not sure. never touched it.

the pipe/fd solution should work. we don't use it for delivery of signals, ONLY to wake select up. just a dumb fd we write a single byte to in the sig handler. that means select should go in and then instantly come out if there was a signal. we can have another global flag, sig_pipe_write, i.e.:

   sig_pipe_write++;
   if (_ecore_signal_count_get()) return -1;
   ret = select(max_fd + 1, rfds, wfds, exfds, t);
   sig_pipe_write--;

and in the signal handler add:

   ...
   unsigned char dummy_buf = 1;
   if (sig_pipe_write > 0) write(signal_pipe, &dummy_buf, 1);
   ...

this means that whenever we are in select, any signal also does a write to the pipe, and that will cause select to wake up (or to return at once when it is next called). we will get extra select() wakeups tho (and wakeups for just the reason that the pipe buffer has data). of course i am assuming non-blocking writes to this signal pipe fd, so if they fail - that's ok. the fact the buffer has data is enough to wake select up and avoid the long term sleep bug. without the pipe, the signal stuff only gets fixed up and handled the next time select returns - it just means we get a possible delay (until something else wakes select up). using the pipe we force it to wake up.

now i think this is a bit more generic a solution - but it adds overhead. so what about the pselect() method? anyone got input on that?

-- Codito, ergo sum - "I code, therefore I am" -- The Rasterman (Carsten Haitzler) [EMAIL PROTECTED]
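A hedged sketch of the "pipe only as a wake-up" variant described in this message, kept separate from the ecore sources. All names (sig_pipe, sig_pending, on_sigusr2, wakeup_init, wakeup_ready, collect_signals, fill_pipe) are hypothetical. The assumptions are: the count of signals lives in a counter, the pipe carries no information beyond "something happened", writes are non-blocking and allowed to fail, and the counter is read and reset with the signal blocked so the handler cannot run in between. The sig_pipe_write guard from the message is left out to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static int sig_pipe[2] = { -1, -1 };
static volatile sig_atomic_t sig_pending = 0; /* signals seen, not yet handled */

/* Handler: count the signal, then try to leave one byte in the pipe. */
static void on_sigusr2(int signo)
{
   int saved_errno = errno;
   unsigned char dummy_buf = 1;

   (void)signo;
   sig_pending++;
   if (write(sig_pipe[1], &dummy_buf, 1) < 0) { /* pipe full: already readable */ }
   errno = saved_errno;
}

static int wakeup_init(void)
{
   struct sigaction sa;
   int i, fl;

   if (pipe(sig_pipe) != 0) return -1;
   for (i = 0; i < 2; i++)
     {
        fl = fcntl(sig_pipe[i], F_GETFL);
        if (fl < 0 || fcntl(sig_pipe[i], F_SETFL, fl | O_NONBLOCK) < 0) return -1;
     }
   sa.sa_handler = on_sigusr2;
   sigemptyset(&sa.sa_mask);
   sa.sa_flags = 0;
   return sigaction(SIGUSR2, &sa, NULL);
}

/* Zero-timeout poll: nonzero if select() would return at once. */
static int wakeup_ready(void)
{
   fd_set rfds;
   struct timeval tv = { 0, 0 };

   FD_ZERO(&rfds);
   FD_SET(sig_pipe[0], &rfds);
   return select(sig_pipe[0] + 1, &rfds, NULL, NULL, &tv) > 0;
}

/* Drain the pipe and take the counter. SIGUSR2 is blocked meanwhile, so the
 * handler cannot slip in between reading the counter and resetting it. */
static int collect_signals(void)
{
   sigset_t block, old;
   unsigned char buf[256];
   int n;

   sigemptyset(&block);
   sigaddset(&block, SIGUSR2);
   sigprocmask(SIG_BLOCK, &block, &old);
   while (read(sig_pipe[0], buf, sizeof(buf)) > 0) ;
   n = (int)sig_pending;
   sig_pending = 0;
   sigprocmask(SIG_SETMASK, &old, NULL);
   return n;
}

/* Demo helper: stuff the pipe until a write would block. */
static void fill_pipe(void)
{
   static const unsigned char junk[512];

   while (write(sig_pipe[1], junk, sizeof(junk)) > 0) ;
   while (write(sig_pipe[1], junk, 1) > 0) ;
}
```

Because the pipe never carries the signal number, a full pipe loses nothing: the handler's write fails, the counter still goes up, and select() still reports the read end as readable.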
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Jan 24, 2008 7:18 PM, The Rasterman Carsten Haitzler [EMAIL PROTECTED] wrote:

> now i think this is a bit more generic a solution - but it adds overhead. so what about the pselect() method? anyone got input on that?

Basically, pselect() is designed for exactly this situation. You block all of the signals you're going to handle during init or at some other very early point, then you pass pselect() a signal mask in which those signals are unblocked. pselect() then atomically installs that mask and waits on the specified fds as select() would; it also reinstates the original signal mask when it returns. Since this sequence is atomic, it prevents the race condition we currently have.

Now the problem: this is a good solution on BSD and Solaris, but unfortunately Linux only fakes support for pselect() (unless this was fixed recently). On Linux, pselect() is actually a wrapper around exactly the sequence sigprocmask(), select(), sigprocmask(). So we still end up with a race condition between the first sigprocmask() call and the select() call.
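The usage pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the patch that was later posted: got_usr1, on_usr1, signals_init and wait_once are hypothetical names, no file descriptors are watched, and the platform is assumed to provide a pselect() that really is atomic.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int signo)
{
   (void)signo;
   got_usr1 = 1;
}

/* Install the handler and block SIGUSR1 for normal execution.
 * *wait_mask receives the mask to use while waiting: the previous
 * mask with SIGUSR1 unblocked. */
static int signals_init(sigset_t *wait_mask)
{
   struct sigaction sa;
   sigset_t block;

   sa.sa_handler = on_usr1;
   sigemptyset(&sa.sa_mask);
   sa.sa_flags = 0;
   if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

   sigemptyset(&block);
   sigaddset(&block, SIGUSR1);
   if (sigprocmask(SIG_BLOCK, &block, wait_mask) != 0) return -1;
   sigdelset(wait_mask, SIGUSR1);
   return 0;
}

/* Wait up to timeout_ms. SIGUSR1 can only be delivered inside pselect().
 * Returns 1 if a signal interrupted the wait, 0 on timeout, -1 on error. */
static int wait_once(const sigset_t *wait_mask, long timeout_ms)
{
   struct timespec ts;
   int ret;

   ts.tv_sec = timeout_ms / 1000;
   ts.tv_nsec = (timeout_ms % 1000) * 1000000L;
   ret = pselect(0, NULL, NULL, NULL, &ts, wait_mask);
   if (ret < 0 && errno == EINTR) return 1;
   return ret;
}
```

A signal raised while blocked stays pending, so the check-then-wait window from the original report cannot open: the handler runs only once pselect() has swapped the mask, and the call then returns with EINTR rather than sleeping.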
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Jan 24, 2008 11:42 PM, Nathan Ingersoll [EMAIL PROTECTED] wrote:

> Now the problem, this is a good solution on BSD and Solaris, but unfortunately Linux only fakes support for pselect() (unless this was fixed recently). On Linux pselect() is actually a wrapper exactly around the sequence sigprocmask(), select(), sigprocmask(). So we still end up with a race condition between the first sigprocmask() call and the select() call.

man page says:

   BUGS: Since version 2.1, glibc has provided an emulation of pselect() that is implemented using sigprocmask(2) and select(). This implementation remains vulnerable to the very race condition that pselect() was designed to prevent. On systems that lack pselect(), reliable (and more portable) signal trapping can be achieved using the self-pipe trick (where a signal handler writes a byte to a pipe whose other end is monitored by select() in the main program).

however a bit earlier it says Linux has pselect(), and at least 2.6.23 implements it... so maybe this wrapper is just used as a fallback?

-- Gustavo Sverzut Barbieri
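To make the BUGS paragraph concrete, here is a sketch of what a userspace emulation along the lines the man page describes might look like, and where the window sits. emulated_pselect, demo_init, emulated_wait and handler_ran are hypothetical names and this is not glibc's actual source; it only follows the sigprocmask(), select(), sigprocmask() sequence quoted in the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t handler_ran = 0;

static void on_usr1(int signo)
{
   (void)signo;
   handler_ran = 1;
}

/* Non-atomic stand-in for pselect(): swap mask, select(), restore mask. */
static int emulated_pselect(int nfds, fd_set *r, fd_set *w, fd_set *e,
                            const struct timespec *ts, const sigset_t *mask)
{
   struct timeval tv, *tvp = NULL;
   sigset_t old;
   int ret, saved_errno;

   if (ts)
     {
        tv.tv_sec = ts->tv_sec;
        tv.tv_usec = ts->tv_nsec / 1000;
        tvp = &tv;
     }
   sigprocmask(SIG_SETMASK, mask, &old);
   /* Race window: a pending signal is delivered right here, before
    * select() has started, so select() is not interrupted by it. */
   ret = select(nfds, r, w, e, tvp);
   saved_errno = errno;
   sigprocmask(SIG_SETMASK, &old, NULL);
   errno = saved_errno;
   return ret;
}

/* Install the handler and keep SIGUSR1 blocked outside the wait. */
static int demo_init(void)
{
   struct sigaction sa;
   sigset_t block;

   sa.sa_handler = on_usr1;
   sigemptyset(&sa.sa_mask);
   sa.sa_flags = 0;
   if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;
   sigemptyset(&block);
   sigaddset(&block, SIGUSR1);
   return sigprocmask(SIG_BLOCK, &block, NULL);
}

/* Wait with SIGUSR1 unblocked, through the emulation. */
static int emulated_wait(long timeout_ms)
{
   struct timespec ts;
   sigset_t mask;

   sigprocmask(SIG_BLOCK, NULL, &mask); /* set == NULL: just fetch the mask */
   sigdelset(&mask, SIGUSR1);
   ts.tv_sec = timeout_ms / 1000;
   ts.tv_nsec = (timeout_ms % 1000) * 1000000L;
   return emulated_pselect(0, NULL, NULL, NULL, &ts, &mask);
}
```

With a signal already pending, emulated_wait() runs the handler and then still sleeps for the whole timeout and reports 0, which is the lost wake-up the thread is worried about; a kernel-backed pselect() would return -1 with EINTR in the same situation. The select(2) man page records that the Linux pselect() system call was added in kernel 2.6.16, which matches the observation that 2.6.23 implements it.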
Re: [E-devel] [PATCH][RFC] signal/select race problem in ecore_main
On Fri, 25 Jan 2008 00:11:41 -0300 Gustavo Sverzut Barbieri [EMAIL PROTECTED] babbled:

> however a bit earlier it says Linux has pselect(), and at least 2.6.23 implements it... so maybe this wrapper is just used as a fallback?

that is the question - is it implemented kernel-wise widely enough to use it? or do we just stick to the old-fashioned self-pipe trick?

-- Codito, ergo sum - "I code, therefore I am" -- The Rasterman (Carsten Haitzler) [EMAIL PROTECTED]