On Fri, 10 Dec 2010 14:56:34 +0900
Carsten Haitzler (The Rasterman) <ras...@rasterman.com> wrote:

> On Thu, 09 Dec 2010 13:36:14 +0900 Mike McCormack <mj.mccorm...@samsung.com>
> said:
> 
> > On 12/09/2010 11:57 AM, Mike Blumenkrantz wrote:
> > 
> > >>                _ecore_main_loop_shutdown();
> > >>                _ecore_main_loop_init();
> > >>             }
> > 
> > > Okay, but the only place that _ecore_main_fdh_poll_del can be called from
> > > is ecore_main_fd_handler_del.  If a bad fd gets into epoll and then is
> > > selected on without being removed, it will just loop infinitely without
> > > ever being able to be removed, just like we're seeing in this bug.  Might
> > > be good to add shutdown+init inside bads_rem if no bads are found, yes?
> > 
> > Getting a stale fd in epoll is a bug.  It means somebody closed the fd
> > without removing it from ecore_main_loop, and that should be fixed.
> > 
> > Rather than adding more bandaids, it would be better to fix the root cause
> > of the problem.
> 
> indeed you are right. what would be RIGHT to do here is like the rest of EFL
> and where it has magic number checks. this is similar. when it encounters such
> a bad fd ecore could/should blurt out an ugly warning and if ECORE_ERROR_ABORT
> is set in the env call abort instantly there - thus providing easy debugging.
> it CAN now clean up and march on after a rebuild. it gave the developer a good
> chance to fix things. at least it is robust enough to march on with no side
> effects from that point.
I agree that this would be a nice feature.
> zmike: you may want to change your proposal to be like the above. i'd be more
> comfortable with it that way. it solves the "bad userspace code" problem as
> well as providing good debuggability (mike's point).
Again, the problem here is that making ecore just die when this happens IN NO
WAY HELPS A DEVELOPER FIX THE BUG.  You have no way of knowing which
app/module/etc has caused the behavior to occur (in the case of e17, for
example), and so your only recourse is to search through literally anything
that uses fds with ecore.  This is not the correct way to fix it, as it only
creates more headaches.

If there were some way to make it abort AND log which fd has been improperly
closed, that would be the only way of doing it so that a developer can debug it.
This could be done by making ecore_main_fd_handler_add into a macro which uses
__PRETTY_FUNCTION__ (or similar based on arch/compiler) to supply the actual
function name along with the fd to avoid needing to rewrite any code, but this
would likely introduce other bugs?  Best way I can think of off the top of my
head though.

Either way, this is a bug which MUST be fixed before ecore 1.0 since it can
break everything running on the machine when it occurs due to the insane rate
at which log files grow.
-- 
Mike Blumenkrantz
Zentific: We run the three-legged race individually.

------------------------------------------------------------------------------
_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel

Reply via email to