Re: [E-devel] Ecore bug due EO integration

Gustavo Sverzut Barbieri Thu, 01 Nov 2012 20:57:16 -0700

On Thu, Nov 1, 2012 at 10:33 PM, Carsten Haitzler <ras...@rasterman.com>wrote:


> On Thu, 1 Nov 2012 18:53:32 -0200 Gustavo Sverzut Barbieri
> <barbi...@profusion.mobi> said:
>
> > Hi all,
> >
> > As I always complain, the problems with those "easy" solutions such as EO
> > macro and types hell is when you have incorrect usage. While the correct
> > path is all nice and simple, the incorrect is tricky to solve.
> >
> > Consider the following INCORRECT code:
> >
> > #include <Ecore.h>
> >
> > static Ecore_Animator *anim;
> >
> > static Eina_Bool _cb_anim(void *data)
> > {
> >     printf("_cb_anim\n");
> >     ecore_timer_del(anim); // note typo: TIMER not ANIMATOR
> >     ecore_main_loop_quit();
> >     return EINA_FALSE;
> > }
> >
> > int main(void)
> > {
> >     ecore_init();
> >
> >     anim = ecore_animator_add(_cb_anim, NULL);
> >     ecore_main_loop_begin();
> >     ecore_shutdown();
> >     return 0;
> > }
> >
> > It is wrong yet no compiler warning, not even with -Wall -Wextra. This by
> > itself is a bug and step back in development and debug help.
>
> actually as of eo this is just fine. it's not an error. :) it used to be
> one,
> now it's safe. or SHOULD be as ecore_animator_del and ecore_timer_del are
> just
> swappers around eo_del() and that's it. we're just deleting an object. all
> objects have destructors and this just calls it. calling parent class
> methods
> on a derived class object should be safe. the problem here is that they
> return
> something from the child - the data ptr from the timer (or animator) on
> del.
> that's how the old api worked. thus there's some actual type specific
> implementations here. the now wrapper func should be doing some tyype
> checking
> here at ecore_timer_del() time in this case.
>

yes, that would help (ecore_timer_* check for timer instances).



>
> > Then you go and execute it, it won't work:
> >
> > $ gcc -o t-eo t-eo.c `pkg-config --cflags --libs ecore` -Wall -Wextra
> > $ ./t-eo
> > _cb_anim
> > ERR<28351>:ecore ecore_timer.c:752 _ecore_timer_cleanup() 1 timers to
> > delete, but they were not found!Stats: todo=2, done=1, pending=1,
> in_use=0.
> > reset counter
> >
> > Let's gdb it:
> >
> > $ EINA_LOG_ABORT=1 EINA_LOG_ABORT_LEVEL=1 gdb ./t-eo
> > [...]
> > _cb_anim
> > ERR<28371>:ecore ecore_timer.c:752 _ecore_timer_cleanup() 1 timers to
> > delete, but they were not found!Stats: todo=2, done=1, pending=1,
> in_use=0.
> > reset counter.
> >
> > Program received signal SIGABRT, Aborted.
> > 0xb7fdd424 in __kernel_vsyscall ()
> > (gdb) bt
> > #0  0xb7fdd424 in __kernel_vsyscall ()
> > #1  0xb7d9c5ef in raise () from /usr/lib/libc.so.6
> > #2  0xb7d9ded3 in abort () from /usr/lib/libc.so.6
> > #3  0xb7f58349 in eina_log_print () from /usr/lib/libeina.so.1
> > #4  0xb7fabc2e in ?? () from /usr/lib/libecore.so.1
> > #5  0xb7fa8152 in ?? () from /usr/lib/libecore.so.1
> > #6  0xb7fa89a7 in ecore_main_loop_begin () from /usr/lib/libecore.so.1
> > #7  0x08048703 in main ()
> >
> > Not only we lost the compiler warning saying that the developer did shit,
> > it keeps the shit rolling... the error will come from the main loop
> > execution, after the call already returned.
> >
> > As now everything becomes one single Eo type, I'm unsure how to
> technically
> > make it raise the warning... but it should be there.
> >
> > For sure the timer/animator problem can be fixed internally (seems they
> are
> > just calling the function destructor), but what other error patterns
> we'll
> > fail so badly at?
> >
> > For instance I'm always remembering people about the following, that is a
> > STUPID development error, yet people will do it and will never get the
> bug:
> >
> >     Ecore_Timer *t = ecore_timer_add(1, _cb_anim, NULL);
> >     eo_do(t, ECORE_TIMER_ID(2), "hi there, what a bug?");
> >
> > The ECORE_TIMER_ID(2) is just to force the bug. It will silently compile
> > and run with the following error:
> >
> > ERR<28431>:eo eo.c:405 _eo_dov_internal() Can't find func for op bfffbaa4
> > ((null):(null)) for class 'ecore_timer'. Aborting.
> >
> > Now try to figure out with valgrind or gdb :-D
>
> set EINA_LOG_ABORT :) but let me giv an equally stupid example:
>

that's the thing, the log abort will just say inside _eo_dov_internal where
it's broken, which is pretty useless to debug! If you do:

eo_do(obj, x, y, z, ...);

you can't easily trace if it was about to execute x, y or z. You need to
somehow go back in time and figure out "ah, it did x and y, must be doing z
now".

if the bug is INSIDE x, then it may contain the internal implementation in
the stack (_x_internal_do() or similar). But if it's another mistake,
you're left with a stack you have no fscking idea how to read.

if you ever programmed in PostScript or other stack language you know how
badly it's to debug a very minor mistake.



>
> Ecore_Timer *t = ecore_timer_add(1, _cb_anim, NULL);
> double v = 10.0;
> ecore_timer_interval_set(t + 1, v);
>
> valgrind will catch that. gdb - probably not due to memory layout. valgrind
> wont catch it if we use mempools for timer objects. :) and in the case of
> catching via magic number u have the samr problem - set the abort env var.


with a single variable you disable the mempool and get straight malloc to
enable valgrind.

-- 
Gustavo Sverzut Barbieri
http://profusion.mobi embedded systems
--------------------------------------
MSN: barbi...@gmail.com
Skype: gsbarbieri
Mobile: +55 (19) 9225-2202
------------------------------------------------------------------------------
LogMeIn Central: Instant, anywhere, Remote PC access and management.
Stay in control, update software, and manage PCs from one command center
Diagnose problems and improve visibility into emerging IT issues
Automate, monitor and manage. Do more in less time with Central
http://p.sf.net/sfu/logmein12331_d2d
_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel

Re: [E-devel] Ecore bug due EO integration

Reply via email to