On 9/4/21 4:55 PM, Carsten Haitzler wrote:
On Sat, 4 Sep 2021 11:47:20 +0900 Florian Schaefer <list...@netego.de> said:

Raster,

Thanks for the quick reply and help!

OK, so ibox seems to be the culprit. With the module unloaded I was not able to crash the system. That's quite interesting: on my personal machine I have been using ibox all along and never had any issues (just like your test yesterday). So this seems to be somehow specific to my new system here.

Anyway, thanks for pointing me in the right direction. With this I now also finally understood how to identify which one of the many threads was the segfaulting one. ;-)

Now for the backtrace. As it is quite short I will paste it below.

========================================
(gdb) bt
#0  0x00007f23b417f872 in __libc_pause () at ../sysdeps/unix/sysv/linux/pause.c:29
#1  0x0000564440d159f7 in e_alert_show () at ../src/bin/e_alert.c:43
#2  0x0000564440cda47a in _e_crash () at ../src/bin/e_signals.c:81
#3  0x0000564440cda4a9 in e_sigseg_act (x=<optimized out>, info=<optimized out>, data=<optimized out>) at ../src/bin/e_signals.c:91
#4  0x00007f23b4180140 in <signal handler called> () at /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at ../src/modules/ibox/e_mod_main.c:636
#6  0x00007f23a57df330 in _ibox_cb_icon_fill_timer (data=<optimized out>) at ../src/modules/ibox/e_mod_main.c:526
#7  0x00007f23b4c25581 in _ecore_call_task_cb (data=<optimized out>, func=<optimized out>) at ../src/lib/ecore/ecore_private.h:456
#8  _ecore_timer_legacy_tick (data=0x564441cbf230, event=0x7ffd43c61150) at ../src/lib/ecore/ecore_timer.c:172
#9  0x00007f23b3b1c130 in _event_callback_call (obj_id=0x400000379067, pd=0x5644412371e0, desc=0x7f23b4c521e0 <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>, event_info=<optimized out>, legacy_compare=legacy_compare@entry=0 '\000') at ../src/lib/eo/eo_base_class.c:2114
#10 0x00007f23b3b1c3ec in _efl_object_event_callback_call (obj_id=<optimized out>, pd=<optimized out>, desc=<optimized out>, event_info=<optimized out>) at ../src/lib/eo/eo_base_class.c:2186
#11 0x00007f23b3b16620 in efl_event_callback_call (obj=<optimized out>, desc=desc@entry=0x7f23b4c521e0 <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>, event_info=event_info@entry=0x0) at ../src/lib/eo/eo_base_class.c:2189
#12 0x00007f23b4c26e15 in _efl_loop_timer_expired_call (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460, when=when@entry=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:669
#13 0x00007f23b4c26f43 in _efl_loop_timer_expired_timers_call (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460, when=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:621
#14 0x00007f23b4bf2fae in _ecore_main_loop_iterate_internal (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460, once_only=once_only@entry=0) at ../src/lib/ecore/ecore_main.c:2431
#15 0x00007f23b4bf383f in _ecore_main_loop_begin (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460) at ../src/lib/ecore/ecore_main.c:1231
#16 0x00007f23b4bf7e6d in _efl_loop_begin (obj=0x40000000012d, pd=0x5644411fd460) at ../src/lib/ecore/efl_loop.c:57
#17 0x00007f23b4bf7233 in efl_loop_begin (obj=0x40000000012d) at src/lib/ecore/efl_loop.eo.c:28
#18 0x00007f23b4bf390c in ecore_main_loop_begin () at ../src/lib/ecore/ecore_main.c:1316
#19 0x0000564440cb8c50 in main (argc=<optimized out>, argv=<optimized out>) at ../src/bin/e_main.c:1121
(gdb) fr 5
#5  0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at ../src/modules/ibox/e_mod_main.c:636
636        if ((ic->ibox->inst->ci->show_preview) && (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
(gdb) list
631     }
632
633     static void
634     _ibox_icon_fill(IBox_Icon *ic)
635     {
636        if ((ic->ibox->inst->ci->show_preview) && (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
637          _ibox_icon_fill_preview(ic, EINA_FALSE);
638        else
639          _ibox_icon_fill_icon(ic);
640
(gdb) print ic
$1 = (IBox_Icon *) 0x5644419a2910
(gdb) print *ic
$2 = {ibox = 0x564441cc3fe0, o_holder = 0x0, o_icon = 0x0, o_holder2 = 0x0, o_icon2 = 0x0, client = 0x0, drag = {start = 0 '\000', dnd = 0 '\000', x = 0, y = 0, dx = 0, dy = 128}}
(gdb) print *(ic->ibox)
$3 = {inst = 0x40, o_box = 0xe1, o_drop = 0x564441a499b0, o_drop_over = 0x7f23b4165cb0 <main_arena+304>, o_empty = 0x7474756200726162, ic_drop_before = 0x81646c3698761235, drop_before = 1103904792, icons = 0x0, zone = 0x698761254, dnd_x = 0, dnd_y = 1769170290}
(gdb) print *(ic->ibox->inst)
Cannot access memory at address 0x40
========================================

So somehow we've got some garbage pointer in ic->ibox->inst.

actually.. ic->ibox is junk. it happens to point to some memory we can access but it's full of ... garbage. like dnd_y is an unrealistic coord. zone does not look like a proper pointer (o_drop does) and o_box is nothing like what a pointer should look like. drop_before seems junky too. so ... what happened to ic->ibox? or ... for that matter what happened to ic? maybe ic has been freed and now the ibox ptr has been overwritten to point to some junk as i cant imagine the ibox struct being freed as that struct is still there for the ibox gadget. so ...
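One quick way to back up the "this is junk" reading of the dump above is to look at the suspicious fields as raw bytes. The following is only a sketch in plain C, not anything from e or efl: `word_to_ascii` is a hypothetical helper, and it assumes a little-endian machine (the backtrace paths show x86_64), so the least significant byte comes first in memory.

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical helper (not part of e or efl): write the 8 bytes of a
 * 64-bit value into out[] in little-endian memory order, i.e. least
 * significant byte first. Non-printable bytes are shown as '.'. */
static void
word_to_ascii(uint64_t value, char out[9])
{
   for (int i = 0; i < 8; i++)
     {
        unsigned char c = (unsigned char)((value >> (8 * i)) & 0xff);

        out[i] = isprint(c) ? (char)c : '.';
     }
   out[8] = '\0';
}
```

Fed the values from the gdb session, `o_empty = 0x7474756200726162` comes out as "bar.butt" and `dnd_y = 1769170290` as "resi....". Fields that should hold an object pointer and a coordinate contain what looks like string data, which fits the theory that the memory was freed and reused for something else.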
Ah I see. It certainly makes debugging easier if you know what a pointer is supposed to look like. :-)
well turning on ASAN (search enlightenment.org for asan and how to enable it) in efl and e would probably instantly point out the problem. you can try that as an exercise in being able to divine better debug info from efl + e. it's pretty easy now with meson.... :) unlike valgrind it's not prohibitively slow either. it's usable day to day on a fast enough machine.
Interesting. Thanks for the pointer to new debugging tools. (And yes, valgrind is really slow.) I found the documentation you mentioned. I think I will give it a try before applying your patch, just to see what happens and to be able to play around with it for a bit.
and i can see the problem:

ecore_timer_add(0.1, _ibox_cb_icon_fill_timer, ic);

a timer is created to fill the icon in 0.1 sec... but ... imagine the icon (ic) has been freed/deleted BEFORE the timer fires... in 0.1sec from now. ... someone added a timer without remembering to delete it when the icon the timer is for is deleted! a bit sloppy...
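The usual cure for this kind of bug is to keep the timer handle in the object and cancel the timer when the object goes away. The sketch below shows that pattern in plain C with a toy timer queue. Everything prefixed `toy_` / `Toy_` is made up for illustration, it is not the efl API and not necessarily what the actual fix commit does. In real efl code the cancel step would be a call to `ecore_timer_del()` on the stored handle.

```c
#include <stdlib.h>

#define MAX_TIMERS 8

/* Toy one-shot timer, a stand-in for a real main loop timer. */
typedef struct Toy_Timer
{
   void (*cb)(void *data);
   void  *data;
} Toy_Timer;

static Toy_Timer *pending[MAX_TIMERS];

static Toy_Timer *
toy_timer_add(void (*cb)(void *data), void *data)
{
   for (int i = 0; i < MAX_TIMERS; i++)
     {
        if (pending[i]) continue;
        Toy_Timer *t = malloc(sizeof(Toy_Timer));
        if (!t) return NULL;
        t->cb = cb;
        t->data = data;
        pending[i] = t;
        return t;
     }
   return NULL; /* queue full */
}

static void
toy_timer_del(Toy_Timer *t)
{
   if (!t) return;
   for (int i = 0; i < MAX_TIMERS; i++)
     {
        if (pending[i] != t) continue;
        pending[i] = NULL;
        free(t);
        return;
     }
}

/* Fire every pending timer once; returns how many callbacks ran. */
static int
toy_timers_fire(void)
{
   int ran = 0;

   for (int i = 0; i < MAX_TIMERS; i++)
     {
        Toy_Timer *t = pending[i];
        if (!t) continue;
        pending[i] = NULL;
        t->cb(t->data);
        free(t);
        ran++;
     }
   return ran;
}

typedef struct Toy_Icon
{
   Toy_Timer *fill_timer; /* remembered so it can be cancelled */
   int        filled;
} Toy_Icon;

static void
toy_icon_fill_cb(void *data)
{
   Toy_Icon *ic = data;

   ic->fill_timer = NULL; /* one-shot timer is gone after this call */
   ic->filled = 1;
}

static Toy_Icon *
toy_icon_new(void)
{
   Toy_Icon *ic = calloc(1, sizeof(Toy_Icon));
   if (!ic) return NULL;
   ic->fill_timer = toy_timer_add(toy_icon_fill_cb, ic);
   return ic;
}

static void
toy_icon_free(Toy_Icon *ic)
{
   if (!ic) return;
   /* The step the buggy code skipped: cancel the pending timer so
    * its callback can never run with a dangling pointer. */
   toy_timer_del(ic->fill_timer);
   free(ic);
}
```

With this in place, creating an icon and freeing it before `toy_timers_fire()` runs means no callback is called for it. Without the `toy_timer_del()` call in `toy_icon_free()`, the callback would write into freed memory, which is the same shape as the ibox crash.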
Bad boy. ;-) This means that on my old laptop I never ran into any issues because it is just too slow for this race condition to occur?
d12acf0d01e628d71548adbb77670c7e40aef043 commit in git now fixes that. problem is in e ... not efl :)
Great. Thanks! As said before, I will try to tackle this with ASAN first for training and then see how your solution is holding up. That will hopefully be tomorrow.
Now to the second point of my first mail from yesterday: Is there any way for me to disable/silence the error popup on startup that no ACPI daemon is running?
Cheers, Florian
I tried to poke into the preceding frames (#6 and #7) but only hit optimized out variables. This is efl territory, right? This morning I recompiled enlightenment with "-O0 -g" but I guess I should also have done the same to efl. Well, I can do this the next time I'm in the office if helpful.

Any ideas?

For now I gave ibar a try. Not exactly a replacement for me. I don't need a launcher (using everything and favorites menu instead) or a tracker of running windows (I know what windows I have open). I only need something to show my minimized windows so that I can open them again (I know, they appear with Alt+Tab...) and this seems to be the only scenario that cannot be reproduced by ibar. -- I guess I never bought into the MacOS style launcher bar. ;-)

ibar will show both running and minimized icons for windows .. but ok - yeah - it doesnt "show only minimized"... :)

Cheers
Florian

On 9/4/21 1:25 AM, Carsten Haitzler wrote:
On Fri, 3 Sep 2021 21:04:35 +0900 Florian Schaefer <list...@netego.de> said:

quick - if you unload the ibox module ... does the problem stop? that crash is inside ibox code - memory it's accessing is bad/wrong - why i don't know. not more information. line 363 in ibox is:

if ((ic->ibox->inst->ci->show_preview) && (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))

so what is ic? what is ic->ibox, ic->ibox->inst, ic->ibox->inst->ci ? if you attach gdb when e crashes and dump these values - i'd know more. maybe.

I actually stopped using ibox a while ago since ibar does both effectively these days. perhaps it is an ibox bug and i havent seen it as i dont use it. so try the above, if it goes away - attach gdb

i can say that i dont see the problem here with ibox enabled and on amd + e (git).

Dear everyone,

so I got a new desktop PC at work and the first thing I did, of course, was to install Debian sid and enlightenment-git. ;-)

The machine has an Nvidia T600 card and this is where troubles probably begin. As I kind of need the graphics performance for CAD I went with the drivers from Nvidia (the stock open source drivers were terribly slow).

Now what happens is that enlightenment crashes often. Like kind of constantly. I got the impression it happens mostly when several windows are going through their appearance fade-in transition at the same time. Then the "red screen of death" appears and I need to press F1 to continue. With some applications this happens always (Eagle anyone?), with others only sometimes. After the forced restart many windows (e.g. terminology always, firefox sometimes) need to be minimized and uncovered again for their content to display again. Some dialog windows won't even show their content from the beginning and instead just show some different portion of the screen. Needless to say that for a machine at work this is not an optimal situation.

The most pressing issue is of course the crashes. I recompiled everything with debugging symbols and optimization disabled (or at least I thought so, some things still seem to be optimized away) to get some meaningful dumps. One of which I uploaded to pastebin (https://pastebin.com/YWSarC10) hoping that it makes sense to someone.

I am sure that it is not E that is "at fault" but Nvidia, but for now I need to find a way around this so that I can work without having to reset everything every five minutes. Any ideas?

Oh, I also tried to disable OpenGL in the compositor settings and choosing the software option. And it still crashes!

For starters I was hoping that I can just switch off all the window transition-fading eye-candy but I did not understand whether this is possible. Is it?

Finally, being a desktop system (my first in like 10 years or so) it does not run an acpi daemon. I don't really see any reason to do so. Therefore E also complains on every startup that no acpi daemon can be found. I did not find any compile time or runtime options to disable acpi. Is there a way to silence this error/warning?
Cheers,
Florian

_______________________________________________
enlightenment-users mailing list
enlightenment-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-users