On 9/4/21 4:55 PM, Carsten Haitzler wrote:
On Sat, 4 Sep 2021 11:47:20 +0900 Florian Schaefer <list...@netego.de> said:

Raster,

Thanks for the quick reply and help!

OK, so ibox seems to be the culprit. With the module unloaded I was not
able to crash the system. That's quite interesting: on my personal
machine I have been using ibox forever and never had any issues (just
like your test yesterday). So this seems to be somehow specific to my
new system here.

Anyway, thanks for pointing me in the right direction. With this I now
also finally understood how to identify which one of the many threads
was the segfaulting one. ;-)

Now for the backtrace. As it is quite short I will paste it below:

========================================
(gdb) bt
#0  0x00007f23b417f872 in __libc_pause () at
../sysdeps/unix/sysv/linux/pause.c:29
#1  0x0000564440d159f7 in e_alert_show () at ../src/bin/e_alert.c:43
#2  0x0000564440cda47a in _e_crash () at ../src/bin/e_signals.c:81
#3  0x0000564440cda4a9 in e_sigseg_act (x=<optimized out>,
info=<optimized out>, data=<optimized out>) at ../src/bin/e_signals.c:91
#4  0x00007f23b4180140 in <signal handler called> () at
/lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at
../src/modules/ibox/e_mod_main.c:636
#6  0x00007f23a57df330 in _ibox_cb_icon_fill_timer (data=<optimized
out>) at ../src/modules/ibox/e_mod_main.c:526
#7  0x00007f23b4c25581 in _ecore_call_task_cb (data=<optimized out>,
func=<optimized out>) at ../src/lib/ecore/ecore_private.h:456
#8  _ecore_timer_legacy_tick (data=0x564441cbf230, event=0x7ffd43c61150)
at ../src/lib/ecore/ecore_timer.c:172
#9  0x00007f23b3b1c130 in _event_callback_call (obj_id=0x400000379067,
pd=0x5644412371e0, desc=0x7f23b4c521e0
<_EFL_LOOP_TIMER_EVENT_TIMER_TICK>, event_info=<optimized out>,
legacy_compare=legacy_compare@entry=0 '\000') at
../src/lib/eo/eo_base_class.c:2114
#10 0x00007f23b3b1c3ec in _efl_object_event_callback_call
(obj_id=<optimized out>, pd=<optimized out>, desc=<optimized out>,
event_info=<optimized out>) at ../src/lib/eo/eo_base_class.c:2186
#11 0x00007f23b3b16620 in efl_event_callback_call (obj=<optimized out>,
desc=desc@entry=0x7f23b4c521e0 <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>,
event_info=event_info@entry=0x0) at ../src/lib/eo/eo_base_class.c:2189
#12 0x00007f23b4c26e15 in _efl_loop_timer_expired_call
(obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
when=when@entry=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:669
#13 0x00007f23b4c26f43 in _efl_loop_timer_expired_timers_call
(obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
when=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:621
#14 0x00007f23b4bf2fae in _ecore_main_loop_iterate_internal
(obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
once_only=once_only@entry=0) at ../src/lib/ecore/ecore_main.c:2431
#15 0x00007f23b4bf383f in _ecore_main_loop_begin
(obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460) at
../src/lib/ecore/ecore_main.c:1231
#16 0x00007f23b4bf7e6d in _efl_loop_begin (obj=0x40000000012d,
pd=0x5644411fd460) at ../src/lib/ecore/efl_loop.c:57
#17 0x00007f23b4bf7233 in efl_loop_begin (obj=0x40000000012d) at
src/lib/ecore/efl_loop.eo.c:28
#18 0x00007f23b4bf390c in ecore_main_loop_begin () at
../src/lib/ecore/ecore_main.c:1316
#19 0x0000564440cb8c50 in main (argc=<optimized out>, argv=<optimized
out>) at ../src/bin/e_main.c:1121

(gdb) fr 5
#5  0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at
../src/modules/ibox/e_mod_main.c:636
636        if ((ic->ibox->inst->ci->show_preview) &&
(edje_object_part_exists(ic->o_holder, "e.swallow.preview")))

(gdb) list
631     }
632
633     static void
634     _ibox_icon_fill(IBox_Icon *ic)
635     {
636        if ((ic->ibox->inst->ci->show_preview) &&
(edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
637          _ibox_icon_fill_preview(ic, EINA_FALSE);
638        else
639          _ibox_icon_fill_icon(ic);
640

(gdb) print ic
$1 = (IBox_Icon *) 0x5644419a2910

(gdb) print *ic
$2 = {ibox = 0x564441cc3fe0, o_holder = 0x0, o_icon = 0x0, o_holder2 =
0x0, o_icon2 = 0x0, client = 0x0, drag = {start = 0 '\000', dnd = 0
'\000', x = 0, y = 0, dx = 0, dy = 128}}

(gdb) print *(ic->ibox)
$3 = {inst = 0x40, o_box = 0xe1, o_drop = 0x564441a499b0, o_drop_over =
0x7f23b4165cb0 <main_arena+304>, o_empty = 0x7474756200726162,
ic_drop_before = 0x81646c3698761235, drop_before = 1103904792, icons =
0x0, zone = 0x698761254, dnd_x = 0, dnd_y = 1769170290}

(gdb) print *(ic->ibox->inst)
Cannot access memory at address 0x40
========================================

So somehow we've got some garbage pointer in ic->ibox->inst.

actually.. ic->ibox is junk. it happens to point to some memory we can access
but it's full of ... garbage. like dnd_y is an unrealistic coord. zone does
not look like a proper pointer (o_drop does) and o_box is nothing like what a
pointer should look like. drop_before seems junky too. so ... what happened to
ic->ibox? or ... for that matter what happened to ic? maybe ic has been freed
and now the ibox ptr has been overwritten to point to some junk as i can't
imagine the ibox struct being freed as that struct is still there for the ibox
gadget. so ...

Ah I see. It certainly makes debugging easier if you know what a pointer is supposed to look like. :-)

well turning on ASAN (search enlightenment.org for asan and how to enable it)
in efl and e would probably instantly point out the problem. you can try that
as an exercise in being able to divine better debug info from efl + e. it's
pretty easy now with meson.... :) unlike valgrind it's not prohibitively slow
either. it's usable day to day on a fast enough machine.

Interesting. Thanks for the pointer to new debugging tools. (And yes, valgrind is really slow.) I found the documentation you mentioned. I think I will give it a try before applying your patch, just to see what happens and to be able to play around with it for a bit.

and i can see the problem:

    ecore_timer_add(0.1, _ibox_cb_icon_fill_timer, ic);

a timer is created to fill the icon in 0.1 sec... but ... imagine the icon (ic)
has been freed/deleted BEFORE the timer fires... in 0.1sec from now. ...
someone added a timer without remembering to delete it when the icon the timer
is for is deleted! a bit sloppy...

Bad boy. ;-)

This means that on my old laptop I never ran into any issues because it is just too slow for this race condition to occur?

d12acf0d01e628d71548adbb77670c7e40aef043 commit in git now fixes that. problem
is in e ... not efl :)

Great. Thanks! As said before, I will try to tackle this with ASAN first for training and then see how your solution is holding up. That will hopefully be tomorrow.

Now to the second point of my first mail from yesterday: Is there any way for me to disable/silence the error popup on startup that no ACPI daemon is running?

Cheers,
Florian

I tried to poke into the preceding frames (#6 and #7) but only hit
optimized out variables. This is efl territory, right? This morning I
recompiled enlightenment with "-O0 -g" but I guess I should also have
done the same to efl. Well, I can do this the next time I'm in office if
helpful.

Any ideas?

For now I gave ibar a try. Not exactly a replacement for me. I don't
need a launcher (using everything and favorites menu instead) or a
tracker of running windows (I know what windows I have open). I only
need something to show my minimized windows so that I can open them
again (I know, they appear with Alt+Tab...) and this seems to be the
only scenario that cannot be reproduced by ibar. -- I guess I never
bought into the MacOS style launcher bar. ;-)

ibar will show both running and minimized icons for windows .. but ok - yeah -
it doesnt "show only minimized"... :)

Cheers
Florian

On 9/4/21 1:25 AM, Carsten Haitzler wrote:
On Fri, 3 Sep 2021 21:04:35 +0900 Florian Schaefer <list...@netego.de> said:

quick - if you unload the ibox module ... does the problem stop? that crash
is inside ibox code - memory it's accessing is bad/wrong - why i don't
know. no more information. line 363 in ibox is:

     if ((ic->ibox->inst->ci->show_preview) &&
(edje_object_part_exists(ic->o_holder, "e.swallow.preview")))

so what is ic? what is ic->ibox, ic->ibox->inst, ic->ibox->inst->ci ?

if you attach gdb when e crashes and dump these values - i'd know more.
maybe. I actually stopped using ibox a while ago since ibar does both
effectively these days. perhaps it is an ibox bug and i havent seen it as i
dont use it. so try the above, if it goes away - attach gdb

i can say that i dont see the problem here with ibox enabled and on amd + e
(git).

Dear everyone,

so I got a new desktop PC at work and the first thing I did, of course,
was to install Debian sid and enlightenment-git. ;-)

The machine has a Nvidia T600 card and this is where troubles probably
begin. As I kind of need the graphics performance for CAD I went with
the drivers from Nvidia (the stock open source drivers were terribly slow).

Now what happens is that enlightenment crashes often. Like kind of
constantly. I got the impression it happens mostly when several windows
are going through their appearance fade-in transition at the same time.
Then the "red screen of death" appears and I need to press F1 to
continue. With some applications this happens always (Eagle anyone?)
with others only sometimes. After the forced restart many windows (e.g.
terminology always, firefox sometimes) need to be minimized and
uncovered again for their content to display again. Some dialog windows
won't even show their content from the beginning and instead just some
different portion of the screen. Needless to say that for a machine at
work this is not an optimal situation.

The most pressing issue is of course the crashes. I recompiled
everything with debugging symbols and optimization disabled (or at least
I thought so, some things seem still to be optimized away) to get some
meaningful dumps. One of which I uploaded to pastebin
(https://pastebin.com/YWSarC10) hoping that it makes sense to someone.

I am sure that it is not E that is "at fault" but Nvidia, but for now I
need to find a way around this so that I can work without having to
reset everything every five minutes. Any ideas?

Oh, I also tried disabling OpenGL in the compositor settings and
choosing the software option. And it still crashes!

For starters I was hoping that I could just switch off all the window
transition-fading eye-candy but I did not understand whether this is
possible. Is it?

Finally, being a desktop system (my first in like 10 years or so) it
does not run an acpi daemon. I don't really see any reason to do so.
Therefore E also complains on every startup that no acpi daemon can be
found. I did not find any compile time or runtime options to disable
acpi. Is there a way to silence this error/warning?

Cheers,
Florian


_______________________________________________
enlightenment-users mailing list
enlightenment-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-users




