Summoned

On Sun, 5 Sep 2021, 14:12 Carsten Haitzler, <ras...@rasterman.com> wrote:
> On Sun, 5 Sep 2021 11:25:35 +0900 Florian Schaefer <list...@netego.de> said:
>
> > OK, the ibox patch seemed to resolve this issue.
> > Thank you very much! :-)
> >
> > But. As you proposed I started to play with ASAN ... and opened quite a
> > can of worms apparently. E is now rather constantly crashing. I guess
> > this is because of the "abort_on_error=1" setting of ASAN and it's,
> > well, finding many memory leaks. So I hope we can squash them one by one.
>
> export ASAN_OPTIONS="detect_odr_violation=0:detect_leaks=0:abort_on_error=1:new_delete_type_mismatch=0"
>
> :) it will only barf on real memory errors - not smaller things that don't
> cause crashes. for leaks i'm more interested in using massif for that, but
> they won't cause crashes so those are "worry about another day" if anything.
>
> > First I want to say that I needed to add "log_path=asan.log" to the
> > ASAN_OPTIONS variable in order to have the asan output actually written
> > somewhere, so I would propose to add this information to the
> > enlightenment homepage. Most users nowadays probably don't start E from a
> > terminal where any stdout would be visible.
>
> actually i just redirect ALL stdout/err from e to ~/.xsession-errors so that
> handles it anyway :) you won't need to do the above special asan log if
> you're doing that and i'd generally say it's a smart move. if you don't you
> can also check your journald logs from systemd etc.
>
> > So I tried to capture one of the crashes as best as I could with both
> > gdb and asan. This one seemed to be in the procstats module. The result
> > is here: https://pastebin.com/M6V2QTwd
>
> ooh procstats... i do not run that, so that probably explains why i don't
> see this...
>
> /me summons a netstar
>
> > Also, now E brings an additional error popup when returning from the
> > lock screen: "Authentication via PAM had errors setting up the
> > authentication session. The error code was 6." This did not happen
> > before the recompiling. So I was suspecting that this is somehow due to
> > ASAN so I tried to remove the ASAN_OPTIONS from the .xsessionrc. But it
> > seems that without this variable E won't even start now. I see the
> > processes in the process list but the screen remains just black.
> > Therefore back to ASAN it is. Also I could not find any related messages
> > in auth.log or similar. Very strange and somewhat unsettling.
>
> aaaah yes. i think error code is changing because asan detects something
> e.g. like a leak on shutdown of the ckpasswd slave binary thus making this
> not work. basically "don't rely on desklock to work right" if using asan.
> kind of a "gotcha".
>
> > Concerning the ACPI daemon. I see, this seems to be a "hard" requirement
> > of E then. Interesting design choice. For me personally running an ACPI
>
> It's a soft requirement. E works without BUT you will be missing events for
> things like: lid open/close, some power/reset buttons being pressed, ac
> adaptor plug/unplug ... e will check if your system has acpi at all - if it
> does it will want events from acpid to handle these. it may be you are lucky
> and don't need these (eg only have a power button - you already get a key
> press for it and no reset button, no lid, no ac adapter/battery), but e will
> basically insist this runs because you have these as possible events. it's a
> trivially small daemon to run and every distro i know of has it, so not much
> to just go do this. i added this because people complained e didn't suspend
> their laptop on lid close and it ended up they didn't follow the
> recommendation of having acpid to handle that. this is there because people
> don't follow docs so now it's pushing it on everyone to avoid things like a
> laptop in your backpack running and overheating and running your entire
> battery empty in a few hours.
>
> > daemon on a desktop system has exactly zero additional benefit. The
> > power button is handled by systemd just fine and I am happy for every
>
> actually it's not. e inhibits systemd handling this - always. no choice.
> when e runs system will ignore this. e is handling it. it can handle it
> either via a x11 power key press event OR an acpi button press. see above. :)
>
> > unnecessary daemon that I can prevent from cluttering my ps output. So,
> > anyway, for now I just commented out the callback to the popup. Works
> > great. ;-)
>
> see above. too many times people don't follow the recommendations, so now
> forcing it on everyone. i have considered adding acpi support to
> enlightenment_system that runs as root, but i haven't done that so until
> then ... you need acpid. :)
>
> > Cheers
> > Florian
> >
> > On 9/5/21 6:27 AM, Carsten Haitzler wrote:
> > > On Sat, 4 Sep 2021 17:52:09 +0900 Florian Schaefer <list...@netego.de> said:
> > >
> > >> On 9/4/21 4:55 PM, Carsten Haitzler wrote:
> > >>> On Sat, 4 Sep 2021 11:47:20 +0900 Florian Schaefer <list...@netego.de>
> > >>> said:
> > >>>
> > >>>> Raster,
> > >>>>
> > >>>> Thanks for the quick reply and help!
> > >>>>
> > >>>> OK, so ibox seems to be the culprit. With the module unloaded I was not
> > >>>> able to crash the system. That's quite interesting, on my personal
> > >>>> machine I am using ibox ever since and never had any issues (just like
> > >>>> your test yesterday). So this seems to be somehow specific to my new
> > >>>> system here.
> > >>>>
> > >>>> Anyway, thanks for pointing me into the right direction. With this I now
> > >>>> also finally understood how to identify which one of the many threads
> > >>>> was the segfaulting one. ;-)
> > >>>>
> > >>>> Now for the backtrace.
> > >>>> As it is quite short I will paste it below
> > >>>>
> > >>>> ========================================
> > >>>> (gdb) bt
> > >>>> #0  0x00007f23b417f872 in __libc_pause () at ../sysdeps/unix/sysv/linux/pause.c:29
> > >>>> #1  0x0000564440d159f7 in e_alert_show () at ../src/bin/e_alert.c:43
> > >>>> #2  0x0000564440cda47a in _e_crash () at ../src/bin/e_signals.c:81
> > >>>> #3  0x0000564440cda4a9 in e_sigseg_act (x=<optimized out>, info=<optimized out>, data=<optimized out>) at ../src/bin/e_signals.c:91
> > >>>> #4  0x00007f23b4180140 in <signal handler called> () at /lib/x86_64-linux-gnu/libpthread.so.0
> > >>>> #5  0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at ../src/modules/ibox/e_mod_main.c:636
> > >>>> #6  0x00007f23a57df330 in _ibox_cb_icon_fill_timer (data=<optimized out>) at ../src/modules/ibox/e_mod_main.c:526
> > >>>> #7  0x00007f23b4c25581 in _ecore_call_task_cb (data=<optimized out>, func=<optimized out>) at ../src/lib/ecore/ecore_private.h:456
> > >>>> #8  _ecore_timer_legacy_tick (data=0x564441cbf230, event=0x7ffd43c61150) at ../src/lib/ecore/ecore_timer.c:172
> > >>>> #9  0x00007f23b3b1c130 in _event_callback_call (obj_id=0x400000379067, pd=0x5644412371e0, desc=0x7f23b4c521e0 <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>, event_info=<optimized out>, legacy_compare=legacy_compare@entry=0 '\000') at ../src/lib/eo/eo_base_class.c:2114
> > >>>> #10 0x00007f23b3b1c3ec in _efl_object_event_callback_call (obj_id=<optimized out>, pd=<optimized out>, desc=<optimized out>, event_info=<optimized out>) at ../src/lib/eo/eo_base_class.c:2186
> > >>>> #11 0x00007f23b3b16620 in efl_event_callback_call (obj=<optimized out>, desc=desc@entry=0x7f23b4c521e0 <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>, event_info=event_info@entry=0x0) at ../src/lib/eo/eo_base_class.c:2189
> > >>>> #12 0x00007f23b4c26e15 in _efl_loop_timer_expired_call (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460, when=when@entry=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:669
> > >>>> #13 0x00007f23b4c26f43 in _efl_loop_timer_expired_timers_call (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460, when=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:621
> > >>>> #14 0x00007f23b4bf2fae in _ecore_main_loop_iterate_internal (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460, once_only=once_only@entry=0) at ../src/lib/ecore/ecore_main.c:2431
> > >>>> #15 0x00007f23b4bf383f in _ecore_main_loop_begin (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460) at ../src/lib/ecore/ecore_main.c:1231
> > >>>> #16 0x00007f23b4bf7e6d in _efl_loop_begin (obj=0x40000000012d, pd=0x5644411fd460) at ../src/lib/ecore/efl_loop.c:57
> > >>>> #17 0x00007f23b4bf7233 in efl_loop_begin (obj=0x40000000012d) at src/lib/ecore/efl_loop.eo.c:28
> > >>>> #18 0x00007f23b4bf390c in ecore_main_loop_begin () at ../src/lib/ecore/ecore_main.c:1316
> > >>>> #19 0x0000564440cb8c50 in main (argc=<optimized out>, argv=<optimized out>) at ../src/bin/e_main.c:1121
> > >>>>
> > >>>> (gdb) fr 5
> > >>>> #5  0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at ../src/modules/ibox/e_mod_main.c:636
> > >>>> 636       if ((ic->ibox->inst->ci->show_preview) && (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > >>>>
> > >>>> (gdb) list
> > >>>> 631     }
> > >>>> 632
> > >>>> 633     static void
> > >>>> 634     _ibox_icon_fill(IBox_Icon *ic)
> > >>>> 635     {
> > >>>> 636       if ((ic->ibox->inst->ci->show_preview) && (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > >>>> 637         _ibox_icon_fill_preview(ic, EINA_FALSE);
> > >>>> 638       else
> > >>>> 639         _ibox_icon_fill_icon(ic);
> > >>>> 640
> > >>>>
> > >>>> (gdb) print ic
> > >>>> $1 = (IBox_Icon *) 0x5644419a2910
> > >>>>
> > >>>> (gdb) print *ic
> > >>>> $2 = {ibox = 0x564441cc3fe0, o_holder = 0x0, o_icon = 0x0, o_holder2 = 0x0, o_icon2 = 0x0, client = 0x0, drag = {start = 0 '\000', dnd = 0 '\000', x = 0, y = 0, dx = 0, dy = 128}}
> > >>>>
> > >>>> (gdb) print *(ic->ibox)
> > >>>> $3 = {inst = 0x40, o_box = 0xe1, o_drop = 0x564441a499b0, o_drop_over = 0x7f23b4165cb0 <main_arena+304>, o_empty = 0x7474756200726162, ic_drop_before = 0x81646c3698761235, drop_before = 1103904792, icons = 0x0, zone = 0x698761254, dnd_x = 0, dnd_y = 1769170290}
> > >>>>
> > >>>> (gdb) print *(ic->ibox->inst)
> > >>>> Cannot access memory at address 0x40
> > >>>> ========================================
> > >>>>
> > >>>> So somehow we've got some garbage pointer in ic->ibox->inst.
> > >>>
> > >>> actually.. ic->ibox is junk. it happens to point to some memory we can
> > >>> access but it's full of ... garbage. like dnd_y is an unrealistic coord.
> > >>> zone does not look like a proper pointer (o_drop does) and o_box is
> > >>> nothing like what a pointer should look like. drop_before seems junky
> > >>> too. so ... what happened to ic->ibox? or ... for that matter what
> > >>> happened to ic? maybe ic has been freed and now the ibox ptr has been
> > >>> overwritten to point to some junk as i can't imagine the ibox struct being
> > >>> freed as that struct is still there for the ibox gadget. so ...
> > >>
> > >> Ah I see. It certainly makes debugging easier if you know what a pointer
> > >> is supposed to look like. :-)
> > >>
> > >>> well turning on ASAN (search enlightenment.org for asan and how to enable
> > >>> it) in efl and e would probably instantly point out the problem. you can
> > >>> try that as an exercise in being able to divine better debug info from
> > >>> efl + e. it's pretty easy now with meson.... :) unlike valgrind it's not
> > >>> prohibitively slow either. it's usable day to day on a fast enough
> > >>> machine.
> > >>
> > >> Interesting.
> > >> Thanks for the pointer to new debugging tools. (And yes,
> > >> valgrind is really slow.) I found the documentation you mentioned. I
> > >> think I will give it a try before applying your patch, just to see what
> > >> happens and to be able to play around with it for a bit.
> > >>
> > >>> and i can see the problem:
> > >>>
> > >>> ecore_timer_add(0.1, _ibox_cb_icon_fill_timer, ic);
> > >>>
> > >>> a timer is created to fill the icon in 0.1 sec... but ... imagine the icon
> > >>> (ic) has been freed/deleted BEFORE the timer fires... in 0.1sec from
> > >>> now. ... someone added a timer without remembering to delete it when the
> > >>> icon the timer is for is deleted! a bit sloppy...
> > >>
> > >> Bad boy. ;-)
> > >>
> > >> This means that on my old laptop I never ran into any issues because it
> > >> is just too slow for this race condition to occur?
> > >>
> > >>> d12acf0d01e628d71548adbb77670c7e40aef043 commit in git now fixes that.
> > >>> problem is in e ... not efl :)
> > >>
> > >> Great. Thanks! As said before, I will try to tackle this with ASAN first
> > >> for training and then see how your solution is holding up. That will
> > >> hopefully be tomorrow.
> > >>
> > >> Now to the second point of my first mail from yesterday: Is there any
> > >> way for me to disable/silence the error popup on startup that no ACPI
> > >> daemon is running?
> > >
> > > oh yes. install acpid and have it run. :)
> > >
> > >> Cheers,
> > >> Florian
> > >>
> > >>>> I tried to poke into the preceding frames (#6 and #7) but only hit
> > >>>> optimized out variables. This is efl territory, right? This morning I
> > >>>> recompiled enlightenment with "-O0 -g" but I guess I should also have
> > >>>> done the same to efl. Well, I can do this the next time I'm in office if
> > >>>> helpful.
> > >>>>
> > >>>> Any ideas?
> > >>>>
> > >>>> For now I gave ibar a try. Not exactly a replacement for me. I don't
> > >>>> need a launcher (using everything and favorites menu instead) or a
> > >>>> tracker of running windows (I know what windows I have open). I only
> > >>>> need something to show my minimized windows so that I can open them
> > >>>> again (I know, they appear with Alt+Tab...) and this seems to be the
> > >>>> only scenario that cannot be reproduced by ibar. -- I guess I never
> > >>>> bought into the MacOS style launcher bar. ;-)
> > >>>
> > >>> ibar will show both running and minimized icons for windows .. but ok -
> > >>> yeah - it doesn't "show only minimized"... :)
> > >>>
> > >>>> Cheers
> > >>>> Florian
> > >>>>
> > >>>> On 9/4/21 1:25 AM, Carsten Haitzler wrote:
> > >>>>> On Fri, 3 Sep 2021 21:04:35 +0900 Florian Schaefer <list...@netego.de>
> > >>>>> said:
> > >>>>>
> > >>>>> quick - if you unload the ibox module ... does the problem stop? that
> > >>>>> crash is inside ibox code - memory it's accessing is bad/wrong - why i
> > >>>>> don't know. not more information. line 363 in ibox is:
> > >>>>>
> > >>>>> if ((ic->ibox->inst->ci->show_preview) &&
> > >>>>> (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > >>>>>
> > >>>>> so what is ic? what is ic->ibox, ic->ibox->inst, ic->ibox->inst->ci ?
> > >>>>>
> > >>>>> if you attach gdb when e crashes and dump these values - i'd know more.
> > >>>>> maybe. I actually stopped using ibox a while ago since ibar does both
> > >>>>> effectively these days. perhaps it is an ibox bug and i haven't seen it
> > >>>>> as i don't use it. so try the above, if it goes away - attach gdb
> > >>>>>
> > >>>>> i can say that i don't see the problem here with ibox enabled and on amd
> > >>>>> + e (git).
> > >>>>>
> > >>>>>> Dear everyone,
> > >>>>>>
> > >>>>>> so I got a new desktop PC at work and the first thing I did, of course,
> > >>>>>> was to install Debian sid and enlightenment-git. ;-)
> > >>>>>>
> > >>>>>> The machine has a Nvidia T600 card and this is where troubles probably
> > >>>>>> begin. As I kind of need the graphics performance for CAD I went with
> > >>>>>> the drivers from Nvidia (the stock open source drivers were terribly
> > >>>>>> slow).
> > >>>>>>
> > >>>>>> Now what happens is that enlightenment crashes often. Like kind of
> > >>>>>> constantly. I got the impression it happens mostly when several windows
> > >>>>>> are going through their appearance fade-in transition at the same time.
> > >>>>>> Then the "red screen of death" appears and I need to press F1 to
> > >>>>>> continue. With some applications this happens always (Eagle anyone?)
> > >>>>>> with others only sometimes. After the forced restart many windows (e.g.
> > >>>>>> terminology always, firefox sometimes) need to be minimized and
> > >>>>>> uncovered again for their content to display again. Some dialog windows
> > >>>>>> won't even show their content from the beginning and instead just some
> > >>>>>> different portion of the screen. Needless to say that for a machine at
> > >>>>>> work this is not an optimal situation.
> > >>>>>>
> > >>>>>> The most pressing issue are of course the crashes. I recompiled
> > >>>>>> everything with debugging symbols and optimization disabled (or at
> > >>>>>> least I thought so, some things seem still to be optimized away) to
> > >>>>>> get some meaningful dumps. One of which I uploaded to pastebin
> > >>>>>> (https://pastebin.com/YWSarC10) hoping that it makes sense to someone.
> > >>>>>>
> > >>>>>> I am sure that it is not E that is "at fault" but Nvidia, but for now I
> > >>>>>> need to find a way around this so that I can work without having to
> > >>>>>> reset everything every five minutes. Any ideas?
> > >>>>>>
> > >>>>>> Oh, I also tried to disable OpenGL in the compositor settings and
> > >>>>>> choosing the software option. And it still crashes!
> > >>>>>>
> > >>>>>> For starters I was hoping that I can just switch off all the window
> > >>>>>> transition-fading eye-candy but I did not understand whether this is
> > >>>>>> possible. Is it?
> > >>>>>>
> > >>>>>> Finally, being a desktop system (my first in like 10 years or so) it
> > >>>>>> does not run an acpi daemon. I don't really see any reason to do so.
> > >>>>>> Therefore E also complains on every startup that no acpi daemon can be
> > >>>>>> found. I did not find any compile time or runtime options to disable
> > >>>>>> acpi. Is there a way to silence this error/warning?
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Florian
> > >>>>>>
> > >>>>>> _______________________________________________
> > >>>>>> enlightenment-users mailing list
> > >>>>>> enlightenment-users@lists.sourceforge.net
> > >>>>>> https://lists.sourceforge.net/lists/listinfo/enlightenment-users
>
> --
> ------------- Codito, ergo sum - "I code, therefore I am" --------------
> Carsten Haitzler - ras...@rasterman.com
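For anyone reading the timer discussion quoted above: here is a minimal, self-contained C sketch of the bug class raster describes (a one-shot timer that outlives the object it points at) and of the usual remedy, which is to remember the timer handle and cancel it when the object is freed. It deliberately does not use EFL, and it is not the code from commit d12acf0d01e628d71548adbb77670c7e40aef043. Fake_Timer, fake_timer_add, fake_timer_del, fake_timers_fire, Icon, icon_new, icon_free and demo_free_before_fire are all hypothetical names made up for illustration; in real E code the corresponding calls would presumably be ecore_timer_add() and ecore_timer_del().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TIMERS 8

/* Hypothetical stand-in for a one-shot timer; NOT the Ecore API. */
typedef struct {
    void (*cb)(void *data);
    void *data;
    int active;
} Fake_Timer;

static Fake_Timer timers[MAX_TIMERS];

/* Register a one-shot callback; returns NULL if the table is full. */
static Fake_Timer *fake_timer_add(void (*cb)(void *data), void *data)
{
    for (int i = 0; i < MAX_TIMERS; i++) {
        if (!timers[i].active) {
            timers[i].cb = cb;
            timers[i].data = data;
            timers[i].active = 1;
            return &timers[i];
        }
    }
    return NULL;
}

/* Cancel a pending timer; passing NULL is a no-op. */
static void fake_timer_del(Fake_Timer *t)
{
    if (t) t->active = 0;
}

/* Fire every pending timer once; returns how many callbacks ran. */
static int fake_timers_fire(void)
{
    int fired = 0;
    for (int i = 0; i < MAX_TIMERS; i++) {
        if (timers[i].active) {
            timers[i].active = 0;
            timers[i].cb(timers[i].data);
            fired++;
        }
    }
    return fired;
}

/* Hypothetical icon that schedules a delayed "fill" on creation. */
typedef struct {
    int filled;
    Fake_Timer *fill_timer; /* kept so icon_free() can cancel it */
} Icon;

static void icon_fill_cb(void *data)
{
    Icon *ic = data;
    ic->fill_timer = NULL; /* one-shot: the handle is dead after firing */
    ic->filled = 1;
}

static Icon *icon_new(void)
{
    Icon *ic = calloc(1, sizeof(Icon));
    if (!ic) return NULL;
    ic->fill_timer = fake_timer_add(icon_fill_cb, ic);
    return ic;
}

static void icon_free(Icon *ic)
{
    if (!ic) return;
    /* Without this cancel, the callback would later touch freed memory. */
    fake_timer_del(ic->fill_timer);
    free(ic);
}

/* Free an icon while its timer is pending; returns callbacks run afterwards. */
static int demo_free_before_fire(void)
{
    Icon *ic = icon_new();
    assert(ic != NULL);
    icon_free(ic);
    return fake_timers_fire();
}
```

The point of the sketch is only the ownership rule: whoever schedules a callback with a pointer to an object should also cancel it in that object's destructor, and clear the stored handle once the one-shot timer has fired so a stale handle is never cancelled later.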