On Sun, 05 Sep 2021 23:49:49 +0000 Florian Schaefer <list...@netego.de> said:

> "Carsten Haitzler" ras...@rasterman.com – September 5, 2021 10:11 PM
> > On Sun, 5 Sep 2021 11:25:35 +0900 Florian Schaefer <list...@netego.de> said:
> >
> > > OK, the ibox patch seemed to resolve this issue.
> > > Thank you very much! :-)
> > >
> > > But. As you proposed I started to play with ASAN ... and opened quite a
> > > can of worms apparently. E is now rather constantly crashing. I guess
> > > this is because of the "abort_on_error=1" setting of ASAN and it's,
> > > well, finding many memory leaks. So I hope we can squash them one by one.
> >
> > export
> > ASAN_OPTIONS="detect_odr_violation=0:detect_leaks=0:abort_on_error=1:new_delete_type_mismatch=0"
> >
> > :) it will only barf on real memory errors - not smaller things that don't
> > cause crashes. for leaks i'm more interested in using massif for that, but
> > they won't cause crashes so those are "worry about another day" if anything.
> 
> These are actually exactly the options I am using right now. So I guess the
> errors are deemed sufficiently "real" to merit a SIGSEGV. Really, right now
> my system has become a little bit stressful to use. I hope we manage to
> settle this down soon. :-)

then it'll be real errors - i looked at the traces for the e util dialog thing
- i have i believe fixed those now in git. netstar i believe fixed procstats.
but those asan env var options will not report leaks as issues. :)

> > > First I want to say that I needed to add "log_path=asan.log" to the
> > > ASAN_OPTIONS variable in order to have the asan output actually written
> > > somewhere, so I would propose to add this information to the
> > > enlightenment homepage. Most users nowadays probably don't start E from a
> > > terminal where any stdout would be visible.
> >
> > actually i just redirect ALL stdout/err from e to ~/.xsession-errors so that
> > handles it anyway :) you won't need to do the above special asan log if
> > you're doing that and i'd generally say it's a smart move. if you don't
> > you can also check your journald logs from systemd etc.
> 
> Thanks. Interestingly no .xsession-errors is created here even though it is
> at least mentioned in /etc/X11/Xsession. I had a look at my journalctl
> output. Even though I see tons of other errors (actually also a few
> backtraces for eina_btlog) there are no traces of ASAN. And this morning ASAN
> even stopped giving the reports to my logging file. What about the good old
> days where everything was easy to follow? :)
> 
> > > So I tried to capture one of the crashes as best as I could with both
> > > gdb and asan. This one seemed to be in the procstats module. The result
> > > is here: pastebin.com/M6V2QTwd
> >
> > ooh procstats... i do not run that, so that probably explains why i don't
> > see this...
> >
> > /me summons a netstar
> 
> I see that Al has addressed this by now. Thank you very much! I will give it
> a try.
> 
> > > Also, now E brings an additional error popup when returning from the
> > > lock screen: "Authentication via PAM had errors setting up the
> > > authentication session. The error code was 6." This did not happen
> > > before the recompiling. So I was suspecting that this is somehow due to
> > > ASAN so I tried to remove the ASAN_OPTIONS from the .xsessionrc. But it
> > > seems that without this variable E won't even start now. I see the
> > > processes in the process list but the screen remains just black.
> > > Therefore back to ASAN it is. Also I could not find any related messages
> > > in auth.log or similar. Very strange and somewhat unsettling.
> >
> > aaaah yes. i think error code is changing because asan detects something
> > e.g. like a leak on shutdown of the ckpasswd slave binary thus making this
> > not work. basically "don't rely on desklock to work right" if using asan.
> > kind of a "gotcha".
> 
> OK, thank you for clearing this up. Then I am happy that your apparent
> fallbacks for the unlocking at least let me back into my system.

yeah. some in security would call this bad - a sw failure will unlock the
machine. but on the flip side that sw failure can lock you (the user) out, so
i've gone for fixing the last case and at least let you in until the issue is
fixed. in this case it's just until you stop building with asan :)

> > > Concerning the ACPI daemon. I see, this seems to be a "hard" requirement
> > > of E then. Interesting design choice. For me personally running an ACPI
> >
> > It's a soft requirement. E works without BUT you will be missing events for
> > things like: lid open/close, some power/reset buttons being pressed, ac
> > adaptor plug/unplug ... e will check if your system has acpi at all - if it
> > does it will want events from acpid to handle these. it may be you are
> > lucky and don't need these (eg only have a power button - you already
> > get a key press for it and no reset button, no lid, no ac adapter/battery),
> > but e will basically insist this runs because you have these as possible
> > events. it's a trivially small daemon to run and every distro i know of has
> > it, so not much to just go do this. i added this because people complained
> > e didn't suspend their laptop on lid close and it ended up they didn't
> > follow the recommendation of having acpid to handle that. this is there
> > because people don't follow docs so now it's pushing it on everyone to
> > avoid things like a laptop in your backpack running and overheating and
> > running your entire battery empty in a few hours.
> 
> For a mobile computer I completely understand this. It is just that a desktop
> PC has neither lid nor (removable) AC adapter so I didn't really see the need
> for an acpi handler. And surprisingly debian didn't install one by default
> (even though installing thousands of other daemons for GNOME....).

they have gone to need all their own daemons like upower and so on to do a lot
of this which we don't need. :) we have needed acpid for the last 15 years or
so (as long as e17 has been under dev and well since i added acpi support
probably early-ish in dev). :) as i said - you CAN have a reset button on some
desktop pc's. some desktop pc's might be semi-portables with a battery. for
example:

https://www.amazon.co.uk/17-3-Ultra-Slim-Desktop-Battery/dp/B07FVVHSVW

:) it's a continuum these days. you could call that a laptop without a keyboard
built in. you could call it a tablet. or a desktop. from e's point of view it
checks and it has acpi support (/proc/acpi exists) and thus ... e wants acpid
if the system can support acpi at all. :)

> > > daemon on a desktop system has exactly zero additional benefit. The
> > > power button is handled by systemd just fine and I am happy for every
> >
> > actually it's not. e inhibits systemd handling this - always. no choice.
> > when e runs systemd will ignore this. e is handling it. it can handle it
> > either via a x11 power key press event OR an acpi button press. see
> > above. :)
> 
> Thanks. I was actually not aware that E can still intercept the power button.
> Indeed, pressing the power button just gives the popup menu from E.
> Interesting. Well, I usually don't use the power button anyway. Much too
> awkward to reach in my current setup. ;)

:) e can get this 2 ways - if x exposes the power button key OR an acpi event.
it may get both. it may only get one. but this is one of those reasons for
acpid. :)

> > > unnecessary daemon that I can prevent from cluttering my ps output. So,
> > > anyway, for now I just commented out the callback to the popup. Works
> > > great. ;-)
> >
> > see above. too many times people don't follow the recommendations, so now
> > forcing it on everyone. i have considered adding acpi support to
> > enlightenment_system that runs as root, but i haven't done that so until
> > then ... you need acpid. :)
> 
> OK, no worries. I will just leave the error popup commented and see how far I
> can get just ignoring this issue for now.
> 
> Cheers,
> Florian

or run acpid which will be less hassle than maintaining a patch :)
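
for anyone reading the archive: the ibox crash quoted further down (a timer
firing after the icon it points at was freed) boils down to the pattern below.
this is a toy sketch, not the real commit and not the ecore api. Timer, Icon,
timer_add(), timer_del(), timers_run() and the icon_* functions are all
invented stand-ins so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for a one-shot timer. This is NOT the Ecore API. */
typedef struct Timer {
    void (*cb)(void *data);
    void *data;
} Timer;

/* Hypothetical icon type; only the fields the sketch needs. */
typedef struct Icon {
    Timer *fill_timer; /* pending fill timer, NULL once fired or cancelled */
    bool   filled;
} Icon;

#define MAX_TIMERS 8
static Timer *pending[MAX_TIMERS]; /* toy "main loop" timer queue */

static Timer *timer_add(void (*cb)(void *data), void *data)
{
    for (int i = 0; i < MAX_TIMERS; i++) {
        if (pending[i]) continue;
        Timer *t = malloc(sizeof(*t));
        if (!t) return NULL;
        t->cb = cb;
        t->data = data;
        pending[i] = t;
        return t;
    }
    return NULL; /* queue full */
}

/* Cancel a timer so its callback can never run. */
static void timer_del(Timer *t)
{
    for (int i = 0; i < MAX_TIMERS; i++)
        if (pending[i] == t) pending[i] = NULL;
    free(t);
}

/* Fire every pending timer once; returns how many callbacks ran. */
static int timers_run(void)
{
    int fired = 0;
    for (int i = 0; i < MAX_TIMERS; i++) {
        Timer *t = pending[i];
        if (!t) continue;
        pending[i] = NULL;
        t->cb(t->data);
        free(t);
        fired++;
    }
    return fired;
}

static void icon_fill_cb(void *data)
{
    Icon *ic = data;
    ic->fill_timer = NULL; /* one-shot: the timer is gone after this call */
    ic->filled = true;
}

static Icon *icon_new(void)
{
    Icon *ic = calloc(1, sizeof(*ic));
    if (!ic) return NULL;
    ic->fill_timer = timer_add(icon_fill_cb, ic);
    return ic;
}

static void icon_free(Icon *ic)
{
    assert(ic != NULL);
    /* The important part: cancel the pending timer before freeing its data. */
    if (ic->fill_timer) timer_del(ic->fill_timer);
    free(ic);
}
```

without the timer_del() call in icon_free(), timers_run() would hand a freed
Icon to icon_fill_cb(), which is the kind of access asan flags as a
use-after-free.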

> > > On 9/5/21 6:27 AM, Carsten Haitzler wrote:
> > > > On Sat, 4 Sep 2021 17:52:09 +0900 Florian Schaefer <list...@netego.de>
> > > > said:
> > > >
> > > >> On 9/4/21 4:55 PM, Carsten Haitzler wrote:
> > > >>> On Sat, 4 Sep 2021 11:47:20 +0900 Florian Schaefer <list...@netego.de>
> > > >>> said:
> > > >>>
> > > >>>> Raster,
> > > >>>>
> > > >>>> Thanks for the quick reply and help!
> > > >>>>
> > > >>>> OK, so ibox seems to be the culprit. With the module unloaded I was
> > > >>>> not able to crash the system. That's quite interesting, on my
> > > >>>> personal machine I am using ibox ever since and never had any issues
> > > >>>> (just like your test yesterday). So this seems to be somehow
> > > >>>> specific to my new system here.
> > > >>>>
> > > >>>> Anyway, thanks for pointing me into the right direction. With this I
> > > >>>> now also finally understood how to identify which one of the many
> > > >>>> threads was the segfaulting one. ;-)
> > > >>>>
> > > >>>> Now for the backtrace. As it is quite short I will paste it below
> > > >>>>
> > > >>>> ========================================
> > > >>>> (gdb) bt
> > > >>>> #0 0x00007f23b417f872 in __libc_pause () at
> > > >>>> ../sysdeps/unix/sysv/linux/pause.c:29
> > > >>>> #1 0x0000564440d159f7 in e_alert_show () at ../src/bin/e_alert.c:43
> > > >>>> #2 0x0000564440cda47a in _e_crash () at ../src/bin/e_signals.c:81
> > > >>>> #3 0x0000564440cda4a9 in e_sigseg_act (x=<optimized out>,
> > > >>>> info=<optimized out>, data=<optimized out>)
> > > >>>> at ../src/bin/e_signals.c:91
> > > >>>> #4 0x00007f23b4180140 in <signal handler called> () at
> > > >>>> /lib/x86_64-linux-gnu/libpthread.so.0
> > > >>>> #5 0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at
> > > >>>> ../src/modules/ibox/e_mod_main.c:636
> > > >>>> #6 0x00007f23a57df330 in _ibox_cb_icon_fill_timer (data=<optimized
> > > >>>> out>) at ../src/modules/ibox/e_mod_main.c:526
> > > >>>> #7 0x00007f23b4c25581 in _ecore_call_task_cb (data=<optimized out>,
> > > >>>> func=<optimized out>) at ../src/lib/ecore/ecore_private.h:456
> > > >>>> #8 _ecore_timer_legacy_tick (data=0x564441cbf230,
> > > > >>>> event=0x7ffd43c61150)
> > > >>>> at ../src/lib/ecore/ecore_timer.c:172
> > > >>>> #9 0x00007f23b3b1c130 in _event_callback_call (obj_id=0x400000379067,
> > > >>>> pd=0x5644412371e0, desc=0x7f23b4c521e0
> > > >>>> <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>, event_info=<optimized out>,
> > > >>>> legacy_compare=legacy_compare@entry=0 '\000') at
> > > >>>> ../src/lib/eo/eo_base_class.c:2114
> > > >>>> #10 0x00007f23b3b1c3ec in _efl_object_event_callback_call
> > > >>>> (obj_id=<optimized out>, pd=<optimized out>, desc=<optimized out>,
> > > >>>> event_info=<optimized out>) at ../src/lib/eo/eo_base_class.c:2186
> > > >>>> #11 0x00007f23b3b16620 in efl_event_callback_call (obj=<optimized
> > > > >>>> out>,
> > > >>>> desc=desc@entry=0x7f23b4c521e0 <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>,
> > > > >>>> event_info=event_info@entry=0x0) at
> > > > >>>> ../src/lib/eo/eo_base_class.c:2189
> > > >>>> #12 0x00007f23b4c26e15 in _efl_loop_timer_expired_call
> > > >>>> (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
> > > >>>> when=when@entry=436613.23437423998)
> > > >>>> at ../src/lib/ecore/ecore_timer.c:669
> > > >>>> #13 0x00007f23b4c26f43 in _efl_loop_timer_expired_timers_call
> > > >>>> (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
> > > >>>> when=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:621
> > > >>>> #14 0x00007f23b4bf2fae in _ecore_main_loop_iterate_internal
> > > >>>> (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
> > > >>>> once_only=once_only@entry=0) at ../src/lib/ecore/ecore_main.c:2431
> > > >>>> #15 0x00007f23b4bf383f in _ecore_main_loop_begin
> > > >>>> (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460) at
> > > >>>> ../src/lib/ecore/ecore_main.c:1231
> > > >>>> #16 0x00007f23b4bf7e6d in _efl_loop_begin (obj=0x40000000012d,
> > > >>>> pd=0x5644411fd460) at ../src/lib/ecore/efl_loop.c:57
> > > >>>> #17 0x00007f23b4bf7233 in efl_loop_begin (obj=0x40000000012d) at
> > > >>>> src/lib/ecore/efl_loop.eo.c:28
> > > >>>> #18 0x00007f23b4bf390c in ecore_main_loop_begin () at
> > > >>>> ../src/lib/ecore/ecore_main.c:1316
> > > >>>> #19 0x0000564440cb8c50 in main (argc=<optimized out>, argv=<optimized
> > > >>>> out>) at ../src/bin/e_main.c:1121
> > > >>>>
> > > >>>> (gdb) fr 5
> > > >>>> #5 0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at
> > > >>>> ../src/modules/ibox/e_mod_main.c:636
> > > >>>> 636 if ((ic->ibox->inst->ci->show_preview) &&
> > > >>>> (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > > >>>>
> > > >>>> (gdb) list
> > > >>>> 631 }
> > > >>>> 632
> > > >>>> 633 static void
> > > >>>> 634 _ibox_icon_fill(IBox_Icon *ic)
> > > >>>> 635 {
> > > >>>> 636 if ((ic->ibox->inst->ci->show_preview) &&
> > > >>>> (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > > >>>> 637 _ibox_icon_fill_preview(ic, EINA_FALSE);
> > > >>>> 638 else
> > > >>>> 639 _ibox_icon_fill_icon(ic);
> > > >>>> 640
> > > >>>>
> > > >>>> (gdb) print ic
> > > >>>> $1 = (IBox_Icon *) 0x5644419a2910
> > > >>>>
> > > >>>> (gdb) print *ic
> > > >>>> $2 = {ibox = 0x564441cc3fe0, o_holder = 0x0, o_icon = 0x0, o_holder2
> > > >>>> = 0x0, o_icon2 = 0x0, client = 0x0, drag = {start = 0 '\000', dnd = 0
> > > >>>> '\000', x = 0, y = 0, dx = 0, dy = 128}}
> > > >>>>
> > > >>>> (gdb) print *(ic->ibox)
> > > >>>> $3 = {inst = 0x40, o_box = 0xe1, o_drop = 0x564441a499b0,
> > > >>>> o_drop_over = 0x7f23b4165cb0 <main_arena+304>, o_empty =
> > > >>>> 0x7474756200726162, ic_drop_before = 0x81646c3698761235, drop_before
> > > >>>> = 1103904792, icons = 0x0, zone = 0x698761254, dnd_x = 0, dnd_y =
> > > >>>> 1769170290}
> > > >>>>
> > > >>>> (gdb) print *(ic->ibox->inst)
> > > >>>> Cannot access memory at address 0x40
> > > >>>> ========================================
> > > >>>>
> > > >>>> So somehow we've got some garbage pointer in ic->ibox->inst.
> > > >>>
> > > > >>> actually.. ic->ibox is junk. it happens to point to some memory we
> > > > >>> can access but it's full of ... garbage. like dnd_y is an
> > > >>> unrealistic coord. zone does not look like a proper pointer (o_drop
> > > >>> does) and o_box is nothing like what a pointer should look like.
> > > >>> drop_before seems junky too. so ... what happened to ic->ibox? or ...
> > > >>> for that matter what happened to ic? maybe ic has been freed and now
> > > >>> the ibox ptr has been overwritten to point to some junk as i cant
> > > >>> imagine the ibox struct being freed as that struct is still there for
> > > >>> the ibox gadget. so ...
> > > >>
> > > >> Ah I see. It certainly makes debugging easier if you know what a
> > > >> pointer is supposed to look like. :-)
> > > >>
> > > >>> well turning on ASAN (search enlightenment.org for asan and how to
> > > >>> enable it) in efl and e would probably instantly point out the
> > > >>> problem. you can try that as an exercise in being able to divine
> > > >>> better debug info from efl
> > > >>> + e. it's pretty easy now with meson.... :) unlike valgrind it's not
> > > >>> prohibitively slow either. it's usable day to day on a fast enough
> > > >>> machine.
> > > >>
> > > >> Interesting. Thanks for the pointer to new debugging tools. (And yes,
> > > >> valgrind is really slow.) I found the documentation you mentioned. I
> > > >> think I will give it a try before applying your patch, just to see what
> > > >> happens and to be able to play around with it for a bit.
> > > >>
> > > >>> and i can see the problem:
> > > >>>
> > > >>> ecore_timer_add(0.1, _ibox_cb_icon_fill_timer, ic);
> > > >>>
> > > >>> a timer is created to fill the icon in 0.1 sec... but ... imagine the
> > > >>> icon (ic) has been freed/deleted BEFORE the timer fires... in 0.1sec
> > > >>> from now. ... someone added a timer without remembering to delete it
> > > >>> when the icon the timer is for is deleted! a bit sloppy...
> > > >>
> > > >> Bad boy. ;-)
> > > >>
> > > >> This means that on my old laptop I never ran into any issues because it
> > > >> is just too slow for this race condition to occur?
> > > >>
> > > >>> d12acf0d01e628d71548adbb77670c7e40aef043 commit in git now fixes that.
> > > >>> problem is in e ... not efl :)
> > > >>
> > > >> Great. Thanks! As said before, I will try to tackle this with ASAN
> > > >> first for training and then see how your solution is holding up. That
> > > >> will hopefully be tomorrow.
> > > >>
> > > >> Now to the second point of my first mail from yesterday: Is there any
> > > >> way for me to disable/silence the error popup on startup that no ACPI
> > > >> daemon is running?
> > > >
> > > > oh yes. install acpid and have it run. :)
> > > >
> > > >> Cheers,
> > > >> Florian
> > > >>
> > > >>>> I tried to poke into the preceding frames (#6 and #7) but only hit
> > > >>>> optimized out variables. This is efl territory, right? This morning I
> > > >>>> recompiled enlightenment with "-O0 -g" but I guess I should also have
> > > >>>> done the same to efl. Well, I can do this the next time I'm in
> > > >>>> office if helpful.
> > > >>>>
> > > >>>> Any ideas?
> > > >>>>
> > > >>>> For now I gave ibar a try. Not exactly a replacement for me. I don't
> > > >>>> need a launcher (using everything and favorites menu instead) or a
> > > >>>> tracker of running windows (I know what windows I have open). I only
> > > >>>> need something to show my minimized windows so that I can open them
> > > >>>> again (I know, they appear with Alt+Tab...) and this seems to be the
> > > >>>> only scenario that cannot be reproduced by ibar. -- I guess I never
> > > >>>> bought into the MacOS style launcher bar. ;-)
> > > >>>
> > > >>> ibar will show both running and minimized icons for windows .. but ok
> > > >>> - yeah - it doesnt "show only minimized"... :)
> > > >>>
> > > >>>> Cheers
> > > >>>> Florian
> > > >>>>
> > > >>>> On 9/4/21 1:25 AM, Carsten Haitzler wrote:
> > > >>>>> On Fri, 3 Sep 2021 21:04:35 +0900 Florian Schaefer
> > > >>>>> <list...@netego.de> said:
> > > >>>>>
> > > >>>>> quick - if you unload the ibox module ... does the problem stop?
> > > >>>>> that crash is inside ibox code - memory it's accessing is bad/wrong
> > > > >>>>> - why i don't know. no more information. line 363 in ibox is:
> > > >>>>>
> > > >>>>> if ((ic->ibox->inst->ci->show_preview) &&
> > > >>>>> (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > > >>>>>
> > > > >>>>> so what is ic? what is ic->ibox, ic->ibox->inst,
> > > >>>>> ic->ibox->inst->ci ?
> > > >>>>>
> > > >>>>> if you attach gdb when e crashes and dump these values - i'd know
> > > >>>>> more. maybe. I actually stopped using ibox a while ago since ibar
> > > >>>>> does both effectively these days. perhaps it is an ibox bug and i
> > > >>>>> havent seen it as i dont use it. so try the above, if it goes away
> > > >>>>> - attach gdb
> > > >>>>>
> > > >>>>> i can say that i dont see the problem here with ibox enabled and on
> > > >>>>> amd
> > > >>>>> + e (git).
> > > >>>>>
> > > >>>>>> Dear everyone,
> > > >>>>>>
> > > >>>>>> so I got a new desktop PC at work and the first thing I did, of
> > > >>>>>> course, was to install Debian sid and enlightenment-git. ;-)
> > > >>>>>>
> > > >>>>>> The machine has a Nvidia T600 card and this is where troubles
> > > >>>>>> probably begin. As I kind of need the graphics performance for CAD
> > > >>>>>> I went with the drivers from Nvidia (the stock open source drivers
> > > >>>>>> were terribly slow).
> > > >>>>>>
> > > >>>>>> Now what happens is that enlightenment crashes often. Like kind of
> > > >>>>>> constantly. I got the impression it happens mostly when several
> > > >>>>>> windows are going through their appearance fade-in transition at
> > > >>>>>> the same time. Then the "red screen of death" appears and I need
> > > >>>>>> to press F1 to continue. With some applications this happens
> > > >>>>>> always (Eagle anyone?) with others only sometimes. After the
> > > >>>>>> forced restart many windows (e.g. terminology always, firefox
> > > >>>>>> sometimes) need to be minimized and uncovered again for their
> > > >>>>>> content to display again. Some dialog windows won't even show
> > > >>>>>> their content from the beginning and instead just some different
> > > >>>>>> portion of the screen. Needless to say that for a machine at work
> > > >>>>>> this is not an optimal situation.
> > > >>>>>>
> > > >>>>>> The most pressing issue are of course the crashes. I recompiled
> > > >>>>>> everything with debugging symbols and optimization disabled (or at
> > > >>>>>> least I thought so, some things seem still to be optimized away) to
> > > >>>>>> get some meaningful dumps. One of which I uploaded to pastebin
> > > >>>>>> (pastebin.com/YWSarC10) hoping that it makes sense to someone.
> > > >>>>>>
> > > >>>>>> I am sure that it is not E that is "at fault" but Nvidia, but for
> > > >>>>>> now I need to find a way around this so that I can work without
> > > >>>>>> having to reset everything every five minutes. Any ideas?
> > > >>>>>>
> > > >>>>>> Oh, I also tried to disable OpenGL in the compositor settings and
> > > >>>>>> choosing the software option. And it still crashes!
> > > >>>>>>
> > > >>>>>> For starters I was hoping that I can just switch off all the window
> > > >>>>>> transition-fading eye-candy but I did not understand whether this
> > > >>>>>> is possible. Is it?
> > > >>>>>>
> > > >>>>>> Finally, being a desktop system (my first in like 10 years or so)
> > > >>>>>> it does not run an acpi daemon. I don't really see any reason to
> > > >>>>>> do so. Therefore E also complains on every startup that no acpi
> > > >>>>>> daemon can be found. I did not find any compile time or runtime
> > > >>>>>> options to disable acpi. Is there a way to silence this
> > > >>>>>> error/warning?
> > > >>>>>>
> > > >>>>>> Cheers,
> > > >>>>>> Florian
> 
> 
> _______________________________________________
> enlightenment-users mailing list
> enlightenment-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/enlightenment-users


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - ras...@rasterman.com

