I think we've got a second, independent issue with procstats here. This time it
looks to me like your friendly string-buffer overflow, incidentally triggered by
a long command line in terminology while compiling the latest enlightenment. ;)


https://pastebin.com/Ue03AbmB


Cheers,

Florian


"Al Poole" nets...@gmail.com – September 6, 2021 1:22 AM
> Summoned
>
> On Sun, 5 Sep 2021, 14:12 Carsten Haitzler, <ras...@rasterman.com> wrote:
>
> > On Sun, 5 Sep 2021 11:25:35 +0900 Florian Schaefer <list...@netego.de>
> > said:
> >
> > > OK, the ibox patch seemed to resolve this issue.
> > > Thank you very much! :-)
> > >
> > > But. As you proposed I started to play with ASAN ... and opened quite a
> > > can of worms apparently. E is now rather constantly crashing. I guess
> > > this is because of the "abort_on_error=1" setting of ASAN and it's,
> > > well, finding many memory leaks. So I hope we can squash them one by one.
> >
> > export ASAN_OPTIONS="detect_odr_violation=0:detect_leaks=0:abort_on_error=1:new_delete_type_mismatch=0"
> >
> > :) it will only barf on real memory errors - not smaller things that don't
> > cause crashes. for leaks i'm more interested in using massif for that, but
> > they won't cause crashes so those are "worry about another day" if anything.
> >
> > > First I want to say that I needed to add "log_path=asan.log" to the
> > > ASAN_OPTIONS variable in order to have the asan output actually written
> > > somewhere, so I would propose to add this information to the
> > > enlightenment homepage. Most users nowadays probably don't start E from a
> > > terminal where any stdout would be visible.
> >
> > actually i just redirect ALL stdout/err from e to ~/.xsession-errors so that
> > handles it anyway :) you won't need to do the above special asan log if
> > you're doing that and i'd generally say it's a smart move. if you don't you
> > can also check your journald logs from systemd etc.
> >
> > > So I tried to capture one of the crashes as best as I could with both
> > > gdb and asan. This one seemed to be in the procstats module. The result
> > > is here: pastebin.com/M6V2QTwd
> >
> > ooh procstats... i do not run that, so that probably explains why i don't
> > see this...
> >
> > /me summons a netstar
> >
> >
> >
> > > Also, now E brings an additional error popup when returning from the
> > > lock screen: "Authentication via PAM had errors setting up the
> > > authentication session. The error code was 6." This did not happen
> > > before the recompiling. So I was suspecting that this is somehow due to
> > > ASAN so I tried to remove the ASAN_OPTIONS from the .xsessionrc. But it
> > > seems that without this variable E won't even start now. I see the
> > > processes in the process list but the screen remains just black.
> > > Therefore back to ASAN it is. Also I could not find any related messages
> > > in auth.log or similar. Very strange and somewhat unsettling.
> >
> > aaaah yes. i think the error code is changing because asan detects something
> > e.g. like a leak on shutdown of the ckpasswd slave binary thus making this
> > not work. basically "don't rely on desklock to work right" if using asan.
> > kind of a "gotcha".
> >
> > > Concerning the ACPI daemon. I see, this seems to be a "hard" requirement
> > > of E then. Interesting design choice. For me personally running an ACPI
> >
> > It's a soft requirement. E works without BUT you will be missing events for
> > things like: lid open/close, some power/reset buttons being pressed, ac
> > adapter plug/unplug ... e will check if your system has acpi at all - if it
> > does it will want events from acpid to handle these. it may be you are lucky
> > and don't need these (eg only have a power button - you already get a key
> > press for it and no reset button, no lid, no ac adapter/battery), but e will
> > basically insist this runs because you have these as possible events. it's a
> > trivially small daemon to run and every distro i know of has it, so not much
> > to just go do this. i added this because people complained e didn't suspend
> > their laptop on lid close and it ended up they didn't follow the
> > recommendation of having acpid to handle that. this is there because people
> > don't follow docs so now it's pushing it on everyone to avoid things like a
> > laptop in your backpack running and overheating and running your entire
> > battery empty in a few hours.
> >
> > > daemon on a desktop system has exactly zero additional benefit. The
> > > power button is handled by systemd just fine and I am happy for every
> >
> > actually it's not. e inhibits systemd handling this - always. no choice.
> > when e runs system will ignore this. e is handling it. it can handle it
> > either via an x11 power key press event OR an acpi button press. see above. :)
> >
> > > unnecessary daemon that I can prevent from cluttering my ps output. So,
> > > anyway, for now I just commented out the callback to the popup. Works
> > > great. ;-)
> >
> > see above. too many times people don't follow the recommendations, so now
> > forcing it on everyone. i have considered adding acpi support to
> > enlightenment_system that runs as root, but i haven't done that so until
> > then ... you need acpid. :)
> >
> > > Cheers
> > > Florian
> > >
> > > On 9/5/21 6:27 AM, Carsten Haitzler wrote:
> > > > On Sat, 4 Sep 2021 17:52:09 +0900 Florian Schaefer <list...@netego.de>
> > > > said:
> > > >
> > > >> On 9/4/21 4:55 PM, Carsten Haitzler wrote:
> > > >>> On Sat, 4 Sep 2021 11:47:20 +0900 Florian Schaefer <list...@netego.de>
> > > >>> said:
> > > >>>
> > > >>>> Raster,
> > > >>>>
> > > >>>> Thanks for the quick reply and help!
> > > >>>>
> > > >>>> OK, so ibox seems to be the culprit. With the module unloaded I was not
> > > >>>> able to crash the system. That's quite interesting, on my personal
> > > >>>> machine I have been using ibox all along and never had any issues (just
> > > >>>> like your test yesterday). So this seems to be somehow specific to my
> > > >>>> new system here.
> > > >>>>
> > > >>>> Anyway, thanks for pointing me in the right direction. With this I now
> > > >>>> also finally understood how to identify which one of the many threads
> > > >>>> was the segfaulting one. ;-)
> > > >>>>
> > > >>>> Now for the backtrace. As it is quite short I will paste it below
> > > >>>>
> > > >>>> ========================================
> > > >>>> (gdb) bt
> > > >>>> #0 0x00007f23b417f872 in __libc_pause () at
> > > >>>> ../sysdeps/unix/sysv/linux/pause.c:29
> > > >>>> #1 0x0000564440d159f7 in e_alert_show () at ../src/bin/e_alert.c:43
> > > >>>> #2 0x0000564440cda47a in _e_crash () at ../src/bin/e_signals.c:81
> > > >>>> #3 0x0000564440cda4a9 in e_sigseg_act (x=<optimized out>,
> > > >>>> info=<optimized out>, data=<optimized out>) at ../src/bin/e_signals.c:91
> > > >>>> #4 0x00007f23b4180140 in <signal handler called> () at
> > > >>>> /lib/x86_64-linux-gnu/libpthread.so.0
> > > >>>> #5 0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at
> > > >>>> ../src/modules/ibox/e_mod_main.c:636
> > > >>>> #6 0x00007f23a57df330 in _ibox_cb_icon_fill_timer (data=<optimized
> > > >>>> out>) at ../src/modules/ibox/e_mod_main.c:526
> > > >>>> #7 0x00007f23b4c25581 in _ecore_call_task_cb (data=<optimized out>,
> > > >>>> func=<optimized out>) at ../src/lib/ecore/ecore_private.h:456
> > > >>>> #8 _ecore_timer_legacy_tick (data=0x564441cbf230, event=0x7ffd43c61150)
> > > >>>> at ../src/lib/ecore/ecore_timer.c:172
> > > >>>> #9 0x00007f23b3b1c130 in _event_callback_call (obj_id=0x400000379067,
> > > >>>> pd=0x5644412371e0, desc=0x7f23b4c521e0
> > > >>>> <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>, event_info=<optimized out>,
> > > >>>> legacy_compare=legacy_compare@entry=0 '\000') at
> > > >>>> ../src/lib/eo/eo_base_class.c:2114
> > > >>>> #10 0x00007f23b3b1c3ec in _efl_object_event_callback_call
> > > >>>> (obj_id=<optimized out>, pd=<optimized out>, desc=<optimized out>,
> > > >>>> event_info=<optimized out>) at ../src/lib/eo/eo_base_class.c:2186
> > > >>>> #11 0x00007f23b3b16620 in efl_event_callback_call (obj=<optimized out>,
> > > >>>> desc=desc@entry=0x7f23b4c521e0 <_EFL_LOOP_TIMER_EVENT_TIMER_TICK>,
> > > >>>> event_info=event_info@entry=0x0) at ../src/lib/eo/eo_base_class.c:2189
> > > >>>> #12 0x00007f23b4c26e15 in _efl_loop_timer_expired_call
> > > >>>> (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
> > > >>>> when=when@entry=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:669
> > > >>>> #13 0x00007f23b4c26f43 in _efl_loop_timer_expired_timers_call
> > > >>>> (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
> > > >>>> when=436613.23437423998) at ../src/lib/ecore/ecore_timer.c:621
> > > >>>> #14 0x00007f23b4bf2fae in _ecore_main_loop_iterate_internal
> > > >>>> (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460,
> > > >>>> once_only=once_only@entry=0) at ../src/lib/ecore/ecore_main.c:2431
> > > >>>> #15 0x00007f23b4bf383f in _ecore_main_loop_begin
> > > >>>> (obj=obj@entry=0x40000000012d, pd=pd@entry=0x5644411fd460) at
> > > >>>> ../src/lib/ecore/ecore_main.c:1231
> > > >>>> #16 0x00007f23b4bf7e6d in _efl_loop_begin (obj=0x40000000012d,
> > > >>>> pd=0x5644411fd460) at ../src/lib/ecore/efl_loop.c:57
> > > >>>> #17 0x00007f23b4bf7233 in efl_loop_begin (obj=0x40000000012d) at
> > > >>>> src/lib/ecore/efl_loop.eo.c:28
> > > >>>> #18 0x00007f23b4bf390c in ecore_main_loop_begin () at
> > > >>>> ../src/lib/ecore/ecore_main.c:1316
> > > >>>> #19 0x0000564440cb8c50 in main (argc=<optimized out>, argv=<optimized out>)
> > > >>>> at ../src/bin/e_main.c:1121
> > > >>>>
> > > >>>> (gdb) fr 5
> > > >>>> #5 0x00007f23a57df211 in _ibox_icon_fill (ic=0x5644419a2910) at
> > > >>>> ../src/modules/ibox/e_mod_main.c:636
> > > >>>> 636 if ((ic->ibox->inst->ci->show_preview) &&
> > > >>>> (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > > >>>>
> > > >>>> (gdb) list
> > > >>>> 631 }
> > > >>>> 632
> > > >>>> 633 static void
> > > >>>> 634 _ibox_icon_fill(IBox_Icon *ic)
> > > >>>> 635 {
> > > >>>> 636 if ((ic->ibox->inst->ci->show_preview) &&
> > > >>>> (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > > >>>> 637 _ibox_icon_fill_preview(ic, EINA_FALSE);
> > > >>>> 638 else
> > > >>>> 639 _ibox_icon_fill_icon(ic);
> > > >>>> 640
> > > >>>>
> > > >>>> (gdb) print ic
> > > >>>> $1 = (IBox_Icon *) 0x5644419a2910
> > > >>>>
> > > >>>> (gdb) print *ic
> > > >>>> $2 = {ibox = 0x564441cc3fe0, o_holder = 0x0, o_icon = 0x0, o_holder2 =
> > > >>>> 0x0, o_icon2 = 0x0, client = 0x0, drag = {start = 0 '\000', dnd = 0
> > > >>>> '\000', x = 0, y = 0, dx = 0, dy = 128}}
> > > >>>>
> > > >>>> (gdb) print *(ic->ibox)
> > > >>>> $3 = {inst = 0x40, o_box = 0xe1, o_drop = 0x564441a499b0, o_drop_over =
> > > >>>> 0x7f23b4165cb0 <main_arena+304>, o_empty = 0x7474756200726162,
> > > >>>> ic_drop_before = 0x81646c3698761235, drop_before = 1103904792, icons =
> > > >>>> 0x0, zone = 0x698761254, dnd_x = 0, dnd_y = 1769170290}
> > > >>>>
> > > >>>> (gdb) print *(ic->ibox->inst)
> > > >>>> Cannot access memory at address 0x40
> > > >>>> ========================================
> > > >>>>
> > > >>>> So somehow we've got some garbage pointer in ic->ibox->inst.
> > > >>>
> > > >>> actually.. ic->ibox is junk. it happens to point to some memory we can
> > > >>> access but it's full of ... garbage. like dnd_y is an unrealistic coord.
> > > >>> zone does not look like a proper pointer (o_drop does) and o_box is
> > > >>> nothing like what a pointer should look like. drop_before seems junky
> > > >>> too. so ... what happened to ic->ibox? or ... for that matter what
> > > >>> happened to ic? maybe ic has been freed and now the ibox ptr has been
> > > >>> overwritten to point to some junk as i cant imagine the ibox struct being
> > > >>> freed as that struct is still there for the ibox gadget. so ...
> > > >>
> > > >> Ah I see. It certainly makes debugging easier if you know what a pointer
> > > >> is supposed to look like. :-)
> > > >>
> > > >>> well turning on ASAN (search enlightenment.org for asan and how to enable
> > > >>> it) in efl and e would probably instantly point out the problem. you can
> > > >>> try that as an exercise in being able to divine better debug info from
> > > >>> efl + e. it's pretty easy now with meson.... :) unlike valgrind it's not
> > > >>> prohibitively slow either. it's usable day to day on a fast enough
> > > >>> machine.
> > > >>
> > > >> Interesting. Thanks for the pointer to new debugging tools. (And yes,
> > > >> valgrind is really slow.) I found the documentation you mentioned. I
> > > >> think I will give it a try before applying your patch, just to see what
> > > >> happens and to be able to play around with it for a bit.
> > > >>
> > > >>> and i can see the problem:
> > > >>>
> > > >>> ecore_timer_add(0.1, _ibox_cb_icon_fill_timer, ic);
> > > >>>
> > > >>> a timer is created to fill the icon in 0.1 sec... but ... imagine the icon
> > > >>> (ic) has been freed/deleted BEFORE the timer fires... in 0.1sec from
> > > >>> now. ... someone added a timer without remembering to delete it when the
> > > >>> icon the timer is for is deleted! a bit sloppy...
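
A minimal, self-contained C sketch of the pattern described above, for readers who want to see it in isolation. It is not the actual ibox code and not the content of the referenced commit. The `Timer` table, `timer_add`, `timer_del`, `timers_fire` and the `Icon` struct are hypothetical stand-ins invented for illustration; in EFL the corresponding real calls would be `ecore_timer_add()` and `ecore_timer_del()`. The idea is simply that the object remembers its timer handle and cancels the timer when the object is freed.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TIMERS 8

/* Hypothetical stand-in for a one-shot timer (not the Ecore API). */
typedef struct {
    void (*cb)(void *data);
    void  *data;
    int    in_use;
} Timer;

static Timer timers[MAX_TIMERS];

/* Schedule cb(data); returns a handle, or NULL if the table is full. */
Timer *timer_add(void (*cb)(void *), void *data)
{
    for (int i = 0; i < MAX_TIMERS; i++) {
        if (!timers[i].in_use) {
            timers[i] = (Timer){ cb, data, 1 };
            return &timers[i];
        }
    }
    return NULL;
}

/* Cancel a pending timer; safe to call with NULL. */
void timer_del(Timer *t)
{
    if (t) t->in_use = 0;
}

/* Fire every pending timer once; returns how many callbacks ran. */
int timers_fire(void)
{
    int fired = 0;
    for (int i = 0; i < MAX_TIMERS; i++) {
        if (timers[i].in_use) {
            timers[i].in_use = 0;          /* one-shot: release the slot first */
            timers[i].cb(timers[i].data);
            fired++;
        }
    }
    return fired;
}

/* Hypothetical icon object that owns a deferred "fill" timer. */
typedef struct {
    int    filled;
    Timer *fill_timer;   /* remembered so it can be cancelled on free */
} Icon;

void icon_fill_cb(void *data)
{
    Icon *ic = data;
    assert(ic != NULL);
    ic->fill_timer = NULL;   /* the timer has fired; the handle is stale now */
    ic->filled = 1;
}

Icon *icon_new(void)
{
    Icon *ic = calloc(1, sizeof(Icon));
    if (ic) ic->fill_timer = timer_add(icon_fill_cb, ic);
    return ic;
}

void icon_free(Icon *ic)
{
    if (!ic) return;
    /* Without this cancel, the callback would later run on freed memory. */
    timer_del(ic->fill_timer);
    free(ic);
}
```

Creating an icon and freeing it before `timers_fire()` runs leaves nothing pending, so the callback never sees the freed pointer. Dropping the `timer_del()` call in `icon_free()` reproduces the kind of use-after-free that ASAN would flag.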
> > > >>
> > > >> Bad boy. ;-)
> > > >>
> > > >> This means that on my old laptop I never ran into any issues because it
> > > >> is just too slow for this race condition to occur?
> > > >>
> > > >>> d12acf0d01e628d71548adbb77670c7e40aef043 commit in git now fixes that.
> > > >>> problem is in e ... not efl :)
> > > >>
> > > >> Great. Thanks! As said before, I will try to tackle this with ASAN first
> > > >> for training and then see how your solution is holding up. That will
> > > >> hopefully be tomorrow.
> > > >>
> > > >> Now to the second point of my first mail from yesterday: Is there any
> > > >> way for me to disable/silence the error popup on startup that no ACPI
> > > >> daemon is running?
> > > >
> > > > oh yes. install acpid and have it run. :)
> > > >
> > > >> Cheers,
> > > >> Florian
> > > >>
> > > >>>> I tried to poke into the preceding frames (#6 and #7) but only hit
> > > >>>> optimized out variables. This is efl territory, right? This morning I
> > > >>>> recompiled enlightenment with "-O0 -g" but I guess I should also have
> > > >>>> done the same to efl. Well, I can do this the next time I'm in the
> > > >>>> office if helpful.
> > > >>>>
> > > >>>> Any ideas?
> > > >>>>
> > > >>>> For now I gave ibar a try. Not exactly a replacement for me. I don't
> > > >>>> need a launcher (using everything and favorites menu instead) or a
> > > >>>> tracker of running windows (I know what windows I have open). I only
> > > >>>> need something to show my minimized windows so that I can open them
> > > >>>> again (I know, they appear with Alt+Tab...) and this seems to be the
> > > >>>> only scenario that cannot be reproduced by ibar. -- I guess I never
> > > >>>> bought into the MacOS style launcher bar. ;-)
> > > >>>
> > > >>> ibar will show both running and minimized icons for windows .. but ok -
> > > >>> yeah - it doesnt "show only minimized"... :)
> > > >>>
> > > >>>> Cheers
> > > >>>> Florian
> > > >>>>
> > > >>>> On 9/4/21 1:25 AM, Carsten Haitzler wrote:
> > > >>>>> On Fri, 3 Sep 2021 21:04:35 +0900 Florian Schaefer <list...@netego.de>
> > > >>>>> said:
> > > >>>>>
> > > >>>>> quick - if you unload the ibox module ... does the problem stop? that
> > > >>>>> crash is inside ibox code - memory it's accessing is bad/wrong - why i
> > > >>>>> don't know. not more information. line 636 in ibox is:
> > > >>>>>
> > > >>>>> if ((ic->ibox->inst->ci->show_preview) &&
> > > >>>>> (edje_object_part_exists(ic->o_holder, "e.swallow.preview")))
> > > >>>>>
> > > >>>>> so what is ic? what is ic->ibox, ic->ibox->inst, ic->ibox->inst->ci ?
> > > >>>>>
> > > >>>>> if you attach gdb when e crashes and dump these values - i'd know more.
> > > >>>>> maybe. I actually stopped using ibox a while ago since ibar does both
> > > >>>>> effectively these days. perhaps it is an ibox bug and i havent seen it
> > > >>>>> as i dont use it. so try the above, if it goes away - attach gdb
> > > >>>>>
> > > >>>>> i can say that i dont see the problem here with ibox enabled and on amd
> > > >>>>> + e (git).
> > > >>>>>
> > > >>>>>> Dear everyone,
> > > >>>>>>
> > > >>>>>> so I got a new desktop PC at work and the first thing I did, of course,
> > > >>>>>> was to install Debian sid and enlightenment-git. ;-)
> > > >>>>>>
> > > >>>>>> The machine has a Nvidia T600 card and this is where troubles probably
> > > >>>>>> begin. As I kind of need the graphics performance for CAD I went with
> > > >>>>>> the drivers from Nvidia (the stock open source drivers were terribly
> > > >>>>>> slow).
> > > >>>>>>
> > > >>>>>> Now what happens is that enlightenment crashes often. Like kind of
> > > >>>>>> constantly. I got the impression it happens mostly when several windows
> > > >>>>>> are going through their appearance fade-in transition at the same time.
> > > >>>>>> Then the "red screen of death" appears and I need to press F1 to
> > > >>>>>> continue. With some applications this happens always (Eagle anyone?)
> > > >>>>>> with others only sometimes. After the forced restart many windows (e.g.
> > > >>>>>> terminology always, firefox sometimes) need to be minimized and
> > > >>>>>> uncovered again for their content to display again. Some dialog windows
> > > >>>>>> won't even show their content from the beginning and instead just some
> > > >>>>>> different portion of the screen. Needless to say that for a machine at
> > > >>>>>> work this is not an optimal situation.
> > > >>>>>>
> > > >>>>>> The most pressing issue is of course the crashes. I recompiled
> > > >>>>>> everything with debugging symbols and optimization disabled (or at
> > > >>>>>> least I thought so, some things seem still to be optimized away) to
> > > >>>>>> get some meaningful dumps. One of which I uploaded to pastebin
> > > >>>>>> (pastebin.com/YWSarC10) hoping that it makes sense to someone.
> > > >>>>>>
> > > >>>>>> I am sure that it is not E that is "at fault" but Nvidia, but for now I
> > > >>>>>> need to find a way around this so that I can work without having to
> > > >>>>>> reset everything every five minutes. Any ideas?
> > > >>>>>>
> > > >>>>>> Oh, I also tried to disable OpenGL in the compositor settings and
> > > >>>>>> choosing the software option. And it still crashes!
> > > >>>>>>
> > > >>>>>> For starters I was hoping that I can just switch off all the window
> > > >>>>>> transition-fading eye-candy but I did not understand whether this is
> > > >>>>>> possible. Is it?
> > > >>>>>>
> > > >>>>>> Finally, being a desktop system (my first in like 10 years or so) it
> > > >>>>>> does not run an acpi daemon. I don't really see any reason to do so.
> > > >>>>>> Therefore E also complains on every startup that no acpi daemon can be
> > > >>>>>> found. I did not find any compile time or runtime options to disable
> > > >>>>>> acpi. Is there a way to silence this error/warning?
> > > >>>>>>
> > > >>>>>> Cheers,
> > > >>>>>> Florian


_______________________________________________
enlightenment-users mailing list
enlightenment-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-users
