Carsten Haitzler (The Rasterman) wrote:
> On Thu, 20 Oct 2005 17:09:25 +1300 jochen <[EMAIL PROTECTED]> babbled:
>
>>Carsten Haitzler (The Rasterman) wrote:
>>
>>>On Wed, 19 Oct 2005 18:40:43 +0900 Carsten Haitzler (The Rasterman)
>>><[EMAIL PROTECTED]> babbled:
>>>
>>>>On Wed, 19 Oct 2005 22:16:49 +1300 jochen <[EMAIL PROTECTED]>
>>>>babbled:
>>>>
>>>>>Carsten Haitzler (The Rasterman) wrote:
>>>>>
>>>>>>On Tue, 18 Oct 2005 20:37:41 +1300 jochen <[EMAIL PROTECTED]>
>>>>>>babbled:
>>>>>>
>>>>>>>Carsten Haitzler (The Rasterman) wrote:
>>>>>>>
>>>>>>>>On Tue, 18 Oct 2005 20:13:40 +1300 jochen <[EMAIL PROTECTED]>
>>>>>>>>babbled:
>>>>>>>>
>>>>>>>>>Carsten Haitzler (The Rasterman) wrote:
>>>>>>>>>
>>>>>>>>>>On Tue, 18 Oct 2005 17:30:16 +1300 jochen <[EMAIL PROTECTED]>
>>>>>>>>>>babbled:
>>>>>>>>>>
>>>>>>>>>>>jochen wrote:
>>>>>>>>>>>
>>>>>>>>>>>>Hi guys,
>>>>>>>>>>>>I'm have another segfault. CVS of today. It happens when I close an
>>>>>>>>>>>>eterm with alt-right -> close. happens every time. Closing with
>>>>>>>>>>>>ctrl-alt-x or the close button works however. and it seems to only
>>>>>>>>>>>>happen with eterm of the apps I tried. Here is the backtrace
>>>>>>>>>>>>Cheers
>>>>>>>>>>>>Jochen
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>correction, it also happens with gnome-terminal. However only when
>>>>>>>>>>>Eterm/gterm is started from menu or ibar. when started from another
>>>>>>>>>>>terminal they close fine
>>>>>>>>>>
>>>>>>>>>>are you using any modules not shipped with e17? (engage etc.) ?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>No turned them all off, do you need more info?
>>>>>>>>
>>>>>>>>ok - one thing.
>>>>>>>>go to the e17 src:
>>>>>>>>
>>>>>>>>make clean distclean
>>>>>>>>./configure (whatever options)
>>>>>>>>make
>>>>>>>>make install
>>>>>>>>
>>>>>>>>again - just in case. basically this backtrace makes no sense as bd->app
>>>>>>>>shoudl be a valid pointer or NULL as i read the code in front of me. the
>>>>>>>>value it has is really bogus.
>>>>>>>>
>>>>>>>
>>>>>>>Still the same, flags are CFLAGS="-g -O2 -march=pentium4" so nothing
>>>>>>>special. I just checked if there's an old installation floating around
>>>>>>>somewhere just in case, but nothing there.
>>>>>>
>>>>>>grr - that shouldnt be that value (0x368 for the object pointer). thats
>>>>>>like a completely bogus value and i cant see how it happens... UNLESS the
>>>>>>border is being passed into a functiont hat expects a different type. i
>>>>>>shoudl likely go thru all objects and add type checks - i may catch it.
>>>>>>but what baffles me the border is the last struct member - and borders
>>>>>>are like the largest structs in e17 - so nothng shoudl be able to
>>>>>>overwrite it.
>>>>>>
>>>>>>ok. here is something i might suggest.
>>>>>>
>>>>>>start e under gdb (From another machine/console).
>>>>>>nos start an xterm or 2
>>>>>>NOW
>>>>>>ctrl-c and set a breakpoint for e_border_new
>>>>>>
>>>>>>NOW continue the program.
>>>>>>
>>>>>>from an xterm run another program (xterm, gnome-terminal - doesnt matter)
>>>>>>
>>>>>>e shoudl freeze as the breakpoint is caught
>>>>>>go back to gdb
>>>>>>and step thrugh e_border_new
>>>>>>until it has allocated the bd struct. NOW. set a watch for bd->app and
>>>>>>continue.
>>>>>>
>>>>>>what shoudl happen is that e should then continue and trap again - print
>>>>>>bd->app when it traps. it should be valid ( a normal looking pointer) - it
>>>>>>ma trigger 2 or 3 times actually - but as long as its with valid pointers
>>>>>>we are ok. now close this new window as u did - hopefully the watch point
>>>>>>shoudl get triggered every time it does do a backtrace.
>>>>>>one of them must
>>>>>>be setting it to this bogus value. if you can get a log of all of that -
>>>>>>we'll find the one that does it. (i hope).
>>>>>>
>>>>>
>>>>>OK I have done that, I never got to the point of closing the terminal.
>>>>>However bd->app is set to the wrong value when opening the window
>>>>>already. Attached are 3 gdb logs, the first one I step through watching
>>>>>bd->app until it is set to the fishy value. Number two I set
>>>>>e_focus_setup as a breakpoint as that was the last time bd->app was set
>>>>>before the weird value. I got a corrupted stack message at somepoint so
>>>>>I could not continue. In the third log a continued after bd->app was set
>>>>>to the value, and got a corrupt stack message a little after that. I
>>>>>hope these logs are somewhat useful, I'm quite new to the whole
>>>>>debugging business so if I need to do something differently just wack me
>>>>>with a clue bat.
>>>>>Cheers
>>>>>Jochen
>>>>
>>>>hmm - almost perfect. when u get a watchpoint trap - can you do a bt as well
>>>>at that time (so i can see the call tree/history of that watchpoint trap
>>>>point) :)
>>>
>>>oh - can you also do a list of the code (the list command). it seems right
>>>now that the watch point traps you are getting are bogus as it doesnt make
>>>sense where gdb is trapping - at all. hmmm.
>>>
>>
>>ok I have done another backtrace, and I am totally lost with my
>>debugging knowledge. I would really like to understand a little better
>>so I actually know what I am doing next time. I have done backtraces and
>>lists at every watchpointtrap. Just before the point where the problem
>>occurs i have started to going through the code with step. But I don't
>>really get this. Do I understand backtraces correctly that #0 is the
>>function that was being executed which was called by function #1 etc? So
>>from the the log e_focus_setup is being called from e_border_new with bd
>>(border ?) parameter.
>>This then calls ecore_window_button_grab but with
>>bd->win, so how is bd->app modified? Is this about right?
>
> yes that's right fn #0 was called by #1, and #1 called by #2 etc. basically
> now reading this backctrace... it makes no sense. bd->app cannot be modified
> where gdb s sayoing it is. it can't be. i suspect optimisations are causign
> the debugging to screw up.
>
> so compile evas, ecore, edje, eet, embryo ANd enlightenment with tese CFLAGS:
>
> -g3 -O0 -ggdb
>
> and make install again - and try this again. :( i'm still not seeing any sense
> in your backtraces here :(
>
OK, I just rebuilt all of EFL and E with the above flags and the error is
gone, so it seems it was an optimization bug, even though I thought I was
being conservative with the optimization flags. Thanks for the explanations.
If I find some time I might even dig into it to find out which compiler
option causes it. Not tonight though, I'll go see the Kiwis whip some
kangaroo ass ;-) (for those who don't know what the heck I'm talking about,
I'm going to watch the Australia against New Zealand rugby league game)

_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
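
For anyone repeating the watchpoint procedure quoted above, the setup part can be sketched as a small gdb command file. This is only a rough sketch, not what was actually run in this thread: the file names e17-watch.gdb and e17-watch.log are made up for illustration, the remaining steps still have to be typed by hand once the breakpoint hits, and the "watch -l" form is only available in newer gdb releases.

```shell
#!/bin/sh
# Sketch: write a gdb command file for the e_border_new / bd->app hunt.
# The file names are hypothetical; nothing here starts gdb or e17 itself.
cat > e17-watch.gdb <<'EOF'
# Keep a log of the whole session so it can be attached to a mail.
set pagination off
set logging file e17-watch.log
set logging on

# Stop whenever a new border (window) is created.
break e_border_new

# Manual part, once the breakpoint has been hit:
#   next              (repeat until the bd struct has been allocated)
#   watch bd->app     (or "watch -l bd->app" on newer gdb, so the
#                      watchpoint survives after e_border_new returns)
#   commands
#     bt
#     list
#     continue
#   end
#   continue
EOF
echo "wrote e17-watch.gdb"
```

The file can then be loaded from another console with "gdb -x e17-watch.gdb <path to the e binary>"; the binary path is left out because it depends on the install prefix. Putting bt and list into the watchpoint's command list means every trap is logged with its call tree and source context, which is what was asked for in the quoted mails. As the outcome of this thread suggests, the traps are much easier to trust when everything is built with -O0.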