On Apr 30, 2012, at 7:13 AM, Gustau Pérez i Querol wrote:
>  The KDE team is seeing some strange problems with the new version (4.8.1) of 
> devel/dbus-qt4 on current. It works on stable. I also suspect that 
> the problem described below affects the experimental Cinnamon port (an 
> alternative to GNOME 3 and a possible replacement for GNOME 2).
> 
>  The problem happens on both i386 and amd64, with an empty /etc/malloc.conf and 
> a simple /etc/make.conf. Everything was compiled with base gcc (no clang). The 
> kernel was compiled without debug support, but I can enable it if needed. There 
> are reports from avi...@freebsd.org of the same behavior with a clang-compiled 
> world and kernel and with MALLOC_PRODUCTION=yes.
> 
> When qdbus starts, it segfaults. The backtrace of the problem with r234769 can 
> be found here: http://pastebin.com/ryBXtqGF. When starting the qdbus daemon 
> by hand in an X+twm session, we see it call calloc many times and segfault 
> after a fixed number of calls. The segfault occurs in rb_gen (a quite large 
> macro defined in $SRC_BASE/contrib/jemalloc/include/jemalloc/internal/rb.h).
> 
> If the daemon is started by hand, I'm able to skip all the calls qdbus makes 
> to calloc up to the one causing the segfault. At that point, in rb_gen, we 
> don't know exactly what is going on or how to debug the macro. ktrace dumps 
> are available, but we were unable to learn anything new from them.
> 
>  With old versions of current before the jemalloc imports (as of March 30th) 
> the daemon segfaulted at malloc.c:2426. With revisions from April 20th to 
> 24th (I can be more precise; it was during the jemalloc imports) the daemon 
> segfaulted at malloc_init. Backtraces are available if needed, and if necessary I 
> can go back to those revisions and recompile world+kernel to see its behavior.
> 
>  Any help from freebsd-current@ (perhaps Jason Evans can help us) will be 
> appreciated. Any additional info, like source revisions, can be provided. I 
> would like to stress that the experimental devel/dbus-qt4 works fine with 
> recent stable.

The crash is happening in page run management, so there is some pretty bad 
memory corruption going on by the time of the crash.  If I understand you 
correctly, you have reproduced the crash on a system that does *not* have 
MALLOC_PRODUCTION defined, which means that none of the assertions in jemalloc 
caught the problem.

Adrian Chadd made the excellent suggestion of trying valgrind; it's likely to 
point out the problem almost immediately.  If that doesn't work, the utrace 
functionality in malloc may help you figure out what activity has occurred by 
the time of the crash, and give you a better understanding of what happened to 
memory around the address that is involved in the crash.

Jason
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current