Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Sat, Jul 16, 2011 at 10:42:22PM -0700, Doug Barton wrote:
> On 07/15/2011 01:40, Marius Strobl wrote:
>> The generated config.h and platform.h for sparc64 are these:
>> http://people.freebsd.org/~marius/bind96_config.h
>> http://people.freebsd.org/~marius/bind96_platform.h
>
> Marius,
>
> Thanks again for all your help on this. During the work to upgrade to BIND 9.8 in HEAD I first tried your patch but I got some odd errors on some of the non-mainstream archs, so I ultimately went with something similar to what you sent but much more conservative.

Thanks!

Marius
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On 07/15/2011 01:40, Marius Strobl wrote:
> The generated config.h and platform.h for sparc64 are these:
> http://people.freebsd.org/~marius/bind96_config.h
> http://people.freebsd.org/~marius/bind96_platform.h

Marius,

Thanks again for all your help on this. During the work to upgrade to BIND 9.8 in HEAD I first tried your patch but I got some odd errors on some of the non-mainstream archs, so I ultimately went with something similar to what you sent but much more conservative.

Doug

--
Nothin' ever doesn't change, but nothin' changes much.
-- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Thu, Jul 14, 2011 at 05:31:49PM -0700, Doug Barton wrote:
> On 07/14/2011 16:21, Marius Strobl wrote:
>> On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
>>> 2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
>>>>> Oops, sorry, I forgot to revert the previous patch when test-compiling. Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
>>>> I started named from ports (dns/bind96) at Sat Jul 9 10:08:41 MSD, and it worked properly till Sun Jul 10 22:25:41 MSD. At 22:25:41 I restarted bind from base system with your sparc64_isc_atomic.h.diff2. From this moment till today, 15:57:05 he crashed 3 times:
>>>> Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
>>>> Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
>>>> Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6
>>>> To make to ensure proper operation of bind from ports, I ran it again at 15:57:05, and, I think, we need to wait several days.
>>> And from that time till now bind from ports never died and works properly...
>> Okay. Doug, could you please disable the use of atomic operations for sparc64 in the in-tree BIND via the following patch in order to match what the vendor source does?
>> http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
> If you use the port and do 'make configure' are the values in config.h the same as the ones in your patch? If so, that's likely to be the right answer, and I'll go ahead and apply your patch.

The generated config.h and platform.h for sparc64 are these:
http://people.freebsd.org/~marius/bind96_config.h
http://people.freebsd.org/~marius/bind96_platform.h

Marius
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
> 2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
>>> Oops, sorry, I forgot to revert the previous patch when test-compiling. Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
>> I started named from ports (dns/bind96) at Sat Jul 9 10:08:41 MSD, and it worked properly till Sun Jul 10 22:25:41 MSD. At 22:25:41 I restarted bind from base system with your sparc64_isc_atomic.h.diff2. From this moment till today, 15:57:05 he crashed 3 times:
>> Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
>> Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
>> Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6
>> To make to ensure proper operation of bind from ports, I ran it again at 15:57:05, and, I think, we need to wait several days.
> And from that time till now bind from ports never died and works properly...

Okay. Doug, could you please disable the use of atomic operations for sparc64 in the in-tree BIND via the following patch in order to match what the vendor source does?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

I have no idea why they don't work properly (apart from the fact that there should additionally be memory barriers, at least when they are used for reference counting, just like the alpha version of the ISC atomic operations uses). I can only say that they match what we use in the kernel without problems pretty closely, and that they work as described in the respective comments when tested stand-alone. So my best guess is that the BIND source additionally depends on some x86-specific behavior of the atomic operations, there or in general, but from a glance at the source it's not obvious to me what that could be. Given that the vendor source doesn't even use atomic operations on Solaris/SPARC, I suspect this is a non-trivial problem.

It would probably be a good idea to also disable the use of atomic operations for arm again, just like the vendor source does, as they don't work there either but nobody seems to care (see PR 154306).

Marius
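For readers wondering what a build without the atomic operations falls back to: the thread only speaks of "the alternative locking", so the following is a minimal sketch under the assumption that the fallback is a reference counter protected by an ordinary mutex. The names `demo_refcount`, `demo_refcount_init` and `demo_refcount_decrement` are hypothetical and are not taken from the BIND source.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical lock-based reference counter (illustration only). */
struct demo_refcount {
    int             refs;  /* current number of references */
    pthread_mutex_t lock;  /* serializes every access to refs */
};

/* Initialize the counter with n references; returns 0 on success. */
static int demo_refcount_init(struct demo_refcount *r, int n)
{
    r->refs = n;
    return pthread_mutex_init(&r->lock, NULL);
}

/* Drop one reference and return how many are left. */
static int demo_refcount_decrement(struct demo_refcount *r)
{
    int left;

    pthread_mutex_lock(&r->lock);
    /* Same invariant as the atomic variant: something must be held. */
    assert(r->refs > 0);
    left = --r->refs;
    pthread_mutex_unlock(&r->lock);
    return left;
}
```

The mutex costs more than a single atomic instruction, but lock and unlock already imply the memory ordering that a hand-written atomic operation has to provide explicitly.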
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On 07/14/2011 16:21, Marius Strobl wrote:
> On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
>> 2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
>>>> Oops, sorry, I forgot to revert the previous patch when test-compiling. Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
>>> I started named from ports (dns/bind96) at Sat Jul 9 10:08:41 MSD, and it worked properly till Sun Jul 10 22:25:41 MSD. At 22:25:41 I restarted bind from base system with your sparc64_isc_atomic.h.diff2. From this moment till today, 15:57:05 he crashed 3 times:
>>> Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
>>> Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
>>> Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6
>>> To make to ensure proper operation of bind from ports, I ran it again at 15:57:05, and, I think, we need to wait several days.
>> And from that time till now bind from ports never died and works properly...
> Okay. Doug, could you please disable the use of atomic operations for sparc64 in the in-tree BIND via the following patch in order to match what the vendor source does?
> http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

If you use the port and do 'make configure' are the values in config.h the same as the ones in your patch? If so, that's likely to be the right answer, and I'll go ahead and apply your patch.

Thanks,

Doug
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
>> Oops, sorry, I forgot to revert the previous patch when test-compiling. Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
> I started named from ports (dns/bind96) at Sat Jul 9 10:08:41 MSD, and it worked properly till Sun Jul 10 22:25:41 MSD. At 22:25:41 I restarted bind from base system with your sparc64_isc_atomic.h.diff2. From this moment till today, 15:57:05 he crashed 3 times:
> Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
> Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
> Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6
> To make to ensure proper operation of bind from ports, I ran it again at 15:57:05, and, I think, we need to wait several days.

And from that time until now, bind from ports has never died and works properly...

--
MATPOCKuH
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
2011/7/8 Marius Strobl mar...@alchemy.franken.de:
> In order to have a result which can be compared with the base BIND. Whether bind98 works or works without the ISC atomic operations says nothing about the bind96 port or the base version.

Okay...

> Oops, sorry, I forgot to revert the previous patch when test-compiling. Please re-fetch sparc64_isc_atomic.h.diff2 and try again.

I started named from ports (dns/bind96) on Sat Jul 9 at 10:08:41 MSD, and it worked properly until Sun Jul 10 22:25:41 MSD. At 22:25:41 I restarted bind from the base system with your sparc64_isc_atomic.h.diff2. From that moment until today, 15:57:05, it crashed 3 times:

Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6

To confirm that bind from ports works properly, I started it again at 15:57:05, and I think we need to wait several days.

--
MATPOCKuH
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
2011/7/7 Marius Strobl mar...@alchemy.franken.de:
> That's not the patch I was referring to. I did a second one which just entirely disables the use of atomic operations on sparc64:
> http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

Omg. I'm sorry. I applied this patch and restarted named, but named crashed immediately after start:

08-Jul-2011 15:29:54.631 found 2 CPUs, using 2 worker threads
08-Jul-2011 15:29:54.633 using up to 4096 sockets
Segmentation fault (core dumped)

The core's backtrace:

#0 0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
(gdb) bt
#0 0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
#1 0x40953ccc in __sparc_utrap_install () from /lib/libc.so.7
#2 0x40953f70 in __sparc_utrap_install () from /lib/libc.so.7
#3 0x409537ac in __sparc_utrap_install () from /lib/libc.so.7
#4 0x407c2d54 in pthread_mutex_lock () from /lib/libthr.so.3
#5 0x00228dcc in ?? ()
Previous frame identical to this frame (corrupt stack?)

Could this be a sign of a problem in libthr?

PS. One month ago I also had problems with another multithreaded application from ports (www/oops). oops crashed with this stack backtrace:

#0 0x40d8fc88 in __sparc_utrap_install () from /lib/libc.so.7
#1 0x40d8fdac in __sparc_utrap_install () from /lib/libc.so.7
#2 0x40d90050 in __sparc_utrap_install () from /lib/libc.so.7
#3 0x40d8f88c in __sparc_utrap_install () from /lib/libc.so.7
#4 0x40d64044 in _malloc_thread_cleanup () from /lib/libc.so.7
#5 0x40c039b8 in fork () from /lib/libthr.so.3
#6 0x40c03d38 in fork () from /lib/libthr.so.3
#7 0x40c03f50 in pthread_exit () from /lib/libthr.so.3
#8 0x40c04414 in pthread_detach () from /lib/libthr.so.3
#9 0x40c04710 in pthread_create () from /lib/libthr.so.3

But with yesterday's world build oops works properly. I think it may be related to r223228 (?) Or am I analyzing the stack of multithreaded applications incorrectly?

--
MATPOCKuH
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Fri, Jul 08, 2011 at 03:47:08PM +0400, KOT MATPOCKuH wrote:
> 2011/7/7 Marius Strobl mar...@alchemy.franken.de:
>> That's not the patch I was referring to. I did a second one which just entirely disables the use of atomic operations on sparc64:
>> http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
> Omg. I'm sorry. I applied this patch and restarted named, but named crashed immediatly after start:
> 08-Jul-2011 15:29:54.631 found 2 CPUs, using 2 worker threads
> 08-Jul-2011 15:29:54.633 using up to 4096 sockets
> Segmentation fault (core dumped)
> core's backtrace:
> #0 0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
> (gdb) bt
> #0 0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
> #1 0x40953ccc in __sparc_utrap_install () from /lib/libc.so.7
> #2 0x40953f70 in __sparc_utrap_install () from /lib/libc.so.7
> #3 0x409537ac in __sparc_utrap_install () from /lib/libc.so.7
> #4 0x407c2d54 in pthread_mutex_lock () from /lib/libthr.so.3
> #5 0x00228dcc in ?? ()
> Previous frame identical to this frame (corrupt stack?)
> Could this be a sign to a problem in libthr?

Could be, but IMO that's unlikely; if there were a bug affecting pthread_mutex_lock() there should be more fallout from it. I'm probably missing something about how to properly disable the use of the ISC atomic implementation and enable the alternative locking. Please try the following:

a) Instead of the base BIND use the dns/bind96 port. The native build of the latter defaults to not using the ISC atomic implementation on sparc64 (and arm) and should properly enable the alternative. I can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default configuration on -CURRENT without problems.

b) Revert the above patch and try the base bind with the following (third) patch:
http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
That one adds the memory barriers required for reference counting, albeit in a sledgehammer-like fashion, as the ISC atomic API doesn't allow distinguishing between acquire and release semantics.

> PS. Also one month ago I got a problems with another multithreaded application from ports (www/oops). oops was crashed with stack's backtrace:
> #0 0x40d8fc88 in __sparc_utrap_install () from /lib/libc.so.7
> #1 0x40d8fdac in __sparc_utrap_install () from /lib/libc.so.7
> #2 0x40d90050 in __sparc_utrap_install () from /lib/libc.so.7
> #3 0x40d8f88c in __sparc_utrap_install () from /lib/libc.so.7
> #4 0x40d64044 in _malloc_thread_cleanup () from /lib/libc.so.7
> #5 0x40c039b8 in fork () from /lib/libthr.so.3
> #6 0x40c03d38 in fork () from /lib/libthr.so.3
> #7 0x40c03f50 in pthread_exit () from /lib/libthr.so.3
> #8 0x40c04414 in pthread_detach () from /lib/libthr.so.3
> #9 0x40c04710 in pthread_create () from /lib/libthr.so.3
> But on yesterday's world's build oops works properly. I think it may be related to r223228 (?)

Unlikely, the crash caused by the assertion in _malloc_thread_cleanup() was solved with r223369.

Marius
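To illustrate what "sledgehammer-like" means in option b): when an API cannot say whether an operation is an acquire or a release, the conservative choice is a full barrier on both sides of every atomic operation. The sketch below expresses that idea with portable C11 atomics; it is not the content of sparc64_isc_atomic.h.diff2, which is not shown in this thread, and the function name `demo_xadd_full_barrier` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical "full barrier on both sides" fetch-and-add.
 * Returns the value stored in *p before delta was added.
 */
static int demo_xadd_full_barrier(atomic_int *p, int delta)
{
    int prev;

    /* Earlier loads and stores may not move below this point. */
    atomic_thread_fence(memory_order_seq_cst);
    prev = atomic_fetch_add_explicit(p, delta, memory_order_relaxed);
    /* Later loads and stores may not move above this point. */
    atomic_thread_fence(memory_order_seq_cst);
    return prev;
}
```

This is stronger than reference counting needs, which is the cost of an interface that cannot express a one-sided barrier.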
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
2011/7/8 Marius Strobl mar...@alchemy.franken.de:
> Please try the following:
> a) Instead of the base BIND use the dns/bind96 port. The native build of the latter defaults to not using the ISC atomic implementation on sparc64 (and arm) and should properly enable the alternative. I can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default configuration on -CURRENT without problems.

dns/bind96? Why not bind98? As far as I can see, dns/bind98 configures without atomic swap operations. I will try dns/bind98 first :)

> b) Revert the above patch and try the base bind with the following (third) patch:
> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
> That one adds the memory barriers required for reference counting albeit in a sledgehammer-like fashion as the ISC atomic API doesn't allow to distinguish between acquire and release semantics.

Hmmm... With this patch the build fails:

root@sunrise:/usr/src/lib/bind/dns# make
cc -O2 -pipe -DVERSION='9.6.-ESV-R4-P3' -DHAVE_CONFIG_H -D_REENTRANT -D_THREAD_SAFE -DLIBINTERFACE=59 -DLIBREVISION=5 -DLIBAGE=1 -DOPENSSL -DUSE_MD5 -DWORDS_BIGENDIAN -DNS_LOCALSTATEDIR='/var' -DNS_SYSCONFDIR='/etc/namedb' -DNAMED_CONFFILE='/etc/namedb/named.conf' -DRNDC_CONFFILE='/etc/namedb/rndc.conf' -DRNDC_KEYFILE='/etc/namedb/rndc.key' -I/usr/src/lib/bind/dns/.. -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/bind9/include -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/include/dst -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/include -I/usr/src/lib/bind/dns/../dns -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isccc/include -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isccfg/include -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isc/unix/include -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isc/pthreads/include -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isc/include -I/usr/src/lib/bind/dns/../isc -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/lwres/unix/include -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/lwres/include -I/usr/src/lib/bind/dns/../lwres -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/include/dst -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/include -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns -I/usr/src/lib/bind/dns -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isc/sparc64/include -std=gnu99 -c /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/acache.c
{standard input}: Assembler messages:
{standard input}:13: Error: Illegal operands: invalid membar mask name
{standard input}:2180: Error: Illegal operands: invalid membar mask name
*** Error code 1

> Unlikely, the crash caused by the assertion in _malloc_thread_cleanup() was solved with r223369.

Thank you anyway!

--
MATPOCKuH
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Fri, Jul 08, 2011 at 11:17:20PM +0400, KOT MATPOCKuH wrote:
> 2011/7/8 Marius Strobl mar...@alchemy.franken.de:
>> Please try the following:
>> a) Instead of the base BIND use the dns/bind96 port. The native build of the latter defaults to not using the ISC atomic implementation on sparc64 (and arm) and should properly enable the alternative. I can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default configuration on -CURRENT without problems.
> dns/bind96? Why not bind98?

In order to have a result which can be compared with the base BIND. Whether bind98 works or works without the ISC atomic operations says nothing about the bind96 port or the base version.

> As I see dns/bind98 configures without atomic swap operations. I will try to use dns/bind98 at first :)
>> b) Revert the above patch and try the base bind with the following (third) patch:
>> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
>> That one adds the memory barriers required for reference counting albeit in a sledgehammer-like fashion as the ISC atomic API doesn't allow to distinguish between acquire and release semantics.
> Hmmm... With this patch build fails:

Oops, sorry, I forgot to revert the previous patch when test-compiling. Please re-fetch sparc64_isc_atomic.h.diff2 and try again.

Marius
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
I updated the system to r223824 and got named patched to 9.6.-ESV-R4-P3, but the problem still exists:

07-Jul-2011 13:24:22.765 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622: REQUIRE(prev > 0) failed
07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)

How can I find the root cause of the problem?

--
MATPOCKuH
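As background for the log above, here is a minimal sketch of the kind of check that a `REQUIRE(prev > 0)` on a reference counter expresses. It assumes, as elsewhere in this thread, that the counter is decremented through an atomic fetch-and-add such as isc_atomic_xadd(), which returns the value held before the addition. The names `demo_xadd` and `demo_ref_decrement` are hypothetical, C11 atomics stand in for the platform-specific assembly, and a plain assert() stands in for REQUIRE().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an atomic fetch-and-add: adds delta to *p
 * and returns the value stored there before the addition. */
static int demo_xadd(atomic_int *p, int delta)
{
    return atomic_fetch_add(p, delta);
}

/* Drop one reference; returns the number of references remaining. */
static int demo_ref_decrement(atomic_int *refs)
{
    int prev = demo_xadd(refs, -1);

    /* The counter must have been positive before the decrement.
     * If it was not, an update was lost or a reference was released
     * twice, which is the condition the failing assertion reports. */
    assert(prev > 0);
    return prev - 1;
}
```

If the fetch-and-add is not truly atomic or not properly ordered on a given CPU, two threads can both observe a stale value, and a later decrement then finds the counter already at zero.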
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
> I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3, but problem is still exists:
> 07-Jul-2011 13:24:22.765 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622: REQUIRE(prev > 0) failed
> 07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
> How can I find root cause of the problem?

From your description it's unclear whether you've built BIND with or without sparc64_isc_disable_atomic.diff. If it was built without that patch please give it a try. If you had applied it then this apparently is a generic bug in BIND and unrelated to the MD atomic implementation and I don't know how to proceed in order to get that fixed. Hopefully Doug can help you in that case.

Marius
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
2011/7/7 Marius Strobl mar...@alchemy.franken.de:
> On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
>> I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3, but problem is still exists:
>> 07-Jul-2011 13:24:22.765 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622: REQUIRE(prev > 0) failed
>> 07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
>> How can I find root cause of the problem?
> From your description it's unclear whether you've built BIND with or without sparc64_isc_disable_atomic.diff. If it was built without that patch please give it a try.

As you can see, Doug has already included your patch in head:
http://svnweb.freebsd.org/base/head/contrib/bind9/lib/isc/sparc64/include/isc/atomic.h?r1=222395&r2=223811
And, of course, bind was built with your patch...

> If you had applied it then this apparently is a generic bug in BIND and unrelated to the MD atomic implementation and I don't know how to proceed in order to get that fixed. Hopefully Doug can help you in that case.

Okay, I look forward to guidance from Doug...

--
MATPOCKuH
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Thu, Jul 07, 2011 at 03:44:32PM +0400, KOT MATPOCKuH wrote:
> 2011/7/7 Marius Strobl mar...@alchemy.franken.de:
>> On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
>>> I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3, but problem is still exists:
>>> 07-Jul-2011 13:24:22.765 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622: REQUIRE(prev > 0) failed
>>> 07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
>>> How can I find root cause of the problem?
>> From your description it's unclear whether you've built BIND with or without sparc64_isc_disable_atomic.diff. If it was built without that patch please give it a try.
> As You can see, Doug is already included your patch in head:
> http://svnweb.freebsd.org/base/head/contrib/bind9/lib/isc/sparc64/include/isc/atomic.h?r1=222395&r2=223811
> And, of course, bind builded with your patch...

That's not the patch I was referring to. I did a second one which just entirely disables the use of atomic operations on sparc64:
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

Marius
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Tue, Jul 05, 2011 at 05:55:09PM -0700, Doug Barton wrote:
> On 06/28/2011 08:58, Marius Strobl wrote:
>> Uhm, we once fixed a problem in the MD atomic implementation which still seems to present in the ISC copy. Could you please test whether the following patch makes a difference?
>> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
> I haven't seen any verification from the OP that this patch solved the problem,

It simply doesn't so apparently there's another bug in other parts of BIND causing it to trip over that assertion. Still, the clobber lists of the sparc64 atomic bits were incomplete and fixing that IMO was the right thing to do.

> however it did pass 'make universe' on both 9-current and RELENG_8, so I've committed it to those 2 branches along with the recent update. I'll also submit it upstream.

Thanks!

Marius
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Wed, Jul 06, 2011 at 11:55:15AM +0200, Marius Strobl wrote:
> On Tue, Jul 05, 2011 at 05:55:09PM -0700, Doug Barton wrote:
>> On 06/28/2011 08:58, Marius Strobl wrote:
>>> Uhm, we once fixed a problem in the MD atomic implementation which still seems to present in the ISC copy. Could you please test whether the following patch makes a difference?
>>> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
>> I haven't seen any verification from the OP that this patch solved the problem,
> It simply doesn't so apparently there's another bug in other parts of BIND causing it to trip over that assertion. Still, the clobber lists of the sparc64 atomic bits were incomplete and fixing that IMO was the right thing to do.

MATPOCKuH, could you please test the following patch?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

That one simply disables the use of atomic operations for sparc64, as I doubt that these have seen much testing except on x86, be it on sparc64 or in general. Given that they are also used for reference counting, they should provide acquire and release semantics for that purpose, including the necessary memory barriers, but the ISC atomic API simply doesn't account for that. Moreover, the sparc64 implementation of the ISC atomic operations is FreeBSD-specific, as FreeBSD is the only OS I'm aware of that uses the primary instead of the secondary MMU context for the userland (i.e. ASI_P; generally this is a wise choice though), so it doesn't work on the other *BSDs, Linux or Solaris.

Marius
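The acquire and release semantics mentioned above can be sketched with C11 atomics, which, unlike the ISC atomic API described in this thread, let the caller name the ordering. This is a generic reference-counting pattern given for illustration only; `demo_obj`, `demo_ref` and `demo_unref` are hypothetical names, not BIND code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object (illustration only). */
struct demo_obj {
    atomic_uint refs;     /* number of outstanding references */
    int         payload;  /* data protected by the references */
};

/* Take an additional reference. The caller already holds one, so no
 * ordering beyond atomicity is needed here. */
static void demo_ref(struct demo_obj *o)
{
    atomic_fetch_add_explicit(&o->refs, 1, memory_order_relaxed);
}

/* Drop a reference. Returns true if this was the last one, in which
 * case the caller may destroy the object. */
static bool demo_unref(struct demo_obj *o)
{
    /* Release: this thread's earlier writes to the object become
     * visible before the count is seen to drop. */
    if (atomic_fetch_sub_explicit(&o->refs, 1, memory_order_release) != 1)
        return false;
    /* Acquire: the destroying thread sees all writes made by the
     * threads that released their references earlier. */
    atomic_thread_fence(memory_order_acquire);
    return true;
}
```

An API offering only one unordered add operation cannot express this distinction, which is why a fix inside such an API has to fall back on full barriers or on a lock.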
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On 06/28/2011 08:58, Marius Strobl wrote:
> Uhm, we once fixed a problem in the MD atomic implementation which still seems to present in the ISC copy. Could you please test whether the following patch makes a difference?
> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff

I haven't seen any verification from the OP that this patch solved the problem, however it did pass 'make universe' on both 9-current and RELENG_8, so I've committed it to those 2 branches along with the recent update. I'll also submit it upstream.

Thanks,

Doug
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
2011/6/28 Marius Strobl mar...@alchemy.franken.de:
>> I'm got a problem with named on FreeBSD-CURRENT/sparc64. Up to 5 times a day it crashes with these messages:
>> 27-Jun-2011 03:42:14.384 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614: REQUIRE(prev > 0) failed
>> 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
>> I found a some similar problems on alpha and IA64, which was related to problems with isc_atomic_xadd() function in include/isc/atomic.h. But I don't understand that there may be incorrect for sparc64 and this function was not changed for a minimum 4 years...
> Uhm, we once fixed a problem in the MD atomic implementation which still seems to present in the ISC copy. Could you please test whether the following patch makes a difference?
> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff

Oh, Marius, you are my savior... I am now running named with your patch and watching it. I think this should be sufficient:

    cd /usr/src/lib/bind/dns
    make clean
    make
    cd /usr/src/usr.sbin/named
    make clean
    make
    make install

(and a restart of named)

--
MATPOCKuH
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:
>>> I'm got a problem with named on FreeBSD-CURRENT/sparc64. Up to 5 times a day it crashes with these messages:
>>> 27-Jun-2011 03:42:14.384 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614: REQUIRE(prev > 0) failed
>>> 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
>>> I found a some similar problems on alpha and IA64, which was related to problems with isc_atomic_xadd() function in include/isc/atomic.h. But I don't understand that there may be incorrect for sparc64 and this function was not changed for a minimum 4 years...
>> Uhm, we once fixed a problem in the MD atomic implementation which still seems to present in the ISC copy. Could you please test whether the following patch makes a difference?
>> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
> I ran named with your patch and and watching him.

Omg. Either I rebuilt named incorrectly, or the problem is not solved. I got a crash about 2 hours after named was restarted:

29-Jun-2011 13:51:28.855 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614: REQUIRE(prev > 0) failed
29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)

--
MATPOCKuH
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Wed, Jun 29, 2011 at 02:33:06PM +0400, KOT MATPOCKuH wrote:
> 2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:
>>>> I'm got a problem with named on FreeBSD-CURRENT/sparc64. Up to 5 times a day it crashes with these messages:
>>>> 27-Jun-2011 03:42:14.384 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614: REQUIRE(prev > 0) failed
>>>> 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
>>>> I found a some similar problems on alpha and IA64, which was related to problems with isc_atomic_xadd() function in include/isc/atomic.h. But I don't understand that there may be incorrect for sparc64 and this function was not changed for a minimum 4 years...
>>> Uhm, we once fixed a problem in the MD atomic implementation which still seems to present in the ISC copy. Could you please test whether the following patch makes a difference?
>>> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
>> I ran named with your patch and and watching him.
> Omg. Or I incorrectly rebuilt named, or the problem is not solved. I got a crash after about 2 hours after named restarted:
> 29-Jun-2011 13:51:28.855 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614: REQUIRE(prev > 0) failed
> 29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)

The remainder of the isc atomic.h looks fine though, so this likely is a general bug in BIND, especially if it didn't happen before BIND 9.6.-ESV-R4-P1. Doug should be able to help you.

Doug, could you please nevertheless take care of getting the above patch into BIND? It's a merge of r148453.

Marius
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On 06/29/2011 06:41, Marius Strobl wrote:
> On Wed, Jun 29, 2011 at 02:33:06PM +0400, KOT MATPOCKuH wrote:
>> 2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:
>>>>> I've got a problem with named on FreeBSD-CURRENT/sparc64. Up to 5
>>>>> times a day it crashes with these messages:
>>>>>
>>>>> 27-Jun-2011 03:42:14.384 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614: REQUIRE(prev > 0) failed
>>>>> 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
>>>>>
>>>>> I found some similar problems on alpha and IA64, which were related to
>>>>> problems with the isc_atomic_xadd() function in include/isc/atomic.h.
>>>>> But I don't understand what could be incorrect for sparc64, and this
>>>>> function has not been changed for at least 4 years...
>>>>
>>>> Uhm, we once fixed a problem in the MD atomic implementation which
>>>> still seems to be present in the ISC copy. Could you please test
>>>> whether the following patch makes a difference?
>>>> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
>>>
>>> I ran named with your patch and am watching it.
>>
>> Omg. Either I rebuilt named incorrectly, or the problem is not solved.
>> I got a crash about 2 hours after named was restarted:
>>
>> 29-Jun-2011 13:51:28.855 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614: REQUIRE(prev > 0) failed
>> 29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)
>
> The remainder of the isc atomic.h looks fine though, so this is likely a
> general bug in BIND, especially if it didn't happen before BIND
> 9.6.-ESV-R4-P1. Doug should be able to help you.
>
> Doug, could you please nevertheless take care of getting the above patch
> into BIND? It's a merge of r148453.

Hmm, I thought I had already pushed that rock up the appropriate hill, but
maybe not.

I've been following this thread, but it's incredibly unlikely that I'll be
able to do anything useful with it until Friday.

hth,

Doug

--
Nothin' ever doesn't change, but nothin' changes much.
    -- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
Re: named crashes on assertion in rbtdb.c on sparc64/SMP
On Mon, Jun 27, 2011 at 07:19:33PM +0400, KOT MATPOCKuH wrote:
> Hello!
>
> I've got a problem with named on FreeBSD-CURRENT/sparc64. Up to 5 times a
> day it crashes with these messages:
>
> 27-Jun-2011 03:42:14.384 general: /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614: REQUIRE(prev > 0) failed
> 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
>
> The problem is still present in the latest system BIND:
>
> # named -v
> BIND 9.6.-ESV-R4-P1
>
> This problem exists only on an SMP sparc64 system. On my other sparc64,
> with 1 processor, I do not have this problem.
>
> I found some similar problems on alpha and IA64, which were related to
> problems with the isc_atomic_xadd() function in include/isc/atomic.h. But
> I don't understand what could be incorrect for sparc64, and this function
> has not been changed for at least 4 years...
>
> How can I help solve this problem?

Uhm, we once fixed a problem in the MD atomic implementation which still
seems to be present in the ISC copy. Could you please test whether the
following patch makes a difference?
http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff

Marius
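For readers following the thread, the failing check reads like a
precondition on the value an atomic add returned, namely that the previous
reference count was positive. Below is a minimal sketch of that pattern,
assuming a fetch-and-add that returns the value held before the addition.
It is not the BIND source: the sketch_* names are hypothetical, and C11
<stdatomic.h> stands in for the sparc64-specific code in
include/isc/atomic.h.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for isc_atomic_xadd(): atomically add val to *p
 * and return the value *p held before the addition.
 */
static int32_t
sketch_xadd(_Atomic int32_t *p, int32_t val)
{
	return atomic_fetch_add(p, val);
}

/* Hypothetical reference counter, not BIND's own refcount type. */
struct sketch_refcount {
	_Atomic int32_t refs;
};

/*
 * Take an extra reference. Returns false in the situation where a
 * "previous count must be positive" assertion would fire, i.e. when the
 * object had no live references left at the time of the increment.
 */
static bool
sketch_ref_increment(struct sketch_refcount *rc)
{
	int32_t prev = sketch_xadd(&rc->refs, 1);

	return prev > 0;
}

/* Drop a reference. Returns true when the last reference was released. */
static bool
sketch_ref_decrement(struct sketch_refcount *rc)
{
	int32_t prev = sketch_xadd(&rc->refs, -1);

	return prev == 1;
}
```

If the add is not truly atomic across CPUs, one increment can be lost, a
later decrement then drives the counter to zero too early, and the next
increment sees a previous value of 0. That would be one way for such a
check to fail on SMP hardware only, though the thread does not confirm
this is the actual cause.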