Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-17 Thread Marius Strobl
On Sat, Jul 16, 2011 at 10:42:22PM -0700, Doug Barton wrote:
 On 07/15/2011 01:40, Marius Strobl wrote:
 
  The generated config.h and platform.h for sparc64 are these:
  http://people.freebsd.org/~marius/bind96_config.h
  http://people.freebsd.org/~marius/bind96_platform.h
 
 Marius,
 
 Thanks again for all your help on this. During the work to upgrade to
 BIND 9.8 in HEAD I first tried your patch but I got some odd errors on
 some of the non-mainstream archs, so I ultimately went with something
 similar to what you sent but much more conservative.
 

Thanks!

Marius

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-16 Thread Doug Barton
On 07/15/2011 01:40, Marius Strobl wrote:

 The generated config.h and platform.h for sparc64 are these:
 http://people.freebsd.org/~marius/bind96_config.h
 http://people.freebsd.org/~marius/bind96_platform.h

Marius,

Thanks again for all your help on this. During the work to upgrade to
BIND 9.8 in HEAD I first tried your patch but I got some odd errors on
some of the non-mainstream archs, so I ultimately went with something
similar to what you sent but much more conservative.


Doug

-- 

Nothin' ever doesn't change, but nothin' changes much.
-- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-15 Thread Marius Strobl
On Thu, Jul 14, 2011 at 05:31:49PM -0700, Doug Barton wrote:
 On 07/14/2011 16:21, Marius Strobl wrote:
  On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
  2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
  Oops, sorry, I forgot to revert the previous patch when test-compiling.
  Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
  I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
  and it worked properly till Sun Jul 10 22:25:41 MSD.
  At 22:25:41 I restarted bind from base system with your
  sparc64_isc_atomic.h.diff2.
  From this moment till today, 15:57:05 he crashed 3 times:
  Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on 
  signal 6
  Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on 
  signal 6
  Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on 
  signal 6
 
  To make to ensure proper operation of bind from ports, I ran it again
  at 15:57:05, and, I think, we need to wait several days.
  And from that time till now bind from ports never died and works 
  properly...
 
  
  Okay.
  Doug, could you please disable the use of atomic operations for sparc64
  in the in-tree BIND via the following patch in order to match what the
  vendor source does?
  http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
 
 If you use the port and do 'make configure' are the values in config.h
 the same as the ones in your patch?  If so, that's likely to be the
 right answer, and I'll go ahead and apply your patch.
 

The generated config.h and platform.h for sparc64 are these:
http://people.freebsd.org/~marius/bind96_config.h
http://people.freebsd.org/~marius/bind96_platform.h

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-14 Thread Marius Strobl
On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
 2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
  Oops, sorry, I forgot to revert the previous patch when test-compiling.
  Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
  I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
  and it worked properly till Sun Jul 10 22:25:41 MSD.
  At 22:25:41 I restarted bind from base system with your
  sparc64_isc_atomic.h.diff2.
  From this moment till today, 15:57:05 he crashed 3 times:
  Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 
  6
  Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 
  6
  Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 
  6
 
  To make to ensure proper operation of bind from ports, I ran it again
  at 15:57:05, and, I think, we need to wait several days.
 And from that time till now bind from ports never died and works properly...
 

Okay.
Doug, could you please disable the use of atomic operations for sparc64
in the in-tree BIND via the following patch in order to match what the
vendor source does?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
I've no idea why they don't work properly (apart from the fact that there
additionally should be memory barriers, at least when used for reference
counting, just like the alpha version of the ISC atomic operations uses).
I can just say that they match what we use in the kernel without problems
pretty closely and that they work as described in the respective comments
when testing them stand-alone. So my best guess is that the BIND source
additionally depends on some x86-specific behavior of the atomic operations
there or in general, but from a glance at the source it's not obvious to me
what that could be. Given that the vendor source doesn't even use atomic
operations on Solaris/SPARC I suspect this is a non-trivial problem.
It probably would be a good idea to also disable the use of atomic
operations for arm again just like the vendor source does as they don't
work there either but nobody seems to care (see PR 154306).
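
For readers unfamiliar with what "disabling the atomic operations" leaves behind: the usual fallback is to guard the counter with a lock. The following is only a minimal sketch of that idea using POSIX mutexes. The type and function names (locked_counter_t, locked_counter_init, locked_counter_xadd) are hypothetical and are not taken from the BIND source or from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical type, not BIND's: a counter guarded by a mutex. */
typedef struct {
	pthread_mutex_t lock;
	int32_t         value;
} locked_counter_t;

static void
locked_counter_init(locked_counter_t *c, int32_t initial)
{
	int rc = pthread_mutex_init(&c->lock, NULL);

	assert(rc == 0);
	(void)rc;
	c->value = initial;
}

/*
 * Add delta and return the previous value, i.e. the same contract as a
 * fetch-and-add, but serialized by the mutex. Locking and unlocking the
 * mutex also orders the memory accesses around the update.
 */
static int32_t
locked_counter_xadd(locked_counter_t *c, int32_t delta)
{
	int32_t prev;

	pthread_mutex_lock(&c->lock);
	prev = c->value;
	c->value += delta;
	pthread_mutex_unlock(&c->lock);
	return (prev);
}
```

The lock makes every update slower than a single atomic instruction, but it does not depend on any hand-written, machine-dependent assembly being correct.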

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-14 Thread Doug Barton
On 07/14/2011 16:21, Marius Strobl wrote:
 On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
 2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
 Oops, sorry, I forgot to revert the previous patch when test-compiling.
 Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
 I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
 and it worked properly till Sun Jul 10 22:25:41 MSD.
 At 22:25:41 I restarted bind from base system with your
 sparc64_isc_atomic.h.diff2.
 From this moment till today, 15:57:05 he crashed 3 times:
 Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 
 6
 Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 
 6
 Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 
 6

 To make to ensure proper operation of bind from ports, I ran it again
 at 15:57:05, and, I think, we need to wait several days.
 And from that time till now bind from ports never died and works properly...

 
 Okay.
 Doug, could you please disable the use of atomic operations for sparc64
 in the in-tree BIND via the following patch in order to match what the
 vendor source does?
 http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

If you use the port and do 'make configure' are the values in config.h
the same as the ones in your patch?  If so, that's likely to be the
right answer, and I'll go ahead and apply your patch.


Thanks,

Doug



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-13 Thread KOT MATPOCKuH
2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
 Oops, sorry, I forgot to revert the previous patch when test-compiling.
 Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
 I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
 and it worked properly till Sun Jul 10 22:25:41 MSD.
 At 22:25:41 I restarted bind from base system with your
 sparc64_isc_atomic.h.diff2.
 From this moment till today, 15:57:05 he crashed 3 times:
 Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
 Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
 Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6

 To make to ensure proper operation of bind from ports, I ran it again
 at 15:57:05, and, I think, we need to wait several days.
And from that time till now bind from ports never died and works properly...

-- 
MATPOCKuH


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-11 Thread KOT MATPOCKuH
2011/7/8 Marius Strobl mar...@alchemy.franken.de:

 In order to have a result which can be compared with the base BIND.
 Whether bind98 works or works without the ISC atomic operations says
 nothing about the bind96 port or the base version.
Okey...

 Oops, sorry, I forgot to revert the previous patch when test-compiling.
 Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
I started named from ports (dns/bind96) on Sat Jul  9 10:08:41 MSD,
and it worked properly until Sun Jul 10 22:25:41 MSD.
At 22:25:41 I restarted bind from the base system with your
sparc64_isc_atomic.h.diff2.
From that moment until today, 15:57:05, it crashed 3 times:
Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6

To make sure that bind from ports really works properly, I started it
again at 15:57:05, and I think we need to wait several days.

-- 
MATPOCKuH


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-08 Thread KOT MATPOCKuH
2011/7/7 Marius Strobl mar...@alchemy.franken.de:
 That's not the patch I was referring to. I did a second one which just
 entirely disables the use of atomic operations on sparc64:
 http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
Omg. I'm sorry.
I applied this patch and restarted named, but named crashed immediately
after start:
08-Jul-2011 15:29:54.631 found 2 CPUs, using 2 worker threads
08-Jul-2011 15:29:54.633 using up to 4096 sockets
Segmentation fault (core dumped)

core's backtrace:
#0  0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
(gdb) bt
#0  0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
#1  0x40953ccc in __sparc_utrap_install () from /lib/libc.so.7
#2  0x40953f70 in __sparc_utrap_install () from /lib/libc.so.7
#3  0x409537ac in __sparc_utrap_install () from /lib/libc.so.7
#4  0x407c2d54 in pthread_mutex_lock () from /lib/libthr.so.3
#5  0x00228dcc in ?? ()
Previous frame identical to this frame (corrupt stack?)

Could this be a sign of a problem in libthr?

PS.
Also, one month ago I had problems with another multithreaded
application from ports (www/oops). oops crashed with this stack
backtrace:
#0  0x40d8fc88 in __sparc_utrap_install () from /lib/libc.so.7
#1  0x40d8fdac in __sparc_utrap_install () from /lib/libc.so.7
#2  0x40d90050 in __sparc_utrap_install () from /lib/libc.so.7
#3  0x40d8f88c in __sparc_utrap_install () from /lib/libc.so.7
#4  0x40d64044 in _malloc_thread_cleanup () from /lib/libc.so.7
#5  0x40c039b8 in fork () from /lib/libthr.so.3
#6  0x40c03d38 in fork () from /lib/libthr.so.3
#7  0x40c03f50 in pthread_exit () from /lib/libthr.so.3
#8  0x40c04414 in pthread_detach () from /lib/libthr.so.3
#9  0x40c04710 in pthread_create () from /lib/libthr.so.3

But with yesterday's world build oops works properly. I think it may
be related to r223228 (?)
Or am I analyzing the stack incorrectly for multithreaded applications?

-- 
MATPOCKuH


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-08 Thread Marius Strobl
On Fri, Jul 08, 2011 at 03:47:08PM +0400, KOT MATPOCKuH wrote:
 2011/7/7 Marius Strobl mar...@alchemy.franken.de:
  That's not the patch I was referring to. I did a second one which just
  entirely disables the use of atomic operations on sparc64:
  http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
 Omg. I'm sorry.
 I applied this patch and restarted named, but named crashed immediatly
 after start:
 08-Jul-2011 15:29:54.631 found 2 CPUs, using 2 worker threads
 08-Jul-2011 15:29:54.633 using up to 4096 sockets
 Segmentation fault (core dumped)
 
 core's backtrace:
 #0  0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
 (gdb) bt
 #0  0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
 #1  0x40953ccc in __sparc_utrap_install () from /lib/libc.so.7
 #2  0x40953f70 in __sparc_utrap_install () from /lib/libc.so.7
 #3  0x409537ac in __sparc_utrap_install () from /lib/libc.so.7
 #4  0x407c2d54 in pthread_mutex_lock () from /lib/libthr.so.3
 #5  0x00228dcc in ?? ()
 Previous frame identical to this frame (corrupt stack?)
 
 Could this be a sign to a problem in libthr?

Could be, but IMO that's unlikely; if there were a bug affecting
pthread_mutex_lock() there should be more fallout from that. I'm probably
missing something about how to properly disable the use of the ISC atomic
implementation and to enable the alternative locking.
Please try the following:
a) Instead of the base BIND use the dns/bind96 port. The native build
   of the latter defaults to not using the ISC atomic implementation
   on sparc64 (and arm) and should properly enable the alternative. I
   can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default
   configuration on -CURRENT without problems.
b) Revert the above patch and try the base bind with the following
   (third) patch:
   http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
   That one adds the memory barriers required for reference counting
   albeit in a sledgehammer-like fashion as the ISC atomic API doesn't
   allow to distinguish between acquire and release semantics.
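
As an illustration of the "sledgehammer" approach in b), here is a minimal sketch in portable C11 rather than sparc64 assembly. It is not the content of sparc64_isc_atomic.h.diff2; the function name xadd_full_barrier is hypothetical. Because the caller cannot say whether it wants acquire or release ordering, the sketch simply puts a full fence on both sides of the update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical wrapper, not the ISC API: add delta to *p and return the
 * previous value, with a full (sequentially consistent) fence before and
 * after the update. The update itself is relaxed; all ordering comes
 * from the two fences.
 */
static int32_t
xadd_full_barrier(_Atomic int32_t *p, int32_t delta)
{
	int32_t prev;

	assert(p != NULL);
	atomic_thread_fence(memory_order_seq_cst);
	prev = atomic_fetch_add_explicit(p, delta, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return (prev);
}
```

This orders more than a reference count strictly needs, which is the price of an interface that has a single add operation for every use.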

 
 PS.
 Also one month ago I got a problems with another multithreaded
 application from ports (www/oops). oops was crashed with stack's
 backtrace:
 #0  0x40d8fc88 in __sparc_utrap_install () from /lib/libc.so.7
 #1  0x40d8fdac in __sparc_utrap_install () from /lib/libc.so.7
 #2  0x40d90050 in __sparc_utrap_install () from /lib/libc.so.7
 #3  0x40d8f88c in __sparc_utrap_install () from /lib/libc.so.7
 #4  0x40d64044 in _malloc_thread_cleanup () from /lib/libc.so.7
 #5  0x40c039b8 in fork () from /lib/libthr.so.3
 #6  0x40c03d38 in fork () from /lib/libthr.so.3
 #7  0x40c03f50 in pthread_exit () from /lib/libthr.so.3
 #8  0x40c04414 in pthread_detach () from /lib/libthr.so.3
 #9  0x40c04710 in pthread_create () from /lib/libthr.so.3
 
 But on yesterday's world's build oops works properly. I think it may
 be related to r223228 (?)

Unlikely, the crash caused by the assertion in _malloc_thread_cleanup()
was solved with r223369.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-08 Thread KOT MATPOCKuH
2011/7/8 Marius Strobl mar...@alchemy.franken.de:

 Please try the following:
 a) Instead of the base BIND use the dns/bind96 port. The native build
   of the latter defaults to not using the ISC atomic implementation
   on sparc64 (and arm) and should properly enable the alternative. I
   can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default
   configuration on -CURRENT without problems.
dns/bind96? Why not bind98?
As far as I can see, dns/bind98 configures without atomic swap operations.
I will try dns/bind98 first :)

 b) Revert the above patch and try the base bind with the following
   (third) patch:
   http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
   That one adds the memory barriers required for reference counting
   albeit in a sledgehammer-like fashion as the ISC atomic API doesn't
   allow to distinguish between acquire and release semantics.

Hmmm... With this patch the build fails:
root@sunrise:/usr/src/lib/bind/dns# make
cc -O2 -pipe  -DVERSION='9.6.-ESV-R4-P3' -DHAVE_CONFIG_H
-D_REENTRANT -D_THREAD_SAFE -DLIBINTERFACE=59 -DLIBREVISION=5
-DLIBAGE=1 -DOPENSSL -DUSE_MD5 -DWORDS_BIGENDIAN
-DNS_LOCALSTATEDIR='/var' -DNS_SYSCONFDIR='/etc/namedb'
-DNAMED_CONFFILE='/etc/namedb/named.conf'
-DRNDC_CONFFILE='/etc/namedb/rndc.conf'
-DRNDC_KEYFILE='/etc/namedb/rndc.key' -I/usr/src/lib/bind/dns/..
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/bind9/include
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/include/dst
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/include
-I/usr/src/lib/bind/dns/../dns
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isccc/include
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isccfg/include
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isc/unix/include
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isc/pthreads/include
 -I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isc/include
-I/usr/src/lib/bind/dns/../isc
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/lwres/unix/include
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/lwres/include
-I/usr/src/lib/bind/dns/../lwres
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/include/dst
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/include
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns
-I/usr/src/lib/bind/dns
-I/usr/src/lib/bind/dns/../../../contrib/bind9/lib/isc/sparc64/include
-std=gnu99  -c /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/acache.c
{standard input}: Assembler messages:
{standard input}:13: Error: Illegal operands: invalid membar mask name
{standard input}:2180: Error: Illegal operands: invalid membar mask name
*** Error code 1

 Unlikely, the crash caused by the assertion in _malloc_thread_cleanup()
 was solved with r223369.
Thank you anyway!

-- 
MATPOCKuH


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-08 Thread Marius Strobl
On Fri, Jul 08, 2011 at 11:17:20PM +0400, KOT MATPOCKuH wrote:
 2011/7/8 Marius Strobl mar...@alchemy.franken.de:
 
  Please try the following:
  a) Instead of the base BIND use the dns/bind96 port. The native build
     of the latter defaults to not using the ISC atomic implementation
     on sparc64 (and arm) and should properly enable the alternative. I
     can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default
     configuration on -CURRENT without problems.
 dns/bind96? Why not bind98?

In order to have a result which can be compared with the base BIND.
Whether bind98 works or works without the ISC atomic operations says
nothing about the bind96 port or the base version.

 As I see dns/bind98 configures without atomic swap operations.
 I will try to use dns/bind98 at first :)
 
  b) Revert the above patch and try the base bind with the following
     (third) patch:
     http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
     That one adds the memory barriers required for reference counting
     albeit in a sledgehammer-like fashion as the ISC atomic API doesn't
     allow to distinguish between acquire and release semantics.
 
 Hmmm... With this patch build fails:

Oops, sorry, I forgot to revert the previous patch when test-compiling.
Please re-fetch sparc64_isc_atomic.h.diff2 and try again.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-07 Thread KOT MATPOCKuH
I updated the system to r223824 and got named patched to 9.6.-ESV-R4-P3,
but the problem still exists:
07-Jul-2011 13:24:22.765 general:
/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622:
REQUIRE(prev > 0) failed
07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)

How can I find the root cause of the problem?
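
For context on what an assertion of this shape checks, here is a minimal sketch of a reference-count decrement that validates the value it saw before the update. It assumes, from the name in the log message only, that prev holds the counter value before the decrement. The names refcount_t and refcount_decrement are hypothetical, and plain assert() stands in for REQUIRE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical type for illustration, not the BIND structure. */
typedef struct {
	_Atomic int32_t refs;
} refcount_t;

/* Drop one reference and return the number of references left. */
static int32_t
refcount_decrement(refcount_t *rc)
{
	/* atomic_fetch_sub returns the value held before the subtraction. */
	int32_t prev = atomic_fetch_sub(&rc->refs, 1);

	/* Counterpart of REQUIRE(prev > 0): there must have been a reference. */
	assert(prev > 0);
	return (prev - 1);
}
```

Such a check fails when the counter was already zero or negative at the time of the decrement, for example because a reference was dropped more often than it was taken or because an earlier update to the counter was lost.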

-- 
MATPOCKuH


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-07 Thread Marius Strobl
On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
 I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3,
 but problem is still exists:
 07-Jul-2011 13:24:22.765 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622:
 REQUIRE(prev > 0) failed
 07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
 
 How can I find root cause of the problem?
 

From your description it's unclear whether you've built BIND with or
without sparc64_isc_disable_atomic.diff. If it was built without that
patch please give it a try. If you had applied it then this apparently
is a generic bug in BIND and unrelated to the MD atomic implementation
and I don't know how to proceed in order to get that fixed. Hopefully
Doug can help you in that case.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-07 Thread KOT MATPOCKuH
2011/7/7 Marius Strobl mar...@alchemy.franken.de:
 On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
 I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3,
 but problem is still exists:
 07-Jul-2011 13:24:22.765 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622:
 REQUIRE(prev > 0) failed
 07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)

 How can I find root cause of the problem?
 From your description it's unclear whether you've built BIND with or
 without sparc64_isc_disable_atomic.diff. If it was built without that
 patch please give it a try.
As you can see, Doug has already included your patch in head:
http://svnweb.freebsd.org/base/head/contrib/bind9/lib/isc/sparc64/include/isc/atomic.h?r1=222395&r2=223811
And, of course, bind was built with your patch...

 If you had applied it then this apparently
 is a generic bug in BIND and unrelated to the MD atomic implementation
 and I don't know how to proceed in order to get that fixed. Hopefully
 Doug can help you in that case.
Okay, I look forward to guidance from Doug...

-- 
MATPOCKuH


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-07 Thread Marius Strobl
On Thu, Jul 07, 2011 at 03:44:32PM +0400, KOT MATPOCKuH wrote:
 2011/7/7 Marius Strobl mar...@alchemy.franken.de:
  On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
  I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3,
  but problem is still exists:
  07-Jul-2011 13:24:22.765 general:
  /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622:
  REQUIRE(prev > 0) failed
  07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
 
  How can I find root cause of the problem?
  From your description it's unclear whether you've built BIND with or
  without sparc64_isc_disable_atomic.diff. If it was built without that
  patch please give it a try.
 As You can see, Doug is already included your patch in head:
 http://svnweb.freebsd.org/base/head/contrib/bind9/lib/isc/sparc64/include/isc/atomic.h?r1=222395&r2=223811
 And, of course, bind builded with your patch...
 

That's not the patch I was referring to. I did a second one which just
entirely disables the use of atomic operations on sparc64:
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-06 Thread Marius Strobl
On Tue, Jul 05, 2011 at 05:55:09PM -0700, Doug Barton wrote:
 On 06/28/2011 08:58, Marius Strobl wrote:
 Uhm, we once fixed a problem in the MD atomic implementation which
 still seems to present in the ISC copy. Could you please test whether
 the following patch makes a difference?
 http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
 
 I haven't seen any verification from the OP that this patch solved the 
 problem,

It simply doesn't so apparently there's another bug in other parts of
BIND causing it to trip over that assertion. Still, the clobber lists
of the sparc64 atomic bits were incomplete and fixing that IMO was the
right thing to do.
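
As background on clobber lists: in GCC-style inline assembly the clobber list tells the compiler what the assembly modifies besides its declared outputs. If "memory" is missing from an operation that touches memory, the compiler may keep values cached in registers across it. The thread does not say which clobbers were missing from the sparc64 code, so the following is only a generic, architecture-independent sketch; compiler_barrier, shared_flag and publish_and_read are hypothetical names, and the syntax is a GCC/Clang extension.

```c
#include <assert.h>

/*
 * GCC/Clang extension: an empty asm statement whose clobber list names
 * "memory". It emits no instructions, but the compiler must assume that
 * any memory may have changed and so cannot carry cached values across it.
 */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

static int shared_flag;	/* hypothetical shared variable */

static int
publish_and_read(int v)
{
	assert(v != 0);
	shared_flag = v;
	compiler_barrier();	/* the store above must be emitted before this */
	return (shared_flag);	/* reloaded from memory, not forwarded from v */
}
```

Note that this only constrains the compiler. Ordering between CPUs additionally needs hardware memory barriers.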

 however it did pass 'make universe' on both 9-current and 
 RELENG_8, so I've committed it to those 2 branches along with the recent 
 update. I'll also submit it upstream.
 

Thanks!
Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-06 Thread Marius Strobl
On Wed, Jul 06, 2011 at 11:55:15AM +0200, Marius Strobl wrote:
 On Tue, Jul 05, 2011 at 05:55:09PM -0700, Doug Barton wrote:
  On 06/28/2011 08:58, Marius Strobl wrote:
  Uhm, we once fixed a problem in the MD atomic implementation which
  still seems to present in the ISC copy. Could you please test whether
  the following patch makes a difference?
  http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
  
  I haven't seen any verification from the OP that this patch solved the 
  problem,
 
 It simply doesn't so apparently there's another bug in other parts of
 BIND causing it to trip over that assertion. Still, the clobber lists
 of the sparc64 atomic bits were incomplete and fixing that IMO was the
 right thing to do.
 

MATPOCKuH, could you please test the following patch?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
That one simply disables the use of atomic operations for sparc64, as
I doubt that these have seen much testing except on x86, be it on
sparc64 or in general. Given that they are also used for reference
counting, they should provide acquire and release semantics for that
purpose, which include the necessary memory barriers, but the
ISC atomic API simply doesn't account for that. Moreover, the sparc64
implementation of the ISC atomic operations is FreeBSD-specific, as FreeBSD
is the only OS I'm aware of that uses the primary instead of the secondary
MMU context for userland (i.e. ASI_P; generally this is a wise choice
though), so they don't work on the other *BSDs, Linux or Solaris.
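
To make the acquire/release point concrete, here is a minimal sketch of a reference count written with C11 atomics, which (unlike an API with a single add operation) let each call site name the ordering it needs. This is a common textbook pattern and not code from BIND or FreeBSD; obj_ref_t, obj_ref_acquire and obj_ref_release are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for illustration. */
typedef struct {
	_Atomic int32_t refs;
} obj_ref_t;

/* Take an additional reference; the caller must already hold one. */
static void
obj_ref_acquire(obj_ref_t *r)
{
	/* The increment itself needs no ordering with other accesses. */
	int32_t prev = atomic_fetch_add_explicit(&r->refs, 1,
	    memory_order_relaxed);

	assert(prev > 0);
	(void)prev;
}

/* Drop a reference; returns true if it was the last one. */
static bool
obj_ref_release(obj_ref_t *r)
{
	/* Release: our earlier writes to the object happen before the drop. */
	int32_t prev = atomic_fetch_sub_explicit(&r->refs, 1,
	    memory_order_release);

	assert(prev > 0);
	if (prev == 1) {
		/* Acquire: see every other thread's writes before freeing. */
		atomic_thread_fence(memory_order_acquire);
		return (true);
	}
	return (false);
}
```

Without the release and acquire steps, the thread that frees the object could do so before another thread's last writes to it are visible, even though the counter arithmetic itself is atomic.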

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-05 Thread Doug Barton

On 06/28/2011 08:58, Marius Strobl wrote:

Uhm, we once fixed a problem in the MD atomic implementation which
still seems to present in the ISC copy. Could you please test whether
the following patch makes a difference?
http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff


I haven't seen any verification from the OP that this patch solved the 
problem, however it did pass 'make universe' on both 9-current and 
RELENG_8, so I've committed it to those 2 branches along with the recent 
update. I'll also submit it upstream.



Thanks,

Doug



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread KOT MATPOCKuH
2011/6/28 Marius Strobl mar...@alchemy.franken.de:

 I'm got a problem with named on FreeBSD-CURRENT/sparc64.
 Up to 5 times a day it crashes with these messages:
 27-Jun-2011 03:42:14.384 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)

 I found a some similar problems on alpha and IA64, which was related
 to problems with isc_atomic_xadd() function in include/isc/atomic.h.
 But I don't understand that there may be incorrect for sparc64 and
 this function was not changed for a minimum 4 years...
 Uhm, we once fixed a problem in the MD atomic implementation which
 still seems to present in the ISC copy. Could you please test whether
 the following patch makes a difference?
 http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
Oh, Marius, you are my savior...
I am running named with your patch and watching it.

I think this should be sufficient:
cd /usr/src/lib/bind/dns
make clean
make
cd /usr/src/usr.sbin/named
make clean
make
make install
(and named's restart)

-- 
MATPOCKuH


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread KOT MATPOCKuH
2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:
 I'm got a problem with named on FreeBSD-CURRENT/sparc64.
 Up to 5 times a day it crashes with these messages:
 27-Jun-2011 03:42:14.384 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)

 I found a some similar problems on alpha and IA64, which was related
 to problems with isc_atomic_xadd() function in include/isc/atomic.h.
 But I don't understand that there may be incorrect for sparc64 and
 this function was not changed for a minimum 4 years...
 Uhm, we once fixed a problem in the MD atomic implementation which
 still seems to present in the ISC copy. Could you please test whether
 the following patch makes a difference?
 http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff

 I ran named with your patch and and watching him.
Omg.
Either I rebuilt named incorrectly, or the problem is not solved.
I got a crash about 2 hours after named was restarted:
29-Jun-2011 13:51:28.855 general:
/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
REQUIRE(prev > 0) failed
29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)

-- 
MATPOCKuH


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread Marius Strobl
On Wed, Jun 29, 2011 at 02:33:06PM +0400, KOT MATPOCKuH wrote:
 2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:
  I have a problem with named on FreeBSD-CURRENT/sparc64.
  Up to 5 times a day it crashes with these messages:
  27-Jun-2011 03:42:14.384 general:
  /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
  REQUIRE(prev > 0) failed
  27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
 
  I found some similar problems on alpha and IA64, which were related
  to problems with the isc_atomic_xadd() function in include/isc/atomic.h.
  But I don't understand what could be incorrect for sparc64, and
  this function has not been changed for at least 4 years...
  Uhm, we once fixed a problem in the MD atomic implementation which
  still seems to be present in the ISC copy. Could you please test whether
  the following patch makes a difference?
  http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
 
  I am running named with your patch and watching it.
 Omg.
 Either I rebuilt named incorrectly, or the problem is not solved.
 I got a crash about 2 hours after named was restarted:
 29-Jun-2011 13:51:28.855 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)
 

The remainder of the isc atomic.h looks fine though, so this likely
is a general bug in BIND, especially if it didn't happen before
BIND 9.6.-ESV-R4-P1. Doug should be able to help you.
Doug, could you please nevertheless take care of getting the above
patch into BIND? It's a merge of r148453.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread Doug Barton

On 06/29/2011 06:41, Marius Strobl wrote:

On Wed, Jun 29, 2011 at 02:33:06PM +0400, KOT MATPOCKuH wrote:

2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:

I have a problem with named on FreeBSD-CURRENT/sparc64.
Up to 5 times a day it crashes with these messages:
27-Jun-2011 03:42:14.384 general:
/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
REQUIRE(prev > 0) failed
27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)



I found some similar problems on alpha and IA64, which were related
to problems with the isc_atomic_xadd() function in include/isc/atomic.h.
But I don't understand what could be incorrect for sparc64, and
this function has not been changed for at least 4 years...

Uhm, we once fixed a problem in the MD atomic implementation which
still seems to be present in the ISC copy. Could you please test whether
the following patch makes a difference?
http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff



I am running named with your patch and watching it.

Omg.
Either I rebuilt named incorrectly, or the problem is not solved.
I got a crash about 2 hours after named was restarted:
29-Jun-2011 13:51:28.855 general:
/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
REQUIRE(prev > 0) failed
29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)



The remainder of the isc atomic.h looks fine though, so this likely
is a general bug in BIND, especially if it didn't happen before
BIND 9.6.-ESV-R4-P1. Doug should be able to help you.
Doug, could you please nevertheless take care of getting the above
patch into BIND? It's a merge of r148453.


Hmm, I thought I had already pushed that rock up the appropriate hill, 
but maybe not. I've been following this thread, but it's incredibly 
unlikely that I'll be able to do anything useful with it until Friday.



hth,

Doug



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-28 Thread Marius Strobl
On Mon, Jun 27, 2011 at 07:19:33PM +0400, KOT MATPOCKuH wrote:
 Hello!
 
 I have a problem with named on FreeBSD-CURRENT/sparc64.
 Up to 5 times a day it crashes with these messages:
 27-Jun-2011 03:42:14.384 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
 
 The problem is still present in the latest BIND in the base system:
 # named -v
 BIND 9.6.-ESV-R4-P1
 
 This problem exists only on an SMP sparc64 system. On my other sparc64
 machine, with 1 processor, I do not have this problem.
 
 I found some similar problems on alpha and IA64, which were related
 to problems with the isc_atomic_xadd() function in include/isc/atomic.h.
 But I don't understand what could be incorrect for sparc64, and
 this function has not been changed for at least 4 years...
 
 How can I help solve this problem?
 

Uhm, we once fixed a problem in the MD atomic implementation which
still seems to be present in the ISC copy. Could you please test whether
the following patch makes a difference?
http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
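
To illustrate why a flaw in an atomic add primitive could surface as
REQUIRE(prev > 0), here is a minimal sketch in C. It assumes the assertion
guards a reference-count decrement built on a fetch-and-add that returns the
previous value, as isc_atomic_xadd() is expected to do. The names sketch_xadd,
lost_update_demo and atomic_demo are hypothetical, and the sketch uses C11
atomics rather than the sparc64 assembly in the ISC header. It does not
reproduce the patch above or the specific defect fixed in r148453; it only
shows, deterministically and without threads, how a lost increment on SMP
could later make a decrement observe prev == 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for isc_atomic_xadd(): atomically add val to *p
 * and return the value *p held before the addition. */
static int32_t sketch_xadd(_Atomic int32_t *p, int32_t val) {
    return atomic_fetch_add(p, val);
}

/* Deliberately broken add, split into a separate load and store so that
 * two "CPUs" can be interleaved by hand. */
static int32_t broken_load(const int32_t *p) { return *p; }
static void broken_store(int32_t *p, int32_t seen, int32_t val) {
    *p = seen + val;
}

/* Two acquires race and one increment is lost; three holders then release.
 * Returns the previous value seen by the last release. */
int32_t lost_update_demo(void) {
    int32_t refs = 1;                 /* one existing holder */
    int32_t a = broken_load(&refs);   /* CPU A reads 1 */
    int32_t b = broken_load(&refs);   /* CPU B reads 1 */
    broken_store(&refs, a, 1);        /* CPU A writes 2 */
    broken_store(&refs, b, 1);        /* CPU B writes 2: A's update is lost */

    int32_t prev = 0;
    for (int i = 0; i < 3; i++) {     /* three holders release in turn */
        prev = refs;
        refs = prev - 1;
    }
    return prev;                      /* the value a "prev > 0" check sees */
}

/* Same sequence with a real atomic fetch-and-add: nothing is lost. */
int32_t atomic_demo(void) {
    _Atomic int32_t refs = 1;
    sketch_xadd(&refs, 1);            /* first acquire */
    sketch_xadd(&refs, 1);            /* second acquire */

    int32_t prev = 0;
    for (int i = 0; i < 3; i++)
        prev = sketch_xadd(&refs, -1);
    return prev;
}
```

lost_update_demo() returns 0, the value that would trip a prev > 0 check,
while atomic_demo() returns 1. A race like this needs two processors
updating the counter at the same moment, which would be consistent with the
crash showing up only on the SMP machine.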

Marius
