We're seeing crashes in PowerDNS 2.9.22 when calling "pdns_control rediscover". We have a cron job that does this, currently twice an hour, and on average about once a day it results in a crash that looks like this in /var/log/messages:

Aug 11 12:10:42 ns1 pdns[1980]: Got a signal 6, attempting to print trace:
Aug 11 12:10:42 ns1 pdns[1980]: /usr/sbin//pdns_server-instance [0x80cb5e4]
Aug 11 12:10:42 ns1 pdns[1980]: [0x110420]
Aug 11 12:10:42 ns1 pdns[1980]: [0x110410]
Aug 11 12:10:42 ns1 pdns[1980]: /lib/libc.so.6(gsignal+0x50) [0x179df0]
Aug 11 12:10:42 ns1 pdns[1980]: /lib/libc.so.6(abort+0x101) [0x17b701]
Aug 11 12:10:42 ns1 pdns[1980]: /lib/libc.so.6(__assert_fail+0xfb) [0x17326b]
Aug 11 12:10:42 ns1 pdns[1980]: /usr/sbin//pdns_server-instance(_ZN12Bind2Backend6insertEN5boost10shared_ptrINS_5StateEEEiRKSsRK5QTypeS5_ii+0x847) [0x81151d7]
Aug 11 12:10:42 ns1 pdns[1980]: /usr/sbin//pdns_server-instance(_ZN12Bind2Backend10loadConfigEPSs+0x8c6) [0x8119e06]
Aug 11 12:10:42 ns1 pdns[1980]: /usr/sbin//pdns_server-instance(_ZN12UeberBackend10rediscoverEPSs+0x38) [0x80d77f8]
Aug 11 12:10:42 ns1 pdns[1980]: /usr/sbin//pdns_server-instance(_Z19DLRediscoverHandlerRKSt6vectorISsSaISsEEi+0xcd) [0x80e31bd]
Aug 11 12:10:42 ns1 pdns[1980]: /usr/sbin//pdns_server-instance(_ZN11DynListener11theListenerEv+0x5c0) [0x80dded0]
Aug 11 12:10:42 ns1 pdns[1980]: /usr/sbin//pdns_server-instance(_ZN11DynListener17theListenerHelperEPv+0x11) [0x80decb1]
Aug 11 12:10:42 ns1 pdns[1980]: /lib/libpthread.so.0 [0x7a373b]
Aug 11 12:10:42 ns1 pdns[1980]: /lib/libc.so.6(clone+0x5e) [0x222cfe]

We have about 640000 domains in total, with typically up to about 50 new ones each time the cron job runs. They are all slave zones (from a non-public master). We're typically getting about 700 queries per second at peak times. The crashes are sometimes at busy times, sometimes not, with no apparent correlation to anything else that I know of (although of course the sample size is not huge).

We've compiled with --with-modules="" because we don't run any backends other than bind; the box is stock Red Hat Enterprise 5.3 with boost 1.33.1, as shipped by Red Hat.
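
For anyone reading the trace above, the mangled C++ symbol names can be turned into readable signatures with c++filt. This is only a sketch, assuming GNU binutils is installed; the function name demangle_trace is made up for illustration and the two input lines are copied from the trace.

```shell
# Hypothetical helper (not part of PowerDNS): pass text through c++filt,
# which rewrites any mangled C++ symbols it finds and leaves the rest alone.
demangle_trace() {
    c++filt
}

# Feed two frames from the trace above through the helper.
demangle_trace <<'EOF'
/usr/sbin//pdns_server-instance(_ZN12Bind2Backend10loadConfigEPSs+0x8c6) [0x8119e06]
/usr/sbin//pdns_server-instance(_ZN12UeberBackend10rediscoverEPSs+0x38) [0x80d77f8]
EOF
```

On the affected box the same function could be fed from the log instead, for example: grep 'pdns\[1980\]' /var/log/messages | demangle_trace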

These crashes have been seen on two different boxes with the same setup, so I don't think it can be a hardware fault; we first saw them running the pdns-static rpm as downloaded from powerdns.com and there's been no change now we're running our own build (except the stack trace is more informative because it now has symbols in it).

Does anyone have any suggestions? What should I do next to diagnose the problem? Is this something anyone else has seen? We are getting it on both of our publicly visible nameservers, so we're having customer-visible problems an average of twice a day, with a non-negligible chance of losing both nameservers simultaneously, and my boss is going to tell me to go back to running bind sooner or later. :(

Thanks,

Richard

--
Richard Poole
System Administrator
Heart Internet Ltd
[email protected]
http://www.heartinternet.co.uk/
Tel: 0845 644 7750
Fax: 0845 644 7740
