Re: Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 09:23:46AM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > okay I use SIGABRT now. If it still not able to generate, I'll try to
> > use gdb generate-core-file command to generate it.
>
> One last thing to try: 'sysctl kern.coredump=1', though I don't see why
> it would be 0.
>
> There's not much point in using gdb to generate a core dump, though, as
> the entire point was to avoid having to attach gdb to the child.

kern.coredump is already 1, but I found kern.sugid_coredump in core(5):

    By default, a process that changes user or group credentials whether
    real or effective will not create a corefile. This behaviour can be
    changed to generate a core dump by setting the sysctl(8) variable
    kern.sugid_coredump to 1.

Since varnishd is started by root and then does setuid to nobody, this rule applies to it. After I changed kern.sugid_coredump from 0 to 1, the core file was generated.

The core file and the executable are at:

http://mail.pixnet.tw/~gslin/tmp/varnishd.core
http://mail.pixnet.tw/~gslin/tmp/varnishd

--
* Gea-Suan Lin (public key: Using https://keyserver.pgp.com/ to search)
* If you cannot convince them, confuse them. -- Harry S Truman
___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc
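The rule quoted from core(5) can be restated as a small sketch. The first function below is only the rule as plain logic; the second reads an integer sysctl with sysctlbyname(3) on FreeBSD. The names core_permitted and read_int_sysctl are made up for this illustration and do not come from Varnish or FreeBSD, and on any other system the reader simply returns -1.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper: restates the core(5) rule.  A process that has
 * changed credentials only dumps core when kern.sugid_coredump is set,
 * and nothing dumps core when kern.coredump is 0.
 */
int
core_permitted(int coredump, int sugid_coredump, int changed_creds)
{
	if (!coredump)
		return (0);
	return (!changed_creds || sugid_coredump);
}

/*
 * Hypothetical helper: return the value of an integer sysctl such as
 * "kern.sugid_coredump", or -1 if it cannot be read (always -1 on
 * systems other than FreeBSD).
 */
int
read_int_sysctl(const char *name)
{
	assert(name != NULL);
#if defined(__FreeBSD__)
	int value = 0;
	size_t len = sizeof(value);

	if (sysctlbyname(name, &value, &len, NULL, 0) != 0)
		return (-1);
	return (value);
#else
	(void)name;
	return (-1);
#endif
}
```

With the values from this message (kern.coredump=1, kern.sugid_coredump=0, credentials changed by setuid), core_permitted() returns 0, which matches the missing core file.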
Re: Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 09:50:31PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > Core file and execute file is in:
> >
> > http://mail.pixnet.tw/~gslin/tmp/varnishd.core
> > http://mail.pixnet.tw/~gslin/tmp/varnishd
>
> That won't help; at the very least, I'd need all the Varnish libraries
> and access to a system with the exact same system libraries as yours.

All files in /home/service/varnish have been packaged in:

http://mail.pixnet.tw/~gslin/tmp/varnish-all.tar.gz

But you may need *all libraries*; should I create a shell account to give you access?

> What I asked for was the output of i thr in gdb.

(gdb) i thr
  12 Thread 0x53d000 (LWP 100256)  0x000800c3356c in poll () from /lib/libc.so.6
  11 Thread 0x53d800 (LWP 100159)  0x000800c6373c in nanosleep () from /lib/libc.so.6
  10 Thread 0x53da00 (LWP 100199)  0x000800c6373c in nanosleep () from /lib/libc.so.6
  9 Thread 0x53dc00 (LWP 100212)  0x000800c6373c in nanosleep () from /lib/libc.so.6
  8 Thread 0xa67d000 (LWP 100224)  0x000800c432fc in kevent () from /lib/libc.so.6
  7 Thread 0xa67d200 (LWP 100225)  0x000800c3356c in poll () from /lib/libc.so.6
  6 Thread 0xa67d400 (LWP 100226)  0x000800bfe07c in _umtx_op () from /lib/libc.so.6
  5 Thread 0xa67d600 (LWP 100227)  0x000800c7bf5c in read () from /lib/libc.so.6
  4 Thread 0xa67d800 (LWP 100228)  0x000800c7b926 in memcpy () from /lib/libc.so.6
  3 Thread 0xa67da00 (LWP 100229)  0x000800c7b8f4 in memset () from /lib/libc.so.6
  2 Thread 0xa67dc00 (LWP 100230)  0x000800bfe07c in _umtx_op () from /lib/libc.so.6
* 1 Thread 0xa67de00 (LWP 100231)  0x000800c7bf5c in read () from /lib/libc.so.6
(gdb) bt
#0  0x000800c7bf5c in read () from /lib/libc.so.6
#1  0x000800984fbb in read () from /usr/lib/libthr.so.2
#2  0x0041585e in HTC_Read (htc=0x7e7f2a20, d=0x86dd6c000, len=160763) at cache_httpconn.c:202
#3  0x0041095f in Fetch (sp=0xa681008) at cache_fetch.c:72
#4  0x0040e42b in CNT_Session (sp=0xa681008) at cache_center.c:323
#5  0x00416209 in wrk_thread (priv=0x53e5e0) at cache_pool.c:193
#6  0x00080098729e in pthread_create () from /usr/lib/libthr.so.2
#7  0x in ?? ()
Cannot access memory at address 0x7e7f5000
(gdb) i fra
Stack level 0, frame at 0x7e7f0930:
 rip = 0x800c7bf5c in read; saved rip 0x800984fbb
 called by frame at 0x7e7f0960
 Arglist at 0x7e7f0920, args:
 Locals at 0x7e7f0920, Previous frame's sp is 0x7e7f0930
 Saved registers:
  rip at 0x7e7f0928
(gdb)

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> Child said (2, 9802): Assert error in SES_Delete(), cache_session.c line 339:
>   Condition((sp->obj) == 0) not true.
>   errno = 0 (Unknown error: 0)

See ticket #162, we're working on it.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
Re: Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 12:19:59PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > Child said (2, 9802): Assert error in SES_Delete(), cache_session.c line 339:
> >   Condition((sp->obj) == 0) not true.
> >   errno = 0 (Unknown error: 0)
>
> See ticket #162, we're working on it.

Okay, so I'll upgrade to trunk and try again.

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 01:57:36PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > On Thu, Nov 08, 2007 at 12:19:59PM +0100, Dag-Erling Smørgrav wrote:
> > > Gea-Suan Lin [EMAIL PROTECTED] writes:
> > > > Child said (2, 9802): Assert error in SES_Delete(), cache_session.c line 339:
> > > >   Condition((sp->obj) == 0) not true.
> > > >   errno = 0 (Unknown error: 0)
> > >
> > > See ticket #162, we're working on it.
> >
> > Okay, so I'll upgrade to trunk and try again.
>
> We haven't fixed it yet, so upgrading to trunk won't help.

Is it OK to run 32 varnishd and use HAProxy to redirect to them ?

We found they will run out of memory.

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 05:39:45PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > Is it OK to run 32 varnishd and use HAProxy to redirect to them ?
>
> 32 Varnish instances? Sounds like a lot. What kind of site is it?

We're trying to serve images. The backend's I/O is not fast enough, so we need a cache server in front of it. The original idea was to use Varnish with -s file,filename,100GB to cache them, but varnish aborts in SES_Delete. So the alternative is to run lots of varnishd instances and use HAProxy to dispatch the requests.

> > We found they will run out of memory.
>
> Hmm, you must be running 1.1; try switching to branches/1.2. It
> shouldn't run out of memory.
>
> Here's a patch I think should help with the SES_Delete() issue, BTW. I
> will commit it (or something similar) once I've confirmed that it
> actually fixes the bug.

Yes, we run varnish-1.1 (installed from ports). I'll try 1.2 with your patch. Thank you in advance.

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> Is it OK to run 32 varnishd and use HAProxy to redirect to them ?

32 Varnish instances? Sounds like a lot. What kind of site is it?

> We found they will run out of memory.

Hmm, you must be running 1.1; try switching to branches/1.2. It shouldn't run out of memory.

Here's a patch I think should help with the SES_Delete() issue, BTW. I will commit it (or something similar) once I've confirmed that it actually fixes the bug.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

Index: bin/varnishd/cache_acceptor_kqueue.c
===================================================================
--- bin/varnishd/cache_acceptor_kqueue.c	(revision 2228)
+++ bin/varnishd/cache_acceptor_kqueue.c	(working copy)
@@ -66,8 +66,8 @@
 	if (sp->fd < 0)
 		return;
 	EV_SET(&ki[nki], sp->fd, EVFILT_READ, arm, 0, 0, sp);
-	if (++nki == NKEV) {
-		assert(kevent(kq, ki, nki, NULL, 0, NULL) = 0);
+	if (++nki == NKEV || arm == EV_DELETE) {
+		AZ(kevent(kq, ki, nki, NULL, 0, NULL));
 		nki = 0;
 	}
 }
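What the patch above changes can be sketched without a real kqueue: changes are collected in a batch, and the batch is submitted either when it is full or, with the patch, as soon as a delete is queued. This is only a simulation under stated assumptions. submit_batch() stands in for the kevent() call, ARM_ADD and ARM_DELETE stand in for EV_ADD and EV_DELETE, and the NKEV value of 100 is a guess, not taken from the message.

```c
#include <assert.h>
#include <stddef.h>

#define NKEV       100	/* batch capacity; the real value is an assumption */
#define ARM_ADD    1	/* stand-in for EV_ADD */
#define ARM_DELETE 2	/* stand-in for EV_DELETE */

struct change {
	int fd;
	int arm;
};

struct change pending[NKEV];
unsigned npending;	/* plays the role of nki in the patch */
unsigned nflushes;	/* how many times a batch was submitted */

/* Stand-in for kevent(kq, ki, nki, NULL, 0, NULL): only counts the call. */
int
submit_batch(const struct change *c, unsigned n)
{
	(void)c;
	(void)n;
	nflushes++;
	return (0);
}

/* Queue one change; flush when the batch is full or on a delete. */
void
queue_change(int fd, int arm)
{
	int rc;

	if (fd < 0)
		return;
	pending[npending].fd = fd;
	pending[npending].arm = arm;
	if (++npending == NKEV || arm == ARM_DELETE) {
		rc = submit_batch(pending, npending);
		assert(rc == 0);	/* same spirit as AZ() in the patch */
		(void)rc;
		npending = 0;
	}
}
```

Queuing an add leaves it pending; queuing a delete afterwards submits both at once, so a delete never waits for the batch to fill up.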
Re: Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 05:39:45PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > We found they will run out of memory.
>
> Hmm, you must be running 1.1; try switching to branches/1.2. It
> shouldn't run out of memory.
>
> Here's a patch I think should help with the SES_Delete() issue, BTW. I
> will commit it (or something similar) once I've confirmed that it
> actually fixes the bug.

I checked out the latest version in trunk (r2231), applied the patch, and found no objects being cached (I telnet to the management port and use stats to check).

Now I run it without the patch, and the server just crashes with no information to debug:

Child not responding to ping
Child not responding to ping
Child not responding to ping
Cache child died pid=49696 status=0x9
Clean child
Child cleaned
start child pid 49747
Child said (2, 49747): Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 12:32:37AM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > I've tried 4 times, but varnishd still not to generate dump file.
>
> Are you sure? It should be in the run-time state directory, most likely
> $PREFIX/var/varnish/$HOSTNAME.

The core file is not in /home/service/var/varnish/testphp.pixnet.tw/, my guess is it's too small so I retry it:

* rm -rf /home/service/varnish/var/varnish/testphp.pixnet.tw
* ln -s ~/tmp /home/service/varnish/var/varnish/testphp.pixnet.tw
* chmod 1777 ~/tmp

So I got:

[EMAIL PROTECTED] [~/tmp] (7:47) d
total 8244
drwxrwxrwt   2 gslin  admin     4096 Nov  9 07:40 ./
drwxr-xr-x  10 gslin  admin     4096 Nov  9 07:33 ../
-rw-r--r--   1 root   wheel  8389208 Nov  9 07:43 _.vsl
-rwxr-xr-x   1 root   wheel    14534 Nov  9 07:40 bin.LxjNthA8*
-rwxr-xr-x   1 gslin  admin      260 Nov  9 07:33 varnishd.sh*
[EMAIL PROTECTED] [~/tmp] (7:47) cat varnishd.sh
#!/bin/sh
ulimit -c unlimited
/home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
[EMAIL PROTECTED] [~/tmp] (7:47)

Run it and got SIGQUIT:

Nov  9 07:43:26 testphp kernel: pid 76784 (varnishd), uid 65534: exited on signal 3
Nov  9 07:43:51 testphp kernel: pid 76793 (varnishd), uid 65534: exited on signal 3

Its console:

[EMAIL PROTECTED] [~/tmp] (7:40) sudo ./varnishd.sh
storage_file: filename: /home/service/varnish-cache.mmap size 32768 MegaBytes.
Classic hash: 1048583 buckets
Using old SHMFILE
rolling(1)...
rolling(2)...
start
start child pid 76784
200 0

Child said (2, 76784): Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Cache child died pid=76784 status=0x3
Clean child
Child cleaned
start child pid 76793
Child said (2, 76793): Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Cache child died pid=76793 status=0x3
Clean child
Child cleaned
start child pid 76794
Child said (2, 76794): Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready

--
Gea-Suan Lin
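The status=0x3 and status=0x9 values in these console logs look like raw wait statuses as returned by waitpid(2); that reading is an assumption, since the message does not say how varnishd formats them. Under that assumption, the standard macros from <sys/wait.h> decode them. The helper name term_signal is made up for this sketch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* glibc: expose WCOREDUMP in strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: return the signal that terminated the process
 * (0 if it was not killed by a signal) and set *core to 1 if the
 * status says a core file was written.
 */
int
term_signal(int status, int *core)
{
	assert(core != NULL);
	*core = 0;
	if (!WIFSIGNALED(status))
		return (0);
	*core = WCOREDUMP(status) ? 1 : 0;
	return (WTERMSIG(status));
}
```

Read this way, 0x9 is SIGKILL and 0x3 is SIGQUIT with the core flag clear. On FreeBSD and Linux the core flag is bit 0x80, so a SIGQUIT that did write a core file would show up as 0x83.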
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> I can see signal 3 (QUIT) from dmesg:
>
> pid 76061 (varnishd), uid 65534: exited on signal 3
> pid 76187 (varnishd), uid 65534: exited on signal 3
>
> but I cannot find coredump. I run the following command in /tmp with
> mode 1777:

Read what I wrote earlier, you need to run 'ulimit -c unlimited' before starting Varnish.

DES
--
Dag-Erling Smørgrav
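For reference, 'ulimit -c' adjusts the RLIMIT_CORE resource limit, which a program can also inspect and raise itself with getrlimit(2) and setrlimit(2). The sketch below raises the soft limit to the hard limit, which is the most an unprivileged process may do; it is equivalent to 'ulimit -c unlimited' only when the hard limit is already unlimited. The function names are made up for this example.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft core-size limit to the hard limit.
 * Returns 0 on success, -1 on error.
 */
int
raise_core_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_CORE, &rl) != 0)
		return (-1);
	assert(rl.rlim_cur <= rl.rlim_max);	/* soft never exceeds hard */
	rl.rlim_cur = rl.rlim_max;
	return (setrlimit(RLIMIT_CORE, &rl));
}

/* Hypothetical helper: 1 if the soft limit allows any core file at all. */
int
core_allowed(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_CORE, &rl) != 0)
		return (0);
	return (rl.rlim_cur != 0);
}
```

The limit is inherited by child processes, which is why it has to be set before varnishd is started.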
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> Program received signal SIGKILL, Killed.

Killed by the parent again.

> [Switching to Thread 0xbb4e200 (LWP 101321)]
> 0x000800c1716f in brk () from /lib/libc.so.6
> (gdb) bt
> #0  0x000800c1716f in brk () from /lib/libc.so.6
> #1  0x0c2f1000 in ?? ()
> #2  0x000800c165ba in _UTF8_init () from /lib/libc.so.6
> #3  0x000800c167e8 in _UTF8_init () from /lib/libc.so.6
> #4  0x000800c170e6 in _UTF8_init () from /lib/libc.so.6
> #5  0x000800c7966b in calloc () from /lib/libc.so.6
> #6  0x00411262 in HSH_Prealloc (sp=0xa6dd008) at cache_hash.c:80
> #7  0x00411875 in HSH_Lookup (sp=0xa6dd008) at cache_hash.c:185
> #8  0x0040e890 in CNT_Session (sp=0xa6dd008) at cache_center.c:534
> #9  0x00416209 in wrk_thread (priv=0x53e5e0) at cache_pool.c:193
> #10 0x00080098729e in pthread_create () from /usr/lib/libthr.so.2
> #11 0x in ?? ()
> Error accessing memory address 0x7d1ea000: Bad address.

This stack frame is of no interest, it is just the thread that happened to be running when the SIGKILL was delivered. What would be helpful at this point is i thr.

As previously mentioned, this will be a lot easier if you modify mgt_child.c to send SIGQUIT instead of SIGKILL so you get a core dump.

BTW, I assume this is all on FreeBSD? Which version?

DES
--
Dag-Erling Smørgrav
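The difference between the two signals can be seen in a small stand-alone sketch. It is not the mgt_child.c code; kill_child_with is a made-up name. A parent forks a child that just idles, sends it the chosen signal, and returns the raw wait status. SIGKILL never produces a core file, while the default action for SIGQUIT is to terminate with a core dump, provided the limits and system settings permit one.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* glibc: expose POSIX/BSD names in strict modes */
#endif
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork an idle child, send it `sig`, and return the
 * raw wait status, or -1 on error.
 */
int
kill_child_with(int sig)
{
	int ready[2], status;
	char byte = 0;
	sigset_t set;
	ssize_t n;
	pid_t pid;

	assert(sig > 0);
	if (pipe(ready) != 0)
		return (-1);
	pid = fork();
	if (pid < 0) {
		close(ready[0]);
		close(ready[1]);
		return (-1);
	}
	if (pid == 0) {
		/* Child: make sure sig is deliverable with its default action. */
		close(ready[0]);
		signal(sig, SIG_DFL);
		sigemptyset(&set);
		sigaddset(&set, sig);
		sigprocmask(SIG_UNBLOCK, &set, NULL);
		if (write(ready[1], &byte, 1) != 1)
			_exit(1);
		for (;;)
			pause();	/* idle, like a stuck child */
	}
	/* Parent: wait until the child is ready, then signal and reap it. */
	close(ready[1]);
	n = read(ready[0], &byte, 1);
	close(ready[0]);
	if (n != 1 || kill(pid, sig) != 0)
		(void)kill(pid, SIGKILL);	/* fall back so waitpid cannot hang */
	if (waitpid(pid, &status, 0) != pid)
		return (-1);
	return (status);
}
```

Calling kill_child_with(SIGQUIT) and checking WCOREDUMP() on the result shows whether the system actually wrote a core file for the child.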
Re: Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 10:21:22PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > It crashed again:
>
> Actually, it doesn't crash; the child gets stuck somewhere and the
> parent kills it.
>
> I'd like to see an 'svn diff' of your tree.
>
> Also, if you change SIGKILL in bin/varnishd/mgt_child.c to either
> SIGABRT or SIGQUIT, you should get a core dump of the child process when
> the parent kills it. If you don't, 'ulimit -c unlimited' and try again.
>
> DES

[EMAIL PROTECTED] [/tmp] (5:23) svn export ~/svn/varnish/varnish-cache varnish-cache.clean
[EMAIL PROTECTED] [/tmp] (5:23) diff -ruN varnish-cache.clean varnish-cache > varnish-cache.diff

This file is in http://mail.pixnet.tw/~gslin/varnish-cache.diff.gz

[EMAIL PROTECTED] [/tmp] (5:26) cd varnish-cache.clean/
[EMAIL PROTECTED] [/tmp/varnish-cache.clean] (5:26) ./autogen.sh
+ aclocal
+ libtoolize --copy --force
+ autoheader
+ automake --add-missing --copy --foreign
configure.ac: installing `./install-sh'
configure.ac: installing `./missing'
bin/varnishadm/Makefile.am: installing `./compile'
bin/varnishadm/Makefile.am: installing `./depcomp'
+ autoconf
[EMAIL PROTECTED] [/tmp/varnish-cache.clean] (5:26) cd ..
[EMAIL PROTECTED] [/tmp] (5:26) diff -ruN varnish-cache.clean varnish-cache > varnish-cache-2.diff

This file is in http://mail.pixnet.tw/~gslin/varnish-cache-2.diff.gz

[EMAIL PROTECTED] [/tmp/varnish-cache.clean] (5:29) ./configure --prefix=/home/service/varnish
(lots of msgs)
[EMAIL PROTECTED] [/tmp] (5:30) diff -ruN varnish-cache.clean varnish-cache > varnish-cache-3.diff

This file is in http://mail.pixnet.tw/~gslin/varnish-cache-3.diff.gz

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 12:55:57AM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > [EMAIL PROTECTED] [~/tmp] (7:47) cat varnishd.sh
> > #!/bin/sh
> > ulimit -c unlimited
> > /home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
> > [EMAIL PROTECTED] [~/tmp] (7:47)
> >
> > Run it and got SIGQUIT:
> >
> > Nov  9 07:43:26 testphp kernel: pid 76784 (varnishd), uid 65534: exited on signal 3
> > Nov  9 07:43:51 testphp kernel: pid 76793 (varnishd), uid 65534: exited on signal 3
>
> Still no core file? Try SIGABRT instead. If that doesn't work, I'm out
> of ideas... though you can still attach directly to the child with gdb.

okay I use SIGABRT now. If it still not able to generate, I'll try to use gdb generate-core-file command to generate it.

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> [EMAIL PROTECTED] [~/tmp] (7:47) cat varnishd.sh
> #!/bin/sh
> ulimit -c unlimited
> /home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
> [EMAIL PROTECTED] [~/tmp] (7:47)
>
> Run it and got SIGQUIT:
>
> Nov  9 07:43:26 testphp kernel: pid 76784 (varnishd), uid 65534: exited on signal 3
> Nov  9 07:43:51 testphp kernel: pid 76793 (varnishd), uid 65534: exited on signal 3

Still no core file? Try SIGABRT instead. If that doesn't work, I'm out of ideas... though you can still attach directly to the child with gdb.

DES
--
Dag-Erling Smørgrav
Re: Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 08:02:36AM +0800, Gea-Suan Lin wrote:
> On Fri, Nov 09, 2007 at 12:55:57AM +0100, Dag-Erling Smørgrav wrote:
> > Gea-Suan Lin [EMAIL PROTECTED] writes:
> > > [EMAIL PROTECTED] [~/tmp] (7:47) cat varnishd.sh
> > > #!/bin/sh
> > > ulimit -c unlimited
> > > /home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
> > > [EMAIL PROTECTED] [~/tmp] (7:47)
> > >
> > > Run it and got SIGQUIT:
> > >
> > > Nov  9 07:43:26 testphp kernel: pid 76784 (varnishd), uid 65534: exited on signal 3
> > > Nov  9 07:43:51 testphp kernel: pid 76793 (varnishd), uid 65534: exited on signal 3
> >
> > Still no core file? Try SIGABRT instead. If that doesn't work, I'm out
> > of ideas... though you can still attach directly to the child with gdb.
>
> okay I use SIGABRT now. If it still not able to generate, I'll try to
> use gdb generate-core-file command to generate it.

I got this:

(gdb) generate-core-file
Couldn't open /proc/1005/map

I'll mount procfs and try again.

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 10:37:58PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > [EMAIL PROTECTED] [/tmp] (5:23) svn export ~/svn/varnish/varnish-cache varnish-cache.clean
> > [EMAIL PROTECTED] [/tmp] (5:23) diff -ruN varnish-cache.clean varnish-cache > varnish-cache.diff
>
> This is *really* not useful. Please give me a simple 'svn diff', or at
> least drop the -N.

I didn't run autogen.sh, configure, and make in the working copy; I ran them in an export made with svn export varnish-cache /tmp/varnish-cache.

This is the diff of bin/varnishd/cache_acceptor_kqueue.c:

[EMAIL PROTECTED] [/tmp/varnish-cache/bin/varnishd] (5:41) diff -ruN ~/svn/varnish/varnish-cache/bin/varnishd/cache_acceptor_kqueue.c cache_acceptor_kqueue.c
--- /net/account/admin/gslin/svn/varnish/varnish-cache/bin/varnishd/cache_acceptor_kqueue.c	Fri Nov  9 01:38:12 2007
+++ cache_acceptor_kqueue.c	Fri Nov  9 03:43:26 2007
@@ -66,8 +66,8 @@
 	if (sp->fd < 0)
 		return;
 	EV_SET(&ki[nki], sp->fd, EVFILT_READ, arm, 0, 0, sp);
-	if (++nki == NKEV) {
-		assert(kevent(kq, ki, nki, NULL, 0, NULL) = 0);
+	if (++nki == NKEV || arm == EV_DELETE) {
+		AZ(kevent(kq, ki, nki, NULL, 0, NULL));
 		nki = 0;
 	}
 }

An update on the patched varnishd: I attached gdb to it and it looks fine now. Below is a varnishstat screenshot; it can now store 5GB. I'll try to run it without gdb later.

0+00:27:23
Hitrate ratio:       10      100     1000
Hitrate avg:     0.0933   0.2071   0.2500

       30831        13.42        18.77 Client connections accepted
       33441        14.38        20.35 Client requests received
        5005         0.96         3.05 Cache hits
           0         0.00         0.00 Cache hits for pass
       28405        14.38        17.29 Cache misses
       28436        14.38        17.31 Backend connections success
           0         0.00         0.00 Backend connections failures
           0         0.00         0.00 Backend connections reuses
           0         0.00         0.00 Backend connections recycles
           6        -0.96         0.00 Backend connections unused
          24            .            . N struct srcaddr
           4            .            . N active struct srcaddr
          23            .            . N struct sess_mem
           5            .            . N struct sess
       28415            .            . N struct object
       28415            .            . N struct objecthead
       56817            .            . N struct smf
           0            .            . N small free smf
           1            .            . N large free smf
           0            .            . N struct vbe_conn
          17            .            . N worker threads
          17         0.00         0.01 N worker threads created
           0         0.00         0.00 N worker threads not created
           0         0.00         0.00 N worker threads limited
           0         0.00         0.00 N queued work requests
          17         0.00         0.01 N overflowed work requests
           0         0.00         0.00 N dropped work requests
           0            .            . N expired objects
           0            .            . N LRU nuked objects
           0            .            . N LRU saved objects
           0            .            . N objects on deathrow
           0         0.00         0.00 HTTP header overflows
           0         0.00         0.00 Objects sent with sendfile
       31598        14.38        19.23 Objects sent with write
       30829        13.42        18.76 Total Sessions
       33440        14.38        20.35 Total Requests
          31         0.00         0.02 Total pipe
           0         0.00         0.00 Total pass
       28404        13.42        17.29 Total fetch
     7942018      3464.68      4833.85 Total header bytes
  5805632005   1854165.04   3533555.69 Total body bytes
       30345        13.42        18.47 Session Closed
           0         0.00         0.00 Session Pipeline
           0         0.00         0.00 Session Read Ahead
        3109         0.96         1.89 Session herd
     2058773       926.09      1253.06 SHM records
      127012        56.56        77.30 SHM writes
          13         0.00         0.01 SHM MTX contention
       56816        26.84        34.58 allocator requests
       56816            .            . outstanding allocations
  5530710016            .            . bytes allocated
 28829028352            .            . bytes free
       28404        14.38        17.29 Backend requests made

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 09:16:22PM +0100, Dag-Erling Smørgrav wrote:
> The correct way to apply a patch is with /usr/bin/patch.

Yes, I did apply it with the system's patch this time.

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> When I want to try it again, I found it might be my fault :-)
>
> http://varnish.projects.linpro.no/attachment/ticket/162/cache_acceptor_kqueue.diff
>
> It should be patched to:
>
>     AZ(kevent(kq, ki, nki, NULL, 0, NULL));
>
> But I patched to:
>
>     AZ(kevent(kq, ki, nki, NULL, 0, NULL) = 0);
>
> I'll try it again.

The correct way to apply a patch is with /usr/bin/patch.

DES
--
Dag-Erling Smørgrav
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> I checkout the latest version in trunk (r2231), apply the patch, and
> find no object being cached. (telnet to management port and use stats
> to check)

I'm not psychic, you'll have to actually *show me* the stats.

DES
--
Dag-Erling Smørgrav
Re: Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 03:44:57AM +0800, Gea-Suan Lin wrote:
> On Thu, Nov 08, 2007 at 08:40:23PM +0100, Dag-Erling Smørgrav wrote:
> > Gea-Suan Lin [EMAIL PROTECTED] writes:
> > > I checkout the latest version in trunk (r2231), apply the patch, and
> > > find no object being cached. (telnet to management port and use
> > > stats to check)
> >
> > I'm not psychic, you'll have to actually *show me* the stats.
>
> When I want to try it again, I found it might be my fault :-)
>
> http://varnish.projects.linpro.no/attachment/ticket/162/cache_acceptor_kqueue.diff
>
> It should be patched to:
>
>     AZ(kevent(kq, ki, nki, NULL, 0, NULL));
>
> But I patched to:
>
>     AZ(kevent(kq, ki, nki, NULL, 0, NULL) = 0);
>
> I'll try it again.

It crashed again:

[EMAIL PROTECTED] [~] (3:45) sudo /usr/bin/env -i /home/service/varnish/sbin/varnishd -a [our ip address]:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
storage_file: filename: /home/service/varnish-cache.mmap size 32768 MegaBytes.
Classic hash: 1048583 buckets
Using old SHMFILE
rolling(1)...
rolling(2)...
start
start child pid 57305
200 0

Child said (2, 57305): Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready
Child not responding to ping
Child not responding to ping
Child not responding to ping
Cache child died pid=57305 status=0x9
Clean child
Child cleaned
start child pid 58543
Child said (2, 58543): Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready
Child not responding to ping
Child not responding to ping
Cache child died pid=58543 status=0x9
Clean child
Child cleaned
start child pid 58544
Child said (2, 58544): Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> [EMAIL PROTECTED] [/tmp] (5:23) svn export ~/svn/varnish/varnish-cache varnish-cache.clean
> [EMAIL PROTECTED] [/tmp] (5:23) diff -ruN varnish-cache.clean varnish-cache > varnish-cache.diff

This is *really* not useful. Please give me a simple 'svn diff', or at least drop the -N.

DES
--
Dag-Erling Smørgrav
Re: Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 10:21:22PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin [EMAIL PROTECTED] writes:
> > It crashed again:
>
> Actually, it doesn't crash; the child gets stuck somewhere and the
> parent kills it.
>
> I'd like to see an 'svn diff' of your tree.
>
> Also, if you change SIGKILL in bin/varnishd/mgt_child.c to either
> SIGABRT or SIGQUIT, you should get a core dump of the child process when
> the parent kills it. If you don't, 'ulimit -c unlimited' and try again.

I can see signal 3 (QUIT) from dmesg:

pid 76061 (varnishd), uid 65534: exited on signal 3
pid 76187 (varnishd), uid 65534: exited on signal 3

but I cannot find coredump. I run the following command in /tmp with mode 1777:

sudo /usr/bin/env -i /home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096

And the default coredumpsize is unlimited now.

[EMAIL PROTECTED] [~] (6:29) limits
Resource limits (current):
  cputime          infinity secs
  filesize         infinity kB
  datasize         33554432 kB
  stacksize          524288 kB
  coredumpsize     infinity kB
  memoryuse        infinity kB
  memorylocked     infinity kB
  maxprocesses         5547
  openfiles          524288
  sbsize           infinity bytes
  vmemoryuse       infinity kB
[EMAIL PROTECTED] [~] (6:29) sudo env -i /usr/bin/limits
Resource limits (current):
  cputime          infinity secs
  filesize         infinity kB
  datasize         33554432 kB
  stacksize          524288 kB
  coredumpsize     infinity kB
  memoryuse        infinity kB
  memorylocked     infinity kB
  maxprocesses         5547
  openfiles          524288
  sbsize           infinity bytes
  vmemoryuse       infinity kB

Suggestion ?

--
Gea-Suan Lin
Re: Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin [EMAIL PROTECTED] writes:
> I've tried 4 times, but varnishd still not to generate dump file.

Are you sure? It should be in the run-time state directory, most likely $PREFIX/var/varnish/$HOSTNAME.

DES
--
Dag-Erling Smørgrav