Re: [squid-users] Squid loops on epoll/read/gettimeofday
fre 2010-04-02 klockan 19:00 -0300 skrev Marcus Kool: > > I use this Squid at home and it lightly used so maybe it does have > the same FDs allocated after a restart... Seems not. So we have to wait until you see the problem again to match up with cachemgr filedescriptor page. Regards Henrik
Re: [squid-users] Squid loops on epoll/read/gettimeofday
Henrik Nordström wrote: fre 2010-04-02 klockan 15:41 -0300 skrev Marcus Kool: strange indeed, but this is strace output with which I am not very familiar. Strace should print the whole array that it uses as argument to epoll_wait or just prints the first element ? (and the 2nd argument could be FD 27) For epoll_wait it prints the valid entries in the result array, which has length 1 in this case (see return value last on the line). Can you please use cachemgr to view the current filedescriptors page to note what fd 23 & fd 27 is being used for? I restarted Squid and now it does not loop... The lsof output in the original email is of little help: A little. Both are pipes. And it's not unlikely these pipes have the same fd assignments now after restart. What does cachemgr say now? I use this Squid at home and it lightly used so maybe it does have the same FDs allocated after a restart... Cachemgr gives this (after the restart): Active file descriptors: File Type Tout Nread * Nwrite * Remote AddressDescription -- - -- 5 Log 0 00 /local/squid/logs/cache.log 7 Socket01275 241 .0DNS Socket 8 Pipe 0 63 9934ufdbgclient #1 9 Pipe 0 0* 0ufdbgclient #2 10 Pipe 0 0* 0ufdbgclient #3 11 Pipe 0 0* 0ufdbgclient #4 12 Pipe 0 0* 0ufdbgclient #5 13 Pipe 0 0* 0ufdbgclient #6 14 Pipe 0 0* 0ufdbgclient #7 15 Pipe 0 0* 0ufdbgclient #8 16 Pipe 0 0* 0ufdbgclient #9 17 Pipe 0 0* 0ufdbgclient #10 18 Pipe 0 0* 0ufdbgclient #11 19 Pipe 0 0* 0ufdbgclient #12 20 File 0 0 8242 /local/squid/logs/access.log 21 Pipe 0 00unlinkd -> squid 22 File 0 0 312 /local/squid/cache/swap.state 23 Socket 1440 848* 0 10.2.2.9.56505 http://srv004.home01.urlfilterdb.com/Squid/cgi-bin/cachemgr.cgi 24 Pipe 0 00squid -> unlinkd 25 Socket0 00 .0HTTP Socket 26 Socket2 0*1794 10.2.2.9.80 http://srv004.home01.urlfilterdb.com/Squid/cgi-bin/cachemgr.cgi 27 Socket 1440 170* 0* 127.0.0.1.35576 cache_object://localhost/filedescriptors 28 Pipe 0 0* 0async-io completetion event: main 29 Pipe 0 00async-io completetion event: threads squid 13665 squid 20w REG9,3 1478687 4718702 /local/squid/logs/access.log squid 13665 squid 21r FIFO0,6 1723502 pipe squid 13665 squid 22w REG9,3 30400812 4718715 /local/squid/cache/swap.state squid 13665 squid 23r FIFO0,6 1723506 pipe squid 13665 squid 24w FIFO0,6 1723503 pipe squid 13665 squid 25u IPv41723505 TCP *:squid (LISTEN) squid 13665 squid 26w FIFO0,6 1723506 pipe squid 13665 squid 27r FIFO0,6 1723507 pipe FD 27 * squid 13665 squid 28w FIFO0,6 1723507 pipe squid 13665 squid 29u IPv41771231 TCP srv004:squid->xxx (ESTABLISHED) It seems that FD 27 and 28 point to the same object (as FD 23 and FD 26) Hmm.. The strace output lists the parameters at the call time and we do not see the return values (only the function return value). Should be return values. strace knows how epoll_wait works. Is it possible that FD 27 has an EPOLLERR condition instead of a EPOLLIN condition and the loop is caused by trying to do read() on FD 27 ? (I do not see any check for a return of EPOLLERR in the sources). read should not return EAGAIN if there is an pending error condition. > Regards Henrik
Re: [squid-users] Squid loops on epoll/read/gettimeofday
fre 2010-04-02 klockan 15:41 -0300 skrev Marcus Kool: > strange indeed, but this is strace output with which I am not very familiar. > Strace should print the whole array that it uses as argument to > epoll_wait or just prints the first element ? (and the 2nd argument > could be FD 27) For epoll_wait it prints the valid entries in the result array, which has length 1 in this case (see return value last on the line). > > Can you please use cachemgr to view the current filedescriptors page to > > note what fd 23 & fd 27 is being used for? > > I restarted Squid and now it does not loop... > The lsof output in the original email is of little help: A little. Both are pipes. And it's not unlikely these pipes have the same fd assignments now after restart. What does cachemgr say now? > squid 13665 squid 20w REG9,3 1478687 4718702 > /local/squid/logs/access.log > squid 13665 squid 21r FIFO0,6 1723502 pipe > squid 13665 squid 22w REG9,3 30400812 4718715 > /local/squid/cache/swap.state > squid 13665 squid 23r FIFO0,6 1723506 pipe > squid 13665 squid 24w FIFO0,6 1723503 pipe > squid 13665 squid 25u IPv41723505 TCP *:squid > (LISTEN) > squid 13665 squid 26w FIFO0,6 1723506 pipe > squid 13665 squid 27r FIFO0,6 1723507 pipe > FD 27 * > squid 13665 squid 28w FIFO0,6 1723507 pipe > squid 13665 squid 29u IPv41771231 TCP > srv004:squid->xxx > (ESTABLISHED) > > It seems that FD 27 and 28 point to the same object (as FD 23 and FD 26) Hmm.. > The strace output lists the parameters at the call time and > we do not see the return values (only the function return value). Should be return values. strace knows how epoll_wait works. > Is it possible that FD 27 has an EPOLLERR condition instead of a > EPOLLIN condition and the loop is caused by trying to do read() on FD 27 ? > (I do not see any check for a return of EPOLLERR in the sources). read should not return EAGAIN if there is an pending error condition. Regards Henrik
Re: [squid-users] Squid loops on epoll/read/gettimeofday
Henrik Nordström wrote: mån 2010-03-29 klockan 13:58 -0300 skrev Marcus Kool: 0.33 epoll_wait(6, {{EPOLLIN, {u32=23, u64=8800387989527}}}, 2400, 10) = 1 0.32 gettimeofday({1269878848, 223083}, NULL) = 0 0.31 read(27, 0xffd3de98, 256) = -1 EAGAIN (Resource temporarily unavailable) This is odd.. epoll_wait says fd 23 is ready for reading, but then Squid reads on fd 27. strange indeed, but this is strace output with which I am not very familiar. Strace should print the whole array that it uses as argument to epoll_wait or just prints the first element ? (and the 2nd argument could be FD 27) Can you please use cachemgr to view the current filedescriptors page to note what fd 23 & fd 27 is being used for? I restarted Squid and now it does not loop... The lsof output in the original email is of little help: squid 13665 squid 20w REG9,3 1478687 4718702 /local/squid/logs/access.log squid 13665 squid 21r FIFO0,6 1723502 pipe squid 13665 squid 22w REG9,3 30400812 4718715 /local/squid/cache/swap.state squid 13665 squid 23r FIFO0,6 1723506 pipe squid 13665 squid 24w FIFO0,6 1723503 pipe squid 13665 squid 25u IPv41723505 TCP *:squid (LISTEN) squid 13665 squid 26w FIFO0,6 1723506 pipe squid 13665 squid 27r FIFO0,6 1723507 pipe FD 27 * squid 13665 squid 28w FIFO0,6 1723507 pipe squid 13665 squid 29u IPv41771231 TCP srv004:squid->xxx (ESTABLISHED) It seems that FD 27 and 28 point to the same object (as FD 23 and FD 26) Note that FD 27 and FD 28 have the same NODE. This pipe is used for what ??? Are you using aufs? If so then it's an internal notification channel to wake up the main thread when disk I/O have completed. Yes I use aufs. The EAGAIN return code to read is strange. It suggest that the read() could return data soon, but Squid is looping now for over 4 hours. Not strange if fd 27 is the aufs notification pipe. Well, the strange thing is that it loops and uses 100% CPU. The strange thing is why it does not act on fd 23 as returned by epoll_wait. The strace output lists the parameters at the call time and we do not see the return values (only the function return value). Is it possible that FD 27 has an EPOLLERR condition instead of a EPOLLIN condition and the loop is caused by trying to do read() on FD 27 ? (I do not see any check for a return of EPOLLERR in the sources). -Marcus Regards Henrik
Re: [squid-users] Squid loops on epoll/read/gettimeofday
mån 2010-03-29 klockan 13:58 -0300 skrev Marcus Kool: > 0.33 epoll_wait(6, {{EPOLLIN, {u32=23, u64=8800387989527}}}, 2400, > 10) = 1 > 0.32 gettimeofday({1269878848, 223083}, NULL) = 0 > 0.31 read(27, 0xffd3de98, 256) = -1 EAGAIN (Resource temporarily > unavailable) This is odd.. epoll_wait says fd 23 is ready for reading, but then Squid reads on fd 27. Can you please use cachemgr to view the current filedescriptors page to note what fd 23 & fd 27 is being used for? > Note that FD 27 and FD 28 have the same NODE. > This pipe is used for what ??? Are you using aufs? If so then it's an internal notification channel to wake up the main thread when disk I/O have completed. > The EAGAIN return code to read is strange. It suggest that the > read() could return data soon, but Squid is looping now for over 4 hours. Not strange if fd 27 is the aufs notification pipe. The strange thing is why it does not act on fd 23 as returned by epoll_wait. Regards Henrik
Re: [squid-users] Squid loops on epoll/read/gettimeofday
On Mon, 29 Mar 2010 13:58:58 -0300, Marcus Kool wrote: > I use squid > Squid Cache: Version 3.0.STABLE20 > configure options: '--prefix=/local/squid' '--with-default-user=squid' > '--with-filedescriptors=2400' '--enable-icap-client' > '--enable-storeio=aufs,ufs,null' > '--with-pthreads' '--enable-async-io=8' '--enable-removal-policies=lru' > '--enable-default-err-language=English' '--enable-err-languages=Dutch > English Portuguese' > '--enable-ssl' '--enable-cachemgr-hostname=localhost' > '--enable-cache-digests' > '--enable-follow-x-forwarded-for' '--enable-forw-via-db' > '--enable-xmalloc-statistics' > '--disable-hostname-checks' '--enable-epoll' '--enable-useragent-log' > '--enable-referer-log' > '--enable-underscores' 'CC=gcc' 'CFLAGS=-O2 -m32' 'CXXFLAGS=-O2 -m32' > > > Note that FD 27 and FD 28 have the same NODE. > This pipe is used for what ??? Something detailed in your cache.log when running at debug level ALL,1 (notices). > > The EAGAIN return code to read is strange. It suggest that the > read() could return data soon, but Squid is looping now for over 4 hours. > > Hopefully one of the developers knows what the pipe is used for and > can do a guess what is causing the EAGAIN return code. One of the helpers. Can't tell from the trace. Please also see if this is reproduced with the 3.1 release. Amos