Em 07/03/2017 15:22, Alex Rousskov escreveu:

2017/02/18 13:48:19 kid3| ERROR: worker I/O push queue for /cache4/rock 
overflow: ipcIo3.9082w9
2017/02/18 13:48:42 kid4| ERROR: worker I/O push queue for /cache4/rock 
overflow: ipcIo4.3371w9
2017/02/18 14:06:01 kid9| WARNING: /cache4/rock delays I/O requests for 9.97 
seconds to obey 200/sec rate limit
2017/02/18 14:06:34 kid9| WARNING: /cache4/rock delays I/O requests for 21.82 
seconds to obey 200/sec rate limit
2017/02/18 14:06:42 kid4| WARNING: abandoning 1 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 14:06:47 kid3| WARNING: abandoning 1 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 14:06:48 kid1| WARNING: abandoning 1 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 14:06:49 kid4| WARNING: abandoning 4 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 14:06:54 kid3| WARNING: abandoning 2 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 14:07:55 kid9| WARNING: /cache4/rock delays I/O requests for 68.64 
seconds to obey 200/sec rate limit
2017/02/18 14:08:03 kid5| WARNING: abandoning 511 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 14:08:47 kid2| WARNING: abandoning 20 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 14:08:51 kid3| WARNING: abandoning 41 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 14:08:54 kid1| WARNING: abandoning 41 /cache4/rock I/Os after at 
least 7.00s timeout
2017/02/18 15:26:35 kid5| ERROR: worker I/O push queue for /cache4/rock 
overflow: ipcIo5.31404w9
2017/02/18 15:29:00 kid9| WARNING: /cache4/rock delays I/O requests for 9.92 
seconds to obey 200/sec rate limit
2017/02/18 15:29:13 kid9| WARNING: /cache4/rock delays I/O requests for 8.23 
seconds to obey 200/sec rate limit
2017/02/18 15:29:45 kid9| WARNING: /cache4/rock delays I/O requests for 8.86 
seconds to obey 200/sec rate limit
2017/02/18 15:30:06 kid9| WARNING: /cache4/rock delays I/O requests for 7.34 
seconds to obey 200/sec rate limit
2017/02/18 15:30:27 kid9| WARNING: /cache4/rock delays I/O requests for 7.65 
seconds to obey 200/sec rate limit
2017/02/18 15:30:48 kid9| WARNING: /cache4/rock delays I/O requests for 8.97 
seconds to obey 200/sec rate limit
2017/02/18 15:31:09 kid9| WARNING: /cache4/rock delays I/O requests for 8.52 
seconds to obey 200/sec rate limit
2017/02/18 15:31:22 kid9| WARNING: /cache4/rock delays I/O requests for 10.61 
seconds to obey 200/sec rate limit
2017/02/18 17:19:40 kid9| WARNING: /cache4/rock delays I/O requests for 10.22 
seconds to obey 200/sec rate limit
There is a Squid bug and/or your cache disks could not keep up with the
I/O load. Please note that I/O load during initial cache index rebuild
is higher.

There are several ways to proceed IMO:

1. Figure out why your Squid is not using the disk cache _now_. This
will require enabling debugging, at least for a few seconds, and then
analyzing cache.log. I recommend enabling debugging in one worker kid
only (i.e., sending its process the right signal instead of running
"squid -k debug"). See Squid wiki for debugging details. Do not restart
or reconfigure Squid until you collect those debugging logs! Please note
that the logs may contain private user info.

Some log from right now...

2017/03/07 15:53:05.255 kid6| 19,4| MemObject.cc(431) isContiguous: MemObject::isContiguous: Returning true 2017/03/07 15:53:05.255 kid6| 73,3| HttpRequest.cc(658) storeId: sent back effectiveRequestUrl: http://www.agenciabelem.com.br/signalr/hubs 2017/03/07 15:53:05.255 kid6| 20,3| store_swapmeta.cc(52) storeSwapMetaBuild: storeSwapMetaBuild URL: http://www.agenciabelem.com.br/signalr/hubs 2017/03/07 15:53:05.255 kid6| 54,2| ../../../src/ipc/Queue.h(490) findOldest: peeking from 7 to 6 at 1 2017/03/07 15:53:05.255 kid6| 47,2| IpcIoFile.cc(415) canWait: cannot wait: 1136930911 oldest: ipcIo6.381049w7 2017/03/07 15:53:05.255 kid6| 20,2| store_io.cc(38) storeCreate: storeCreate: no swapdirs for e:=sw1p2RDV/0x206e17d0*4 2017/03/07 15:53:05.255 kid6| 90,3| store_client.cc(729) invokeHandlers: InvokeHandlers: 69538C9E111C077214004DE73D0F257B 2017/03/07 15:53:05.255 kid6| 90,3| store_client.cc(735) invokeHandlers: StoreEntry::InvokeHandlers: checking client #0 2017/03/07 15:53:05.255 kid6| 90,3| store_client.cc(755) storePendingNClients: storePendingNClients: returning 1 2017/03/07 15:53:05.255 kid6| 20,3| store.cc(494) unlock: Client unlocking key 69538C9E111C077214004DE73D0F257B e:=sp2RDV/0x206e17d0*4 2017/03/07 15:53:05.255 kid6| 17,3| FwdState.cc(274) ~FwdState: FwdState destructor start
--
2017/03/07 15:53:05.302 kid6| 90,3| store_client.cc(661) storeUnregister: storeUnregister: called for 'C8303E098834088226809BF615D12C51' 2017/03/07 15:53:05.302 kid6| 19,4| MemObject.cc(431) isContiguous: MemObject::isContiguous: Returning true 2017/03/07 15:53:05.302 kid6| 20,3| store_swapmeta.cc(52) storeSwapMetaBuild: storeSwapMetaBuild URL: http://www.ormnews.com.br/library/bootstrap/css/bootstrap.min.css 2017/03/07 15:53:05.302 kid6| 54,2| ../../../src/ipc/Queue.h(490) findOldest: peeking from 8 to 6 at 1 2017/03/07 15:53:05.302 kid6| 47,2| IpcIoFile.cc(415) canWait: cannot wait: 1136930958 oldest: ipcIo6.153009r8 2017/03/07 15:53:05.302 kid6| 20,2| store_io.cc(38) storeCreate: storeCreate: no swapdirs for e:=msw1DV/0x3a4d5870*2 2017/03/07 15:53:05.302 kid6| 20,3| store.cc(457) lock: storeUnregister locked key C8303E098834088226809BF615D12C51 e:=msDV/0x3a4d5870*3 2017/03/07 15:53:05.302 kid6| 90,3| store_client.cc(755) storePendingNClients: storePendingNClients: returning 0 2017/03/07 15:53:05.302 kid6| 20,3| store.cc(494) unlock: storeUnregister unlocking key C8303E098834088226809BF615D12C51 e:=msDV/0x3a4d5870*3 2017/03/07 15:53:05.302 kid6| 20,3| store.cc(494) unlock: clientReplyContext::removeStoreReference unlocking key C8303E098834088226809BF615D12C51 e:=msDV/0x3a4d5870*2 2017/03/07 15:53:05.302 kid6| 33,3| Pipeline.cc(69) popMe: Pipeline 0xc8a25a0 drop 0x1b8b2cd0*3
--
2017/03/07 15:53:05.318 kid6| 20,3| store_swapout.cc(377) mayStartSwapOut: already allowed 2017/03/07 15:53:05.318 kid6| 73,3| HttpRequest.cc(658) storeId: sent back effectiveRequestUrl: http://img.olx.com.br/images/31/310706022882854.jpg 2017/03/07 15:53:05.318 kid6| 20,3| store_swapmeta.cc(52) storeSwapMetaBuild: storeSwapMetaBuild URL: http://img.olx.com.br/images/31/310706022882854.jpg 2017/03/07 15:53:05.318 kid6| 54,2| ../../../src/ipc/Queue.h(490) findOldest: peeking from 7 to 6 at 1 2017/03/07 15:53:05.318 kid6| 47,2| IpcIoFile.cc(415) canWait: cannot wait: 1136930974 oldest: ipcIo6.381049w7 2017/03/07 15:53:05.318 kid6| 20,2| store_io.cc(38) storeCreate: storeCreate: no swapdirs for e:m381432=w1p2DV/0x3ace6740*4 2017/03/07 15:53:05.318 kid6| 90,3| store_client.cc(729) invokeHandlers: InvokeHandlers: 5E66A439D7F0C1ED9A6B0742518C59B6 2017/03/07 15:53:05.318 kid6| 90,3| store_client.cc(735) invokeHandlers: StoreEntry::InvokeHandlers: checking client #0 2017/03/07 15:53:05.318 kid6| 11,3| http.cc(1090) persistentConnStatus: local=10.1.10.9:59284 remote=104.113.39.233:80 FD 612 flags=1 eof=0 2017/03/07 15:53:05.318 kid6| 5,3| comm.cc(559) commSetConnTimeout: local=10.1.10.9:59284 remote=104.113.39.233:80 FD 612 flags=1 timeout 600 2017/03/07 15:53:05.318 kid6| 5,4| AsyncCallQueue.cc(55) fireNext: entering TunnelBlindCopyReadHandler(local=10.1.10.9:54832 remote=201.57.89.205:443 FD 144 flags=1, data=0x2d04
83d8, size=4344, buf=0x207b41b0)
--
2017/03/07 15:53:05.793 kid6| 20,3| store.cc(1342) validLength: storeEntryValidLength: Checking '96501FC1CC75A4D3B9B6BA8A2D899B45' 2017/03/07 15:53:05.793 kid6| 19,4| MemObject.cc(431) isContiguous: MemObject::isContiguous: Returning true 2017/03/07 15:53:05.793 kid6| 20,3| store_swapmeta.cc(52) storeSwapMetaBuild: storeSwapMetaBuild URL: http://globoesporte.globo.com/busca/?q=gabriel+jesus&ps=on 2017/03/07 15:53:05.793 kid6| 54,2| ../../../src/ipc/Queue.h(490) findOldest: peeking from 7 to 6 at 1 2017/03/07 15:53:05.793 kid6| 47,2| IpcIoFile.cc(415) canWait: cannot wait: 1136931448 oldest: ipcIo6.381049w7 2017/03/07 15:53:05.793 kid6| 20,2| store_io.cc(38) storeCreate: storeCreate: no swapdirs for e:=sw1V/0x206e17d0*1 2017/03/07 15:53:05.793 kid6| 90,3| store_client.cc(729) invokeHandlers: InvokeHandlers: 96501FC1CC75A4D3B9B6BA8A2D899B45 2017/03/07 15:53:05.793 kid6| 20,3| store.cc(494) unlock: StoreEntry::forcePublicKey+Vary unlocking key 96501FC1CC75A4D3B9B6BA8A2D899B45 e:=sV/0x206e17d0*1 2017/03/07 15:53:05.793 kid6| 90,3| store_client.cc(755) storePendingNClients: storePendingNClients: returning 0 2017/03/07 15:53:05.793 kid6| 20,3| store.cc(396) destroyStoreEntry: destroyStoreEntry: destroying 0x206e17d8 2017/03/07 15:53:05.793 kid6| 20,3| store.cc(378) destroyMemObject: destroyMemObject 0x354a3970

2. Reducing max-swap-rate from 200 to, say, 20. If your disks cannot
keep up _and_ there is a Squid bug that screws something up when your
disks cannot keep up, then this blind configuration change may avoid
triggering that bug.

I'll try to use this on a last resort attempt...

3. Collect enough iostat 5-second outputs (or equivalent) to correlate
system performance with cache.log messages. I would also collect other
system activity during those hours. The "atop" tool may be useful for
collecting everything in one place. You will probably want to restart
Squid for a clean experiment/collection.


Trying to figure out how to use atop information...

--
Best Regards,

Heiler Bemerguy
Network Manager - CINBESA
55 91 98151-4894/3184-1751

_______________________________________________
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev

Reply via email to