On 07/03/2017 20:26, Alex Rousskov wrote:
> These stuck disker responses probably explain why your disks do not
> receive any traffic. It is potentially important that both disker
> responses shown in your logs got stuck at approximately the same
> absolute time ~13 days ago (around 2017-02-22, give or take a day;
> subtract 1136930911 milliseconds from 15:53:05.255 in your Squid time
> zone to know the "exact" time when those stuck requests were queued).
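That subtraction works out to roughly 13.16 days. A quick sketch of the arithmetic (the date 2017-03-07 is an assumption taken from this thread's date; only the time 15:53:05.255 is quoted above):

```python
from datetime import datetime, timedelta

# Assumption: the stuck log line was printed on 2017-03-07 (the date of
# this thread); only the time-of-day 15:53:05.255 appears in the quote.
logged = datetime(2017, 3, 7, 15, 53, 5, 255000)
queued = logged - timedelta(milliseconds=1136930911)  # ~13.16 days earlier
print(queued)  # lands on 2017-02-22, matching the estimate above
```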
> How can a disker response get stuck? Most likely, something unusual
> happened ~13 days ago. This could be a Squid bug and/or a kid restart.
>
> * Do all currently running Squid kid processes have about the same
>   start time? [1]
> * Do you see ipcIo6.381049w7 or ipcIo6.153009r8 mentioned in any old
>   non-debugging messages/warnings?
I searched the log files from those days and found nothing unusual;
"grep" returns nothing for ipcIo6.381049w7 or ipcIo6.153009r8.

At the time I couldn't check whether the kids still had the same uptime:
I had reformatted the /cache2, /cache3 and /cache4 partitions and
started fresh with squid -z. But looking at the ps output right now, I
think I can answer that question:
root@proxy:~# ps auxw | grep squid-
proxy 10225  0.0  0.0 13964224   21708 ? S Mar10  0:10 (squid-coord-10) -s
proxy 10226  0.1 12.5 14737524 8268056 ? S Mar10  7:14 (squid-disk-9) -s
proxy 10227  0.0 11.6 14737524 7686564 ? S Mar10  3:08 (squid-disk-8) -s
proxy 10228  0.1 14.9 14737540 9863652 ? S Mar10  7:30 (squid-disk-7) -s
proxy 18348  3.5 10.3 17157560 6859904 ? S Mar13 48:44 (squid-6) -s
proxy 18604  2.8  9.0 16903948 5977728 ? S Mar13 37:28 (squid-4) -s
proxy 18637  1.7 10.8 16836872 7163392 ? R Mar13 23:03 (squid-1) -s
proxy 20831 15.3 10.3 17226652 6838372 ? S 08:50 39:51 (squid-2) -s
proxy 21189  5.3  2.8 16538064 1871788 ? S 12:29  2:12 (squid-5) -s
proxy 21214  3.8  1.5 16448972 1012720 ? S 12:43  1:03 (squid-3) -s
The diskers aren't dying, but the workers are, a lot, with that
"assertion failed: client_side_reply.cc:1167:
http->storeEntry()->objectLen() >= headers_sz" message.

Looking at df and iostat, it seems /cache3 is no longer being accessed
at all. (I believe that is squid-disk-8 above; note its lower CPU time.)
Another odd thing: lots of timeouts and overflows happen during
off-peak hours. From midnight to 7 AM we have maybe 1-2% of the clients
we have from 8 AM to 5 PM (business hours).
2017/03/14 00:26:50 kid3| WARNING: abandoning 23 /cache4/rock I/Os after at least 7.00s timeout
2017/03/14 00:26:53 kid1| WARNING: abandoning 1 /cache4/rock I/Os after at least 7.00s timeout
2017/03/14 02:14:48 kid5| ERROR: worker I/O push queue for /cache4/rock overflow: ipcIo5.68259w9
2017/03/14 06:33:43 kid3| ERROR: worker I/O push queue for /cache4/rock overflow: ipcIo3.55919w9
2017/03/14 06:57:53 kid3| ERROR: worker I/O push queue for /cache4/rock overflow: ipcIo3.58130w9
The /cache4 partition is where the largest objects are stored:
maximum_object_size 4 GB
cache_dir rock /cache2 110000 min-size=0 max-size=65536 max-swap-rate=150 swap-timeout=360
cache_dir rock /cache3 110000 min-size=65537 max-size=262144 max-swap-rate=150 swap-timeout=380
cache_dir rock /cache4 110000 min-size=262145 max-swap-rate=150 swap-timeout=500
</cache_dir>
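For reference, the size tiering those cache_dir lines set up can be sketched as a simple lookup (a sketch only; boundaries are copied from the config above, and the 4 GB ceiling comes from maximum_object_size):

```python
# Size-based routing implied by the cache_dir min-size/max-size options above.
MAX_OBJECT_SIZE = 4 * 1024 ** 3  # maximum_object_size 4 GB

def cache_dir_for(size_bytes):
    """Return the rock cache_dir an object of this size would land in."""
    if size_bytes <= 65536:            # min-size=0, max-size=65536
        return "/cache2"
    if size_bytes <= 262144:           # min-size=65537, max-size=262144
        return "/cache3"
    if size_bytes <= MAX_OBJECT_SIZE:  # min-size=262145, no max-size
        return "/cache4"
    return None                        # larger than maximum_object_size

print(cache_dir_for(300_000))  # -> /cache4
```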
I still don't understand why /cache3 stopped being used while /cache4
remains active, even with all those warnings and errors.. :/
--
Atenciosamente / Best Regards,
Heiler Bemerguy
Network Manager - CINBESA
55 91 98151-4894/3184-1751
_______________________________________________
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev