Hi.

I've been using Squid for a long time to authenticate/authorize users accessing the Internet with LDAP in a Windows corporate environment (Basic/NTLM/GSS-SPNEGO). Several months ago I had to switch to the SMP scheme, because a single process would sometimes eat a whole core, bottlenecking the users on it. The CPU situation improved, but I discovered several issues. The first one I became aware of is non-functional SNMP (since there's no solution, I just had to sacrifice it). The second one is more disturbing: after some uptime (usually a couple of weeks, a month at best) Squid somehow degrades and stops authorizing users.

I have about 600 active users on my biggest site (without SNMP I'm not sure how many simultaneous users I have). It usually starts like this: someone (it always starts with one person) complains that he has lost his access to the Internet. Not entirely, no. At first the access is very slow, and the victim has to wait several minutes for a page to load. Others are unaffected at this time. From time to time the victim is eventually able to load one of two tabs in the browser, but by the end of the day this becomes unusable, and my support has to come in. Then it gets escalated to me.

At first I was debugging various Kerberos stuff, NTLM, the victim's machine domain membership and so on. But today I managed to figure out that all I have to do is restart Squid. (It sounds silly, but I don't like to restart things like in the "IT Crowd" TV series; it's kind of a last-resort measure for when I'm desperate.) If I'm stubborn enough to continue the investigation, soon I get 2 users complaining, then 3, then more. During previous outages I eventually restarted Squid anyway, either to change the domain controller in the Kerberos config when I blamed that, or to disable connection pooling in the external Kerberos/LDAP helper when I blamed that. So each time there was a different candidate to blame.

This time I decided to just restart Squid, since I had started to think that Squid itself is the main cause, et voila. I should also mention that I have run this AAA scheme in Squid for years and didn't have this issue previously. I also have about a dozen other Squids running the same (very similar) config, with the same AAA stuff (Basic/NTLM/GSS-SPNEGO) and the same AD group checking, only for different group memberships, and none of them has this issue. I really think SMP is involved.

I realize this is a poor problem report: "Something degrades, I restart Squid, please help, I think it's SMP-related". But the thing is, I don't know where to start narrowing this down. If anyone has a good idea, please let me know.

Thanks.
Eugene.
_______________________________________________
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users
