Re: [squid-users] Changing available bandwidth for *existing* connections based on time of day?
On Jan 26, 2011, at 9:27 PM, Scott Lehman wrote:

> My searches haven't turned up much so far (dynamic delay pool patch
> looked different), and my only idea is to restart Squid with a
> different config and hope clients can resume where they left off.
>
> Any pointers appreciated, even if it's not an all-Squid solution.

I'm struggling with a similar planning issue here... my best thought so
far is to try to get Squid to mark connections with TOS and then rely
on other software to shape them (iptables or my OpenBSD firewall).
Those support time-based rules (or, for PF, an easy way to reset state
info), so the time-based shaping could be reasonably granular.

I haven't quite gotten around to trying any of this, but that's what I
was thinking about. I'm happy to hear what others have tried...

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
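A minimal sketch of the marking half of that idea, for anyone who wants
to experiment (note that tcp_outgoing_tos marks Squid's server-side
connections, so shaping on those marks would throttle miss traffic
rather than cache hits; the subnet, TOS value, hours, and tc class are
all illustrative, and the tc class itself must already exist):

  # squid.conf: tag outgoing traffic for selected clients with a TOS value
  acl students src 10.1.0.0/16
  tcp_outgoing_tos 0x20 students

  # on a Linux router: during study hours, push TOS-0x20 traffic into a
  # slower traffic-control class
  iptables -t mangle -A POSTROUTING -m tos --tos 0x20 \
    -m time --timestart 20:00 --timestop 23:00 \
    --weekdays Mon,Tue,Wed,Thu,Fri \
    -j CLASSIFY --set-class 1:20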
Re: [squid-users] TPROXY and DansGuardian
On Mar 24, 2010, at 1:37 AM, Amos Jeffries wrote:

> From what I understand of your requirements you don't actually need DG
> or anything but Squid alone. Squid can log in any format you choose to
> configure. If there is anything it does not yet log we'd be interested
> in hearing about that.

DG will do content-based filtering (check the HTML for naughty words),
which is of interest to us. Otherwise, you're correct in that we could
just log all accesses and run a URL analyzer to see if people are going
somewhere they shouldn't.

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
[squid-users] TPROXY and DansGuardian
We've used a few different Squid setups over the years, from a vanilla
setup, to a transparent interception proxy, to a fully transparent
tproxy. We're now using DansGuardian to keep tabs on our users (we
don't block; we just monitor). This is good, but unfortunately it
doesn't appear to be compatible with tproxy (DG only understands
interception or regular proxying).

Does anyone know of a way to use DG as an interception proxy, but
configure Squid to use the "real" client IP address in its outgoing
requests? I have no idea if this is possible, since it would be quite a
mess of different proxy schemes (DG would be interception-based using
routing, Squid would use X-Forwarded-For to get the real IP, and then
tproxy to make the request using the client address).

Alternately, does anyone know of a good web monitoring product that
works in a "sniffer" mode, so I don't need to insert it inline? I
basically would like to use tproxy, but also need to log users who are
going to naughty sites...

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
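For what it's worth, the X-Forwarded-For half of that chain would look
roughly like the sketch below (assuming Squid is built with
--enable-follow-x-forwarded-for; whether tproxy will then spoof that
*indirect* client address on the outgoing connection is exactly the
part I'm unsure about):

  # squid.conf on the Squid instance sitting behind DansGuardian
  # trust X-Forwarded-For only from the DG host (address is illustrative)
  acl dansguardian src 127.0.0.1
  follow_x_forwarded_for allow dansguardian
  follow_x_forwarded_for deny all

  # use the indirect (real) client address for ACL checks and delay pools
  acl_uses_indirect_client on
  delay_pool_uses_indirect_client on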
[squid-users] Time-based ACLs for long connections
I work at a school where we would like to limit bandwidth during
certain times of day (study times), but relax those restrictions at
other times. We're looking into delay pools to shape the traffic, and
time-based ACLs to assign connections to the different pools.

I'm pretty sure my guess about how time-based ACLs work is correct, but
I wanted to confirm before I set this all up and have a major "duh"
moment. Assignment to a delay pool using a time-based ACL would only
occur at the start of the connection, correct? In other words, if I
have ACLs like:

  acl play time MTWHF 15:00-19:59
  acl work time MTWHF 20:00-22:59
  # plus some assignment to delay pools using the "play" and "work" ACLs...

then a user who starts a long download at 19:59 will enjoy the benefits
of the "play" ACL for the duration of the download, while one who
starts at 22:59 will be penalized with the "work" ACL until their
download completes. There is no re-evaluation of the ACL while the
download is in progress that would notice that the time boundary has
been crossed.

Just want to check before I start architecting our solution...

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
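For concreteness, here is roughly how I imagine those ACLs feeding a
pair of class-1 (aggregate) delay pools; the byte rates are purely
illustrative placeholders:

  acl play time MTWHF 15:00-19:59
  acl work time MTWHF 20:00-22:59

  delay_pools 2
  delay_class 1 1
  delay_class 2 1
  # class 1 takes a single aggregate bucket: restore-rate/max (bytes)
  delay_parameters 1 2000000/2000000    # relaxed: ~2 MB/s aggregate
  delay_parameters 2 250000/250000      # study time: ~250 KB/s aggregate
  delay_access 1 allow play
  delay_access 1 deny all
  delay_access 2 allow work
  delay_access 2 deny all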
Re: [squid-users] RE: Squid Question?
On Jan 9, 2010, at 8:54 AM, Amos Jeffries wrote:

> MISS on the local proxy, fetched from a parent peer whose cache/storage
> digest (CD_) claimed to have it over there.

Not to be dense, but I want to make sure I'm interpreting this
correctly (for my cache reports). Since in my multi-instance setup the
frontend (client) and backend (parent) processes are all running on the
same local machine, if *either* the local result code or the hierarchy
code lists "HIT", I should count it as a HIT overall, right?

In other words, a parent HIT is just as good as a child HIT in this
case, since there's no bandwidth cost between the parents and the
children for this particular setup.

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
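A rough way to tally things that way from the frontend's access.log (a
sketch assuming the default native log format, where the result code is
field 4 and the hierarchy/peer code is field 9):

  awk '{ if ($4 ~ /HIT/ || $9 ~ /HIT/) hit++; else miss++ }
       END { printf "HIT %d  MISS %d  (%.1f%% hit)\n",
             hit, miss, 100*hit/(hit+miss) }' access.log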
Re: [squid-users] RE: Squid Question?
I'd like to pile on with a log question of my own... Now that I'm
running multi-instance, I've got some parent servers in the mix. I'm
getting a lot of lines like:

  ... TCP_MISS/200 1494 GET http://example.com - CD_PARENT_HIT/backend-bravo ...

(for many different URLs)

So, is that a MISS, or a HIT? My guess is that the cache digest said it
was a hit, but when we actually fetched it, it ended up being a MISS
(stale, outdated, etc), but I'm not sure.

Additionally, I get:

  ... TCP_MISS/200 2895 GET http://example.com - PARENT_HIT/backend-charlie ...

Is that the same, just with a UDP query instead of a cache digest?
Again, is it a HIT or a MISS?

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
Re: [squid-users] Multiple-Instance Questions
On Jan 5, 2010, at 7:08 PM, Chris Robertson wrote:

> If the requests made to the parents have been normalized by the
> url_rewrite_program, then you shouldn't need a storeurl_rewrite_program
> at all.

I'm thinking about YouTube and other CDN content, where I can't rewrite
the request (since that would make the URL invalid), but I want to map
URLs to a single canonical instance. My theory is that since this
happens just before the content hits the store, it will need to happen
independently on each backend. However, I wasn't quite sure how this
would interact with peering (will the frontend ask the peers for the
real URL, or the storeurl_rewritten version?).

>> - I'm using round-robin for the backend peer selection. Should I
>> switch to carp, or is that overkill when the instances are on the same
>> physical box? What advantages does carp have?
>
> CARP routes a given request to the same parent every time. Since cache
> can't be shared between instances (on the same server or not) this can
> aid caching efficiency.

Sounds like an excellent reason to me. =) I'll definitely make that
change, then.

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
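For reference, a sketch of what switching the frontend's parent
selection from round-robin to CARP might look like (ports and peer
names are illustrative placeholders, not my actual config; the name=
option keeps the peers distinct when they share an address):

  # frontend squid.conf: deterministic parent selection via CARP
  cache_peer 127.0.0.1 parent 3129 0 carp proxy-only no-query name=backend-alpha
  cache_peer 127.0.0.1 parent 3130 0 carp proxy-only no-query name=backend-bravo
  cache_peer 127.0.0.1 parent 3131 0 carp proxy-only no-query name=backend-charlie
  never_direct allow all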
[squid-users] Multiple-Instance Questions
I've been inspired by the recent discussions to set up my own
multiple-instance Squid box. I've got it running with some real
connections in a basic config, and I think I've got most of the general
config issues worked out. I have questions about some of the advanced
tweaking and how it works with multiple instances, and I'm also looking
for a general sanity check on my config...

We have 7 instances of Squid 2.7STABLE running on a single box right
now. 2 are "frontend" (used by clients), and 5 are "backend" (parent
peers to the frontends).

The frontend instances have all the ACLs, delay pools, and access
logging. They are configured for memory-only caching (no storedir), and
max out at 4GB of process space.

The backend instances have a small cache_mem and have the storedirs
defined. They don't do any delay pools, ACLs, or logging (unless I need
debugging).

I'm wondering:

- Do delay pools work properly if they're only on the frontend, or are
  they ignored if a peer handles the request for us? I'm hoping they
  do; otherwise there's no good way to enforce consistent limits when
  spread across 5 backends.

- If I use a rewrite program, should I have it on just the frontends,
  just the backends, or all instances (I'm not sure how the rewrite
  affects peer requests)?

- Similarly, should I use storeurl_rewrite_program on the frontends,
  the backends, or all instances (I suspect it should be "all")?

- I'm using round-robin for the backend peer selection. Should I switch
  to carp, or is that overkill when the instances are on the same
  physical box? What advantages does carp have?

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
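In case it makes the layout clearer, here is a stripped-down sketch of
the frontend/backend split described above (ports, paths, sizes, and
peer names are illustrative placeholders, not my real config):

  ## frontend instance: client-facing, memory-only
  http_port 3128
  cache_mem 2048 MB
  cache_dir null /tmp            # memory-only; needs the "null" store type built in
  cache_peer 127.0.0.1 parent 3129 0 round-robin proxy-only no-query name=backend-alpha
  cache_peer 127.0.0.1 parent 3130 0 round-robin proxy-only no-query name=backend-bravo
  never_direct allow all         # always fetch through the parents
  access_log /var/log/squid/access.log

  ## backend instance: disk cache, no client ACLs or logging
  http_port 3129
  cache_mem 64 MB
  cache_dir aufs /squid/backend-alpha 100000 16 256
  acl frontends src 127.0.0.1
  http_access allow frontends
  http_access deny all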
Re: [squid-users] report of subdomains
On Jan 2, 2010, at 10:06 PM, Guido Marino Lorenzutti wrote:

> Hi people: anyone knows of a log analyzer (like sarg) that joins the
> subdomains in the reports, so you can know how much is consumed by
> domain? Without this it is impossible to know how much is transferred
> in rapidshare, facebook, etc.

I think Calamaris does a "2nd-level domain" report, so you see the
top-N domains (e.g., "apple.com", "facebook.com", etc.). If that isn't
quite what you want, you could probably hack up one of the other
scripts to include the top- and second-level domain name instead of
just the top.

We used to use Calamaris, but have since switched to an in-house script
that provides much of the same functionality. The 2nd-level domain
report is one of our staples...

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
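If anyone wants to roll their own, a quick-and-dirty sketch of a
second-level-domain byte report straight from access.log (assumes the
default native log format, with bytes in field 5 and the URL in field
7):

  awk '{
    host = $7
    sub(/^[a-z]+:\/\//, "", host); sub(/[:\/].*$/, "", host)
    n = split(host, p, ".")
    dom = (n >= 2) ? p[n-1] "." p[n] : host
    bytes[dom] += $5
  }
  END { for (d in bytes) printf "%12d %s\n", bytes[d], d }' access.log \
    | sort -rn | head -20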
Re: [squid-users] Setting up two NICs with Squid/DANSGuardian
On Dec 14, 2009, at 8:47 AM, wrote:

> Another possibility?

If I'm hearing you correctly, it sounds like you want to change the
outgoing address that Squid uses to make connections:

  http://www.squid-cache.org/Doc/config/tcp_outgoing_address/

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
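The directive takes an address and an optional ACL, so a sketch with
made-up addresses would look something like:

  # send traffic from one client subnet out one NIC, everything else
  # out the other (addresses and ACL name are illustrative)
  acl lan_clients src 192.168.1.0/24
  tcp_outgoing_address 10.0.0.5 lan_clients
  tcp_outgoing_address 10.0.0.6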
Re: [squid-users] COSS cache_dir size exceeds largest offset at maximum file size
On Dec 7, 2009, at 5:05 PM, Amos Jeffries wrote:

> Yes, not sure what I was thinking yesterday. (multiplying by block size
> twice, sheesh) The constant is inside Squid. Maximum of 2^25-1 files
> per cache == "largest file offset".
> cache size < largest file offset * block size.
>
> Plus the default 10 COSS in-memory stripes (10MB in your case AFAICT)
> are counted as part of the total cache file space, but not mentioned in
> that mini report.

Ah, thank you. I would have been tearing my hair out wondering why
those last 10 megs didn't work...

I rebuilt using '--with-build-environment=POSIX_V6_ILP32_OFFBIG'
explicitly in the debian rules, and that seems to have brought me back
up to the full 128 gigs (I'll find out for sure once dd finishes paving
out the COSS store, but for now `squid -k check` seems OK with it). The
working line is:

  cache_dir coss /mumble/cosstest 131061 block-size=8192

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
Re: [squid-users] COSS cache_dir size exceeds largest offset at maximum file size
On Dec 7, 2009, at 1:37 AM, Amos Jeffries wrote:

> For some reason the safety check that is catching you uses > instead of
> >= I'm not sure why. If you want to experiment you could change it
> manually and rebuild. Around line 864 of src/fs/coss/store_dir_coss.c.

It looks like I'm off by more than just a single unit:

  2009/12/05 12:16:00| COSS largest file offset = 4194296 KB
  2009/12/05 12:16:00| COSS cache_dir size = 134217728 KB

The largest file I'm able to use is 4095 MB, instead of the 131072 MB
requested. Am I smacking up against some architecture-specific
constant?

Possibly related: I'm just using the standard Debian build process, so
maybe it isn't guessing everything correctly. I want a 32-bit build
with a maximum address size of 4GB and largefile support. Should I be
explicitly passing in something like:

  --with-build-environment=POSIX_V6_ILP32_OFFBIG

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
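For what it's worth, the ceiling being reported works out to exactly
one 8 KB block short of 4 GB, which is what you'd expect if the offset
were being computed as a 32-bit byte quantity; a quick sanity check of
the arithmetic (nothing here is taken from the Squid source, just the
numbers in the log):

  echo $(( 4194296 * 1024 ))      # 4294959104 bytes
  echo $(( (1 << 32) - 8192 ))    # 4294959104 = 2^32 minus one 8 KB block

That would also be consistent with an OFFBIG (64-bit off_t) build
environment making the limit go away.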
[squid-users] COSS cache_dir size exceeds largest offset at maximum file size
Hello,

I'm trying to set up a large COSS cache_dir. I've used COSS in the
past, but I've never tried to max out on the size of the file (I'm not
even sure if this is a good idea...). When I try to start, squid fails
with the following log messages:

  2009/12/05 12:16:00| parse_line: cache_dir coss /squid/small/coss01 131072 block-size=8192 max-size=1048576 maxfullbufs=256
  2009/12/05 12:16:00| COSS block-size = 8192 bytes
  2009/12/05 12:16:00| COSS largest file offset = 4194296 KB
  2009/12/05 12:16:00| COSS cache_dir size = 134217728 KB
  FATAL: COSS cache_dir size exceeds largest offset

I think I'm using the "block-size" param correctly to get to the
maximum size of 131072 MB. However, Squid seems to disagree on the
maximum file offset. I've compiled using --with-large-*, so I'm not
sure what I'm missing here. `squid -v` says:

  configure options: '--prefix=/usr' '--exec_prefix=/usr'
  '--bindir=/usr/sbin' '--sbindir=/usr/sbin' '--libexecdir=/usr/lib/squid'
  '--sysconfdir=/etc/squid' '--localstatedir=/var/spool/squid'
  '--datadir=/usr/share/squid' '--enable-async-io' '--with-pthreads'
  '--enable-storeio=ufs,aufs,coss,diskd,null' '--enable-linux-netfilter'
  '--enable-arp-acl' '--enable-epoll' '--enable-snmp'
  '--enable-removal-policies=lru,heap' '--enable-delay-pools'
  '--enable-htcp' '--enable-cache-digests'
  '--enable-auth=basic,digest,negotiate'
  '--enable-negotiate-auth-helpers=squid_kerb_auth' '--enable-carp'
  '--with-large-files' '--with-maxfd=65536' '--disable-ident-lookups'
  '--enable-follow-x-forwarded-for' '--enable-forw-via-db'
  '--enable-large-cache-files' 'sparc-debian-linux'
  'build_alias=sparc-debian-linux' 'host_alias=sparc-debian-linux'
  'target_alias=sparc-debian-linux'
  'CFLAGS=-Wall -g -O2 -pipe -mcpu=niagara' 'LDFLAGS='
  'CPPFLAGS=-O2 -pipe -mcpu=niagara'

This is squid-2.7STABLE7 running on Debian Linux kernel 2.6.31
(SMP/64bit) on a SunFire T2000 (niagara1).

Thanks,

Jason

--
Jason Healy | jhe...@logn.net | http://www.logn.net/
[squid-users] Squid 2.6 + Debian Etch + tproxy + bridge + transparent proxy
Hey all,

I've been a happy user of Squid for the past 10 years or so, and I'd
like to take a second to thank everyone who has worked so hard to make
such a great piece of software!

I'd like to give back to the Squid community, but unfortunately I'm not
much of a C hacker. However, I'm hoping I can still help. I've just
spent a few days getting my school's Squid install up to date (we were
running 2.5 on Debian Woody). I switched to using tproxy this time
around (we used to do policy routing on our core, but it was spiking
the CPU too much). Thanks to the mailing list, some articles on the
web, and a little messing around, I was able to get the whole system up
and running.

I've documented the steps here:

  http://web.suffieldacademy.org/ils/netadmin/docs/software/squid/

The document is written for someone with a decent grasp of Linux, and
is specifically geared to Debian Etch. There are some tweaks that are
specific to our install (compile-time flags, mostly), but otherwise
it's pretty generic.

Hopefully this will help someone else out who's trying to build a
similar system, so I'm posting so it will hit the archives. Feel free
to adapt, add to the wiki, or mirror if you find it useful. Also, any
corrections are welcome. ;-)

Thanks again for all your efforts!

Jason

--
Jason Healy | [EMAIL PROTECTED] | http://www.logn.net/
Re: [squid-users] Difficulty accessing SWG site
At 1137187311s since epoch (01/13/06 10:21:51 -0500 UTC), Mark Elsen wrote:

> > No, I still get the delay (that was Test #3 in my original message).
>
> Checkout, using squid in your standard mode, what the site return
> headers are with:
>
> http://web-sniffer.net/

Site returns immediately with:

  HTTP Status Code: HTTP/1.1 200 OK
  Date: Mon, 16 Jan 2006 13:32:11 GMT
  Server: Apache/2.0.55 (Unix) mod_jk/1.2.14
  Set-Cookie: LiSESSIONID:swg-0=F66E2965972B33179522039FA98C4121; Path=/
  Pragma: No-cache
  Expires: Thu, 01 Jan 1970 00:00:00 GMT
  Cache-Control: no-cache
  Vary: Accept-Encoding
  Connection: close
  Content-Type: text/html;charset=UTF-8

For comparison, here are the headers returned when I connect through my
transparent proxy (which causes the 30-second delay):

  HTTP Status Code: HTTP/1.0 200 OK
  Date: Mon, 16 Jan 2006 13:47:50 GMT
  Server: Apache/2.0.55 (Unix) mod_jk/1.2.14
  Set-Cookie: LiSESSIONID:swg-0=3F69975ACF35B4AD5A1248C0B9790F08; Path=/
  Pragma: No-cache
  Expires: Thu, 01 Jan 1970 00:00:00 GMT
  Cache-Control: no-cache
  Vary: Accept-Encoding
  Content-Type: text/html;charset=UTF-8
  X-Cache: MISS from proxy.suffieldacademy.org
  X-Cache-Lookup: MISS from proxy.suffieldacademy.org:3128
  Connection: close

So it doesn't look like there's anything weird in the headers...

Jason

--
Jason Healy
http://www.logn.net/
Re: [squid-users] throughput limitation from cache
At 1137150799s since epoch (01/13/06 00:13:19 -0500 UTC), Richard Mittendorfer wrote:

> Also sprach Jason Healy <[EMAIL PROTECTED]> (Thu, 12 Jan 2006
> 22:37:58 -0500 (EST)):
> > What are you using for your speed tests? I'm using wget, so I know
> > there's no browser cache issue.
>
> Originally I do a apt-get (it prints the downloadspeed), certainly wget
> gives me same results.
>
> 100Mb/s FD switched Ethernet too. With quite good performance
> NFS (~9.5MB/s) or FTP.

Running out of ideas here... Have you verified in your cache logs
(access.log) that the files you're downloading are cache HITs? Most of
the Debian mirrors I've used don't do much better than ~250KB/s if
they're busy. If for some reason you weren't loading a cached version,
you might just be seeing the max download speed from the mirror.

Jason

--
Jason Healy
http://www.logn.net/
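A quick way to check, assuming the default log location and that the
test file comes from a Debian mirror, would be something along these
lines:

  # summarize the result codes for the test URL's domain; cached fetches
  # should show up as TCP_HIT / TCP_MEM_HIT rather than TCP_MISS
  grep debian.org /var/log/squid/access.log | awk '{print $4}' | sort | uniq -c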
Re: [squid-users] Difficulty accessing SWG site
At 1137154743s since epoch (01/13/06 01:19:03 -0500 UTC), Mark Elsen wrote:

> Does it work, when the user uses SQUID directly, through proxy
> settings in the browser?

No, I still get the delay (that was Test #3 in my original message).

Jason

--
Jason Healy
http://www.logn.net/
Re: [squid-users] throughput limitation from cache
At 1137142598s since epoch (01/12/06 21:56:38 -0500 UTC), Richard Mittendorfer wrote:

> Well, can't reach this here. Cached ~260KB/s. And I'm quite sure the
> file was still in the linux disk cache. What does your cache_dir look
> like? aufs I assume.

27GB on our root filesystem:

  cache_dir aufs /var/spool/squid 27000 16 256

> Since you've got plenty of RAM - maybe this is the reason?

Could be; I've given 1GB of RAM to cache_mem, so I should be holding a
fair amount of stuff. Additionally, I've tuned the swap to a large
cache size. Finally, the maximum_object_size is 160MB to make sure I'm
keeping large OS updates on hand (saves us a lot of time for OS
patches). I've set the "swappiness" of the linux kernel to a low level,
to prevent it from swapping too aggressively.

> Or is it somehow auto-tuned by download speeds? Can't think of
> anything else.

What are you using for your speed tests? I'm using wget, so I know
there's no browser cache issue.

Also, what network are you testing on? If I use our 802.11 wireless
network, I can't get much over 280-300KB/s, because I start to bump up
against the 11Mb/s limit of WiFi. The 2MB/s rate I quoted you earlier
was over switched 100Mb/s ethernet.

> The IDE HDs that carry the storage are spun down most of the time.

You don't think you're waiting on spinup for some of these requests, do
you? What's the largest file you've downloaded? If it's small enough,
any spinup time might adversely affect your average bitrate.

HTH,

Jason

--
Jason Healy
http://www.logn.net/
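For reference, the knobs discussed above would look something like this
in config form (the values just mirror what I described; the swappiness
number is an illustrative placeholder):

  # squid.conf
  cache_dir aufs /var/spool/squid 27000 16 256
  cache_mem 1024 MB
  maximum_object_size 163840 KB    # 160 MB, so large OS updates stay cached

  # kernel side: make the VM less eager to swap
  sysctl -w vm.swappiness=10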
Re: [squid-users] throughput limitation from cache
At 1137138557s since epoch (01/12/06 20:49:17 -0500 UTC), Richard Mittendorfer wrote:

> It's the same even if I'm the only client and it's one big file that's
> retrieved, so it must be some kind of internal limit. I have to look
> into the source, maybe I can find it hardcoded somewhere. 256kB/s looks
> so artificial ;)

Not too sure about that. I just downloaded a non-cached file through
our proxy and got about 270KB/s (this is the busiest time of day for
us, though). I know I've done better than that when it's quiet. If I
turn around and request the same file again (now that it's cached), I'm
pulling >2.0MB/s without any trouble.

We're on a P3-850MHz with 1.5GB RAM and 30GB SCSI RAID1. I'm hoping to
upgrade by the end of the month. ;-)

> Had a look at it. Doesn't look like debian's squid is compiled with
> async-io. ...hmm, sure, debian's is async-io. Must be. aufs _is_
> compiled in: --enable-storeio=ufs,aufs,diskd,null

Confirmed that it has it. We're on a stock config of Debian 3.1:

  # squid -v
  Squid Cache: Version 2.5.STABLE9
  configure options: --enable-async-io --with-pthreads
    --enable-storeio=ufs,aufs,diskd,null --enable-linux-netfilter
    --enable-arp-acl --enable-removal-policies=lru,heap --enable-snmp
    --enable-delay-pools --enable-htcp --enable-poll
    --enable-cache-digests --enable-underscores --enable-referer-log
    --enable-useragent-log --enable-auth=basic,digest,ntlm --enable-carp
    --with-large-files i386-debian-linux

Jason

--
Jason Healy
http://www.logn.net/
[squid-users] Difficulty accessing SWG site
We run a transparent squid proxy (2.5.STABLE9). Recently I've received
complaints about one site in particular: the Star Wars Galaxies site
(we're a high school, if that helps explain anything). It takes a lot
longer to load than any other site, and I'm at a loss to explain why.

I have packet traces (which I can post if they'll help) which appear to
show that when Squid attempts the connection to the server, it stalls
and re-sends the initial request several times. My firewall appears to
disregard these additional packets. After 10-30 seconds, the request
does make it through, and then the page downloads regularly.

Since it's only this one site, I'm fairly confident that it isn't a
general issue with my configuration. To help isolate the issue, I've
attempted the following:

- Connect straight through the firewall, with no transparent proxy. The
  connection works fine, with no delay. This shows that taking the
  squid box out of the equation resolves the problem.

- Connect to the server from the squid box, but do not route the
  request through squid. The connection works fine. This shows that
  taking the squid process out of the equation resolves the problem.

- Connect through squid, but using "real" proxy settings (instead of
  interception). Site loading shows the delay. This shows that *any*
  type of squid proxying (not just interception) causes the issue.

Any suggestions on what I should try next? The URL I'm trying to hit is:

  http://forums.station.sony.com/swg/

My only guesses are some kind of weird TCP options, or the server not
liking extra headers added in by the squid server. I'm not saying that
squid is doing anything wrong (I have no problem telling the users if
the web server is behaving badly), but I need some kind of proof to
back it up.

TIA,

Jason

--
Jason Healy
http://www.logn.net/
Re: [squid-users] httpReadReply: Request not yet fully sent / POST problems?
At 1133813601s since epoch (12/05/05 09:13:21 -0500 UTC), Mark Elsen wrote:

> > I have users that are complaining of failed uploads, mostly to
> > picture sites (ofoto, shutterfly, snapfish), but also to other sites
> > (facebook.com, myspace, etc). Checking the cache logs, the only
> > suspicious activity are lines that look like this:
> >
> > (example of a Costco-branded snapfish upload)
> >
> > 2005/12/02 14:09:43| httpReadReply: Request not yet fully sent "POST
> > http://64.147.178.206/uploadimagebasic.suup?authcode=&
> > HOST_NAME=http://www.costcophotocenter.com";
>
> Transparent proxy setups may lead to your squid host deciding
> (or getting confused w.r.t.) incorrect PMTU value(s).
>
> Does it work when the user is set to use the proxy directly,
> by using proxy settings in the browser?

Excellent call; that does fix the issue!

If you don't mind my asking, where should I begin looking to correct
this issue? The firewall/router that's doing the redirection is an
OpenBSD box running pf, and the squid box is Debian Stable with IP
forwarding turned on to allow the transparent proxying
(net.ipv4.ip_forward=1).

I notice that Squid 3 has a setting to address this issue, and it seems
to pertain to my setup (squid is not handling connection tracking; the
OpenBSD box is). I'd prefer to stick with a stable release; is there
any corrective action for this under 2.5 (such as disabling PMTU
discovery on the proxy machine entirely)?

Thanks,

Jason

--
Jason Healy
http://www.logn.net/
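Two blunt workarounds that come up for this kind of PMTU problem (both
untested here, and the MSS value is just an illustrative number):

  # on the Debian squid box: turn off path-MTU discovery entirely
  sysctl -w net.ipv4.ip_no_pmtu_disc=1

  # or, in pf.conf on the OpenBSD box doing the redirection: clamp the MSS
  scrub in all max-mss 1440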
[squid-users] httpReadReply: Request not yet fully sent / POST problems?
Greetings!

I have a Debian Stable machine running the stock version of Squid
(2.5.STABLE9). It's set up as a transparent cache, with all traffic
over TCP port 80 redirected to the proxy. Additionally, we run adzap as
a redirector.

I have users that are complaining of failed uploads, mostly to picture
sites (ofoto, shutterfly, snapfish), but also to other sites
(facebook.com, myspace, etc). Checking the cache logs, the only
suspicious activity are lines that look like this:

(example of a Costco-branded snapfish upload)

  2005/12/02 14:09:43| httpReadReply: Request not yet fully sent "POST
  http://64.147.178.206/uploadimagebasic.suup?authcode=&
  HOST_NAME=http://www.costcophotocenter.com";

I searched the FAQ and the list for that error message, and all I could
find was this:

  http://marc.theaimsgroup.com/?l=squid-users&m=111887530904985&w=2

That message suggests the error can be safely ignored, but there seems
to be a correlation between the messages and my failed uploads. My
question is: is this indicative of the error I'm seeing, or should I be
looking elsewhere for the solution to my POST problems?

Stuff I've tried so far:

- Verified that POSTs work when I drop the proxy out of the equation
  (turn off the transparent redirect through the squid box).

- Turned off all redirectors (through adzapper) under Squid so it's
  just a straight caching proxy; problem still persists.

- Turned on explicit DIRECT for all POST method connections (pretty
  sure this doesn't change anything, but I'm just trying stuff out
  before I waste your time); problem still persists.

- Double-checked my "request_body_max_size" and other limit params to
  make sure I'm not killing the connections on my end; everything is
  set to "unlimited" (I can post my whole config if people would like).

- Sniffed some traffic on the wire, but didn't see anything immediately
  wrong. It looks like an initial burst of traffic during the POST,
  which quickly slows to a trickle. Eventually, the connection dies (or
  the user gives up), and the upload is aborted. I'm not sure why
  things would taper off this way; I'm not using delay pools, and my
  upstream bandwidth isn't a problem.

Any suggestions would be most welcome.

Thanks,

Jason

--
Jason Healy
http://www.logn.net/