Re: squid-prefetching status
Jon Kay wrote:

> Nick Lewycky wrote:
>
>> Finally, does anyone have suggestions for how to test for performance
>> improvement due to prefetching?
>
> A good way to test how your algorithms are working is to get a nice, long
> actual Squid workload -- eg, URLs fetched -- and compare how long it takes
> to execute the whole thing with and without prefetching.

That's a very good plan. Does anyone have recent logs publicly available?

I have some IRCache logs for May 31, 2004 -- but when I tried the first
5,000 entries, I found that 87% of the prefetches weren't fetched later in
the log. I think this is mostly because the pages changed after that date,
and also because of filtering effects from client caching. What I'd really
like to have is a way to look at page load times instead of running through
individual URLs.

> Note that you generally have to prefetch a LOT of stuff to get much
> improvement, because web cache fetch popularity follows Zipf's law and
> decays slowly.

I hadn't heard of Zipf's law. It's interesting; thank you for introducing
me to it! Just to make certain I understand what you're saying: you're
noting that I need a lot of log data to test with, because most fetches
fall in the working set where prefetching won't help, so I need a large
number of cache misses?

> Good luck with your work.

Thank you!

Nick Lewycky
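The waste Nick measured (87% of prefetches never requested later) comes down to a simple set computation over the log. A minimal sketch, with invented URLs rather than real IRCache entries:

```python
# Sketch (not part of Squid): estimate what fraction of prefetch
# candidates actually show up in a recorded request trace. The URLs
# below are made up for illustration.

def prefetch_accuracy(trace, prefetched):
    """Fraction of prefetched URLs that appear anywhere in the trace."""
    requested = set(trace)
    used = sum(1 for url in prefetched if url in requested)
    return used / len(prefetched) if prefetched else 0.0

trace = ["/index.html", "/style.css", "/logo.png", "/about.html"]
prefetched = ["/style.css", "/logo.png", "/banner.gif", "/old.js"]

# 2 of 4 prefetches were later requested -> 0.5, i.e. 50% of the
# prefetch bandwidth was wasted (Nick measured 87% waste on his log).
print(prefetch_accuracy(trace, prefetched))
```

A real measurement would also have to respect ordering (a prefetch only counts if the client request comes *after* it) and expiry, which this sketch ignores.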
Re: cvs commit: squid3/include Range.h
On Sat, 14 May 2005, Serassio Guido wrote:

> With an empty "port" acl, Squid crashes when dumping configuration in
> cachemgr:
>
> and non-empty "port" acls are not working: this is the dump output of
> the default squid.conf:

Fixed. Got a condition the wrong way around again (the ListIterator eof
test was negated).

Regards
Henrik
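A negated eof test explains both symptoms Guido reported. A hypothetical mock (illustrative names, not Squid's actual ListIterator API):

```python
# Hypothetical mock of the bug class Henrik describes: an eof() test
# used with the wrong polarity. Not Squid's real ListIterator code.

class ListIterator:
    def __init__(self, items):
        self.items = items
        self.pos = 0

    def eof(self):
        return self.pos >= len(self.items)

    def next(self):
        item = self.items[self.pos]
        self.pos += 1
        return item

def dump_ports_buggy(acl_ports):
    it = ListIterator(acl_ports)
    out = []
    while it.eof():            # negated: for a non-empty acl the body
        out.append(it.next())  # never runs, so nothing is dumped; for
    return out                 # an empty acl it reads past the end

def dump_ports_fixed(acl_ports):
    it = ListIterator(acl_ports)
    out = []
    while not it.eof():        # correct polarity
        out.append(it.next())
    return out

print(dump_ports_buggy([443, 563]))  # [] -- "acl SSL_ports port" with no values
print(dump_ports_fixed([443, 563]))  # [443, 563]
```

With the polarity inverted, a non-empty acl dumps nothing (matching the empty `acl SSL_ports port` lines in Guido's dump), and an empty acl walks one element past the end (matching the SIGSEGV).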
Re: squid-prefetching status
Nick Lewycky wrote:

> Hi. I've been working to add prefetching to squid3. It works by
> analyzing HTML and looking for various tags that a graphical browser can
> be expected to request.
>
> So far, it seems to just-barely work. What works is checking the
> content-type of the document, avoiding encoded (gzip'ed) documents,
> analyzing the HTML using libxml2 in "tag soup" mode, resolving the full
> URL from relative references, and fetching the files into the cache. (I
> would, of course, appreciate code reviews of the branch before I diverge
> too far!)
>
> However, I've run into a few problems.
>
> To prefetch a page, we call clientBeginRequest. I've already had to
> extend the richness of this interface a little. The main problem is that
> it will open up a new socket for each call. On a page with 100
> prefetchables, it will open 100 TCP connections to the remote server.
> That's not nice. I need a way to re-use a connection for multiple
> requests. How should I do this? I'd like clientBeginRequest to be smart
> enough to handle this behind the scenes.
>
> Occasionally I see duplicate prefetches. I think what's going on here is
> that the object is uncacheable. The only way I can think of solving this
> is by adding an "uncacheable" entry type to the store -- but that just
> seems wrong, conceptually. On a related note, maybe we could terminate a
> prefetch as soon as we receive the headers and notice that it's
> uncacheable. Currently, we download the whole thing and just discard it
> (after analyzing it for more prefetchables if it's HTML).
>
> Finally, does anyone have suggestions for how to test for performance
> improvement due to prefetching?

A good way to test how your algorithms are working is to get a nice, long
actual Squid workload -- eg, URLs fetched -- and compare how long it takes
to execute the whole thing with and without prefetching.
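The replay comparison Jon suggests can be prototyped before touching real traffic. A toy simulation (not a real replay harness; the latency constants and URLs are invented for illustration):

```python
# Toy replay simulation: total up fetch latency for a recorded URL
# sequence, with and without a set of prefetched objects already in
# the cache. HIT_COST and MISS_COST are assumed values, not measured.

HIT_COST = 0.005   # seconds for a cache hit (assumed)
MISS_COST = 0.200  # seconds for an origin-server fetch (assumed)

def replay(trace, prefetched=frozenset()):
    cache = set(prefetched)
    total = 0.0
    for url in trace:
        if url in cache:
            total += HIT_COST
        else:
            total += MISS_COST
            cache.add(url)  # first miss fills the cache
    return total

trace = ["/a.html", "/a.css", "/a.png", "/b.html", "/a.css"]

baseline = replay(trace)
with_prefetch = replay(trace, prefetched={"/a.css", "/a.png"})
print(baseline > with_prefetch)  # prefetching turned two misses into hits
```

A real harness would replay against a live Squid instance and measure wall-clock time, but even this toy version makes the trade-off visible: prefetching only wins when the prefetched objects would otherwise have been misses.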
Note that you generally have to prefetch a LOT of stuff to get much
improvement, because web cache fetch popularity follows Zipf's law and
decays slowly.

Good luck with your work.

Jon
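Jon's point about Zipf's law can be sketched numerically: request mass is concentrated on a few hot objects (which the cache already holds), and the tail where prefetching could help decays very slowly. The population size and exponent below are arbitrary choices for illustration:

```python
# Numerical sketch of Zipf-distributed object popularity. N and the
# exponent s are assumptions, not measurements from any real trace.

N = 100_000  # distinct objects (assumed)
s = 1.0      # classic Zipf exponent

weights = [1.0 / (rank ** s) for rank in range((1), N + 1)]
total = sum(weights)

def coverage(top_k):
    """Fraction of all requests that go to the top_k hottest objects."""
    return sum(weights[:top_k]) / total

# Even the 10,000 hottest objects leave a substantial tail of misses,
# which is why prefetching must cover a LOT of objects to matter.
print(f"top 100:    {coverage(100):.2f}")
print(f"top 1,000:  {coverage(1_000):.2f}")
print(f"top 10,000: {coverage(10_000):.2f}")
```

Under these assumptions the top 100 objects draw well under half the requests, and going from 1,000 to 10,000 cached objects still only picks up a modest extra slice — the slow decay Jon describes.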
Re: cvs commit: squid3/include Range.h
Hi Henrik,

At 01.28 09/05/2005, [EMAIL PROTECTED] wrote:

> hno         2005/05/08 17:28:06 MDT
>
>   Modified files:
>     include              Range.h
>   Log:
>   const correctness
>
>   Revision  Changes    Path
>   1.6       +2 -2      squid3/include/Range.h

With an empty "port" acl, Squid crashes when dumping configuration in
cachemgr:

2005/05/14 18:59:10| Warning: empty ACL: acl bad_port port

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 31519)]
0x08056a37 in Range::size (this=0x4) at Range.h:80
80          return end > start ? end - start : 0;
(gdb) backtrace
#0  0x08056a37 in Range::size (this=0x4) at Range.h:80
#1  0x08059d94 in ACLStrategised::dump (this=0x4) at ACLStrategised.h:166
#2  0x08051357 in ACL::dumpGeneric (this=0x0) at acl.cc:563
#3  0x08065bce in dump_acl (entry=0x4068c8c0, name=0x816b2b6 "acl", ae=0x4)
    at cache_cf.cc:811
#4  0x0806cfd5 in dump_config (entry=0x4068c8c0) at cf_parser.h:1620
#5  0x0807138d in cachemgrStart (fd=140053524, request=0x85940d8,
    entry=0x4068c8c0) at cache_manager.cc:332

And non-empty "port" acls are not working; this is the dump output of the
default squid.conf:

acl to_localhost dst 127.0.0.0/255.0.0.0
acl SSL_ports port
acl Safe_ports port
acl CONNECT method CONNECT

Regards

Guido

-
Guido Serassio
Acme Consulting S.r.l. - Microsoft Certified Partner
Via Lucia Savarino, 1
10098 - Rivoli (TO) - ITALY
Tel. : +39.011.9530135  Fax. : +39.011.9781115
Email: [EMAIL PROTECTED]
WWW: http://www.acmeconsulting.it/
Re: squid 2.5 with icap (fwd)
Henrik Nordstrom wrote:

> Hello Henrik,
>
> I don't know who is responsible for icapclient development in squid; if
> it is not you, please forward this.
>
> We have been using squid with icap support, and we found the following
> problem in the squid icap client: when an HTTP server sends a response
> to squid without an HTTP header (per HTTP/0.9), squid makes a malformed
> icap request, so an icap server cannot parse it. The HTTP part of squid
> handles such responses correctly, but the icap client does not.
> Unfortunately, there are still HTTP servers that use this old protocol.

Yes, the problem exists.

> Sorry, the private information in the icap request is skipped.
>
> We made a patch to fix this problem, and would like somebody to add it
> to the squid icap development branch.

I do not want to apply it; there are things that I do not like in this
patch. But I am going to make my own solution (maybe based on their
patch).

--
Christos
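The detection the icap client is missing is small: an HTTP/0.9 response has no status line or headers at all, the server just sends the body. A minimal sketch (illustrative helper, not Squid's actual icap client code):

```python
# Sketch of detecting an HTTP/0.9 response. HTTP/1.x responses begin
# with a status line like "HTTP/1.1 200 OK"; HTTP/0.9 responses are
# body-only. The helper name and samples are invented for illustration.

def is_http09_response(raw: bytes) -> bool:
    """True if the response lacks an HTTP/1.x status line (HTTP/0.9)."""
    return not raw.startswith(b"HTTP/")

modern = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
ancient = b"<html></html>"  # HTTP/0.9: body only, no headers

print(is_http09_response(modern))   # False
print(is_http09_response(ancient))  # True
```

Once detected, the client would presumably have to encapsulate the body without a (nonexistent) header section, rather than misparsing the body's first bytes as headers; the specifics of that fix belong to the patch under discussion.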