On Sat, 14 May 2005, Nick Lewycky wrote:

That's a very good plan. Does anyone have recent logs publicly
available? I have some IRCache logs for the day of May 31, 2004 -- but
when I tried the first 5,000 entries, I found that 87% of the prefetches
weren't fetched later in the log. I think this is mostly because the
pages changed after that date and also because of filtering effects from
client caching.

The IRCache logs are second-level cache logs, so they are not directly suitable for prefetching simulation.


There are (or were) other logs available. You need a log that shows end-user requests filtered only by client caching, and that includes referer information so you can reconstruct the content and its relationships somewhat. Using these logs as "real life" without simulated content is a bit hard, as the available content has changed a lot.

I suspect your best bet is to use the available leaf cache logs (not the IRCache logs) to build a view of the content model seen at the proxy, then use this as input to build simulated content where you can run repeated measurements both with and without prefetching. The next step, if you are happy with the results, would be real-life testing with the proxy instrumented to log a little more information than usual (referer, partial hit info, time since prefetch if prefetched, and whatever else you may find useful).
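
As a rough illustration of the first step, here is a minimal sketch in Python. It assumes the log has already been parsed into (client, url, referer) tuples in request order; that record layout and the function names build_content_model and prefetch_usage are made up for this example and do not correspond to any real log format or tool. It treats every URL requested with a given referer as "linked" from that page, then replays the log with a naive policy that prefetches everything linked from each requested page.

```python
from collections import defaultdict


def build_content_model(records):
    """Map each referer URL to the set of URLs requested with that referer.

    records: iterable of (client, url, referer) tuples in log order.
    This tuple layout is an assumption made for the sketch.
    """
    model = defaultdict(set)
    for _client, url, referer in records:
        if referer and referer != url:
            model[referer].add(url)
    return dict(model)


def prefetch_usage(records, model):
    """Replay the log with a naive 'prefetch every linked URL' policy.

    Returns (issued, used): how many prefetches were issued, and how many
    of those were later requested by the same client.
    """
    pending = defaultdict(set)  # client -> prefetched URLs not yet requested
    issued = 0
    used = 0
    for client, url, _referer in records:
        if url in pending[client]:
            pending[client].discard(url)
            used += 1
        for linked in model.get(url, ()):
            if linked not in pending[client]:
                pending[client].add(linked)
                issued += 1
    return issued, used


if __name__ == "__main__":
    log = [
        ("a", "/index", ""),
        ("a", "/img.png", "/index"),
        ("b", "/index", ""),
        ("b", "/page2", "/index"),
    ]
    model = build_content_model(log)
    print(prefetch_usage(log, model))
```

Building the model and replaying against the same log gives an optimistic figure. For a real measurement you would want to build the model from one part of the log and replay another, or better, generate simulated content from the model as described above.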

What I'd really like to have is a way to look at the page load times
instead of running through individual URLs.

Page load times are a bit hard to measure at the proxy, as they depend on a number of factors not easily seen from proxy logs or even proxy traffic.


Regards
Henrik
