On 29 November 2010 10:11, Domas Mituzas <midom.li...@gmail.com> wrote:
>> The sampled 1/1000 squid logs can be used for statistical purposes, such as
>> page view stats. Someone more techy can answer that better than I can, if
>> the samples include IP addresses that could be used w/ geoip for geographic
>> analysis. (I think perhaps not)
>
> we do aggregations on full sample, not 1/1000
> 1/1000 gets saved to a file for post-mortems and "wtf is going on" type of
> analysis.

Ah, that explains it - I was wondering how we could get something as
precise as "three views one day, five the next" out of a 1/1000
sample!

So am I right in assuming that what happens is:

1) page request comes in and is served
2) every thousandth request is sent to a separate file and logged
3) the rest are stripped of all data bar "X page requested"
4) this is kept for the pageview statistics, which are very fine-grained

The end result: one file with 0.1% of requests logged in detail and
another file with "hit counts" and no more.

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk

_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
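
For illustration only, the flow guessed at in steps 1) to 4) above could be sketched roughly as follows. This is not the actual Wikimedia code: the function name `process_requests`, the dict keys "page" and "ip", and the in-memory lists are all assumptions made up for the example. Following Domas's remark that aggregation runs on the full stream, the sketch counts every request, including the sampled ones.

```python
from collections import Counter

SAMPLE_EVERY = 1000  # the 1/1000 rate mentioned in the thread


def process_requests(requests, sample_every=SAMPLE_EVERY):
    """Split a request stream into a detailed sample and per-page hit counts.

    `requests` is an iterable of dicts with hypothetical keys such as
    "page" and "ip"; the real squid log fields are not given in the thread.
    """
    sampled = []            # every Nth request, kept with all of its fields
    hit_counts = Counter()  # page name -> number of requests seen

    for n, request in enumerate(requests, start=1):
        if n % sample_every == 0:
            # Detailed copy, for post-mortem style analysis.
            sampled.append(dict(request))
        # Only the page name survives in the aggregate counts.
        hit_counts[request["page"]] += 1

    return sampled, hit_counts
```

Fed 3000 requests, this would keep 3 detailed records and a count per page that sums to 3000, which is why the page view figures can be exact even though the detailed log is only a 0.1% sample.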