On 29 November 2010 10:11, Domas Mituzas <midom.li...@gmail.com> wrote:
>> The sampled 1/1000 squid logs can be used for statistical purposes, such as
>> page view stats.  Someone more techy can answer that better than I can, if
>> the samples include IP addresses that could be used w/ geoip for geographic
>> analysis. (I think perhaps not)
>
> we do aggregations on full sample, not 1/1000
> 1/1000 gets saved to a file for post-mortems and "wtf is going on" type of 
> analysis.

Ah, that explains it - I was wondering how we could get something as
precise as "three views one day, five the next" out of a 1/1000
sample! So am I right in assuming that what happens is:

1) a page request comes in and is served
2) every thousandth request is logged in full to a separate file
3) the rest are stripped of all data bar "X page requested"
4) these stripped records are kept for the pageview statistics, which
   are very fine-grained

Is the end result then one file with 0.1% of requests logged in detail
and another file with "hit counts" and no more?
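
To make sure I have the flow right, here is a rough Python sketch of what
I am picturing. It is only my guess at the logic, not the actual squid
log processing code: the function name, the (page, ip) tuple layout and
the in-memory lists are all made up for illustration. I have counted
every request in the hit counts, going by Domas's "aggregations on full
sample", rather than only the ones that miss the 1/1000 sample.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical request record: (page title, client IP).
Request = Tuple[str, str]

SAMPLE_RATE = 1000  # 1 in 1000, the figure quoted in this thread


def process_requests(
    requests: Iterable[Request], sample_rate: int = SAMPLE_RATE
) -> Tuple[Counter, List[Request]]:
    """Return (per-page hit counts, detailed sample of requests).

    Assumption: every request is counted, and every `sample_rate`-th
    request is additionally kept with its full details.
    """
    hit_counts: Counter = Counter()
    sampled: List[Request] = []

    for i, (page, ip) in enumerate(requests, start=1):
        # Full aggregation: only "page X was requested" is retained.
        hit_counts[page] += 1

        # Detailed sample: keep the whole record for later debugging.
        if i % sample_rate == 0:
            sampled.append((page, ip))

    return hit_counts, sampled


if __name__ == "__main__":
    # 192.0.2.x addresses are reserved for documentation examples.
    demo = [("Main_Page", "192.0.2.1")] * 1999 + [("Foo", "192.0.2.2")]
    counts, sample = process_requests(demo)
    print(counts["Main_Page"], counts["Foo"], len(sample))
```

If that is roughly right, the counts stay exact down to single views,
while only the sampled records would carry anything like an IP address.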

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk

_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
