Just to clarify, will this affect the stats at http://dumps.wikimedia.org/other/pagecounts-raw/ ? Changing the format of that will probably break third party scripts.

--
-bawolff
On Fri, Jan 25, 2013 at 1:41 PM, Diederik van Liere <dvanli...@wikimedia.org> wrote:
> Apologies for crossposting
>
> Heya,
>
> The Analytics Team is planning to deploy "tab as field delimiter" to
> replace the current space as field delimiter on the varnish/squid/nginx
> servers. We would like to do this on February 1st. The reason for this
> change is that we need to have a consistent number of fields in each
> webrequest log line. Right now, some fields contain spaces, which requires
> a lot of post-processing cleanup and slows down the generation of reports.
>
> What is affected and maintained by Analytics
>
> * udp-filter already has support for the tab character
> * webstatscollector: we compiled a new version of filter to add support for
>   the tab character
> * wikistats: we will fix the scripts on an ongoing basis.
> * udp2log: we have a patch ready for inserting sequence numbers separated
>   by tab.
>
> In particular, I would like to have feedback on three questions:
>
> 1) Are there important reasons not to use tab as field delimiter?
>
> 2) Are there important pieces of logging that expect a space instead of a
> tab and that need to be fixed and that I did not mention in this email?
>
> 3) Is February 1st a good date to deploy this change? (Assuming that all
> preps are finished)
>
>
> Best,
>
> Diederik
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
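
For anyone wondering why the field count is inconsistent with a space delimiter, here is a rough sketch of the problem described in the quoted mail. The log line, its field order, and the host name are invented for illustration and are not the actual webrequest format; the only point is that a field containing spaces (a user agent here) inflates the count when splitting on space but not when splitting on tab.

```python
# Hypothetical webrequest-style record. The field layout and values are
# assumptions made up for this example, not the real log format.
FIELDS = [
    "cp1001.example",                          # host name (invented)
    "12345",                                   # sequence number
    "2013-01-25T13:41:00",                     # timestamp
    "GET",                                     # request method
    "http://en.wikipedia.org/wiki/Main_Page",  # URL
    "Mozilla/5.0 (X11; Linux x86_64)",         # user agent, contains spaces
]


def join_fields(fields, delimiter):
    """Serialize one record using the given field delimiter."""
    return delimiter.join(fields)


def count_fields(line, delimiter):
    """Count the fields a naive consumer sees when splitting on the delimiter."""
    return len(line.split(delimiter))


space_line = join_fields(FIELDS, " ")
tab_line = join_fields(FIELDS, "\t")

# Splitting on space breaks the user agent into several pieces, so the count
# depends on the content of the line rather than on the format.
print(count_fields(space_line, " "))

# Splitting on tab gives one piece per field, as long as no field contains a tab.
print(count_fields(tab_line, "\t"))
```

With the sample record above, the space-delimited line splits into more pieces than there are fields, while the tab-delimited line splits into exactly `len(FIELDS)` pieces. A script that currently does `line.split(" ")` on these logs would need to switch to `line.split("\t")` after the change.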