Just to clarify, will this affect the stats at
http://dumps.wikimedia.org/other/pagecounts-raw/ ? Changing the format
of those files will probably break third-party scripts.
--
-bawolff


On Fri, Jan 25, 2013 at 1:41 PM, Diederik van Liere
<dvanli...@wikimedia.org> wrote:
> Apologies for crossposting
>
> Heya,
>
> The Analytics Team is planning to deploy "tab as field delimiter" to
> replace the current space as field delimiter on the varnish/squid/nginx
> servers. We would like to do this on February 1st. The reason for this
> change is that we need to have a consistent number of fields in each
> webrequest log line. Right now, some fields contain spaces, which requires
> a lot of post-processing cleanup and slows down the generation of reports.
>
> What is affected and maintained by Analytics
>
> * udp-filter already has support for the tab character
> * webstatscollector: we compiled a new version of filter to add support for
> the tab character
> * wikistats: we will fix the scripts on an ongoing basis.
> * udp2log: we have a patch ready for inserting sequence numbers separated
> by tab.
>
> In particular, I would like to have feedback on three questions:
>
> 1) Are there important reasons not to use tab as field delimiter?
>
> 2) Are there important pieces of logging that expect a space instead of a
> tab, that need to be fixed, and that I did not mention in this email?
>
> 3) Is February 1st a good date to deploy this change? (Assuming that all
> preps are finished)
>
>
> Best,
>
> Diederik
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
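
For anyone checking their own scripts, here is a minimal sketch of the
field-count problem described in the quoted message. The log line is
made up for illustration (the hostname, sequence number, timestamp and
user agent are all hypothetical, and this is not the real webrequest
field layout); it only shows why a field that itself contains spaces
gives an inconsistent field count with a space delimiter but not with a
tab delimiter.

```python
# Hypothetical field values, NOT the real webrequest log layout.
# The last field (a user agent) contains spaces of its own.
fields = [
    "cp1001.example",
    "12345",
    "2013-01-25T13:41:00",
    "Mozilla/5.0 (X11; Linux x86_64)",
]

# The same record written with each delimiter.
space_line = " ".join(fields)
tab_line = "\t".join(fields)

# Splitting on a space also splits inside the user agent,
# so the number of pieces no longer matches the number of fields.
space_count = len(space_line.split(" "))

# Splitting on a tab recovers exactly the original fields,
# as long as no field contains a tab itself.
tab_count = len(tab_line.split("\t"))

print(len(fields), space_count, tab_count)
```

A script that currently splits on a single space would need to split on
"\t" after the change. Note that this only helps if no field value can
contain a tab, which is an assumption here and not something stated in
the announcement.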
