On Tue, Aug 11, 2015 at 12:29 PM, Jon Katz <jk...@wikimedia.org> wrote:
> However, it seems that >90% of the clicks are coming from the article
> table (or adding search created bloat) and
> MobileWebUIClickTracking_10742159 is now approaching 300 GB. Mostly this is
> due to search. I would encourage further sampling, but that would mean that
> beta data would be lost. Perhaps we can split it into separate beta/stable
> tables and then sample stable? Any other ideas?

Add a samplingRatio field to the schema, add a PHP global to control the sampling ratio, and set it via operations/mediawiki-config appropriately for each site. Then, in the SQL query used for the dashboards, replace count(*) with sum(event_samplingRatio). That way each site can sample at its own rate (beta can stay unsampled) and the dashboard totals remain comparable. We did that for MediaViewer and it worked great.

Also, if your main concern is table size (for us it was mainly server load), you can just run a script periodically to replace the user agent and the URL with an empty string. Those probably take up most of the storage space; every other field is fairly short.
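To make the two suggestions concrete, here is a rough Python sketch using an in-memory SQLite table. It is only an illustration under assumptions, not the MediaViewer code: the table name `clicks`, the columns `timestamp`, `userAgent` and `event_url`, and the helper names are all made up for the example, and it assumes samplingRatio is stored as N for "log 1 in N events", which is what makes sum(event_samplingRatio) an estimate of the true event count.

```python
import random
import sqlite3
from typing import Tuple

# Hypothetical table layout for illustration; the real EventLogging
# table for MobileWebUIClickTracking may look different.
SCHEMA = """
CREATE TABLE clicks (
    timestamp TEXT,
    userAgent TEXT,
    event_url TEXT,
    event_samplingRatio INTEGER
)
"""


def should_log(sampling_ratio: int, rng: random.Random) -> bool:
    """Log roughly 1 in `sampling_ratio` events (1 means log everything)."""
    return rng.randrange(sampling_ratio) == 0


def simulate(true_events: int, sampling_ratio: int, seed: int = 0) -> Tuple[int, int]:
    """Return (rows stored, estimated true count) for a sampled event stream."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    for _ in range(true_events):
        if should_log(sampling_ratio, rng):
            # Each stored row records the ratio it was sampled at.
            conn.execute(
                "INSERT INTO clicks VALUES (?, ?, ?, ?)",
                ("20150811", "Mozilla/5.0", "https://example.org/", sampling_ratio),
            )
    # count(*) gives rows stored; sum(event_samplingRatio) scales back up.
    row = conn.execute(
        "SELECT count(*), coalesce(sum(event_samplingRatio), 0) FROM clicks"
    ).fetchone()
    conn.close()
    return row[0], row[1]


def scrub_bulky_fields(conn: sqlite3.Connection, before: str) -> int:
    """Blank the user agent and URL on rows older than `before`; return rows touched."""
    cur = conn.execute(
        "UPDATE clicks SET userAgent = '', event_url = '' WHERE timestamp < ?",
        (before,),
    )
    conn.commit()
    return cur.rowcount
```

With a ratio of 1 the two aggregates agree exactly; with a larger ratio the table holds about 1/N of the rows while the sum stays close to the true total. Because the ratio is stored per row, sites sampled at different rates can be summed in the same query.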
_______________________________________________
Mobile-l mailing list
Mobile-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mobile-l