Tilman, do we have enough data now to turn this off?
Changing the sampling rate is a config change, so we can easily drop it
down to 0 if that's desired :)
On Mon, Jan 4, 2016 at 9:41 AM, Tilman Bayer wrote:
Thanks Marcel! Indeed I saw
https://wikitech.wikimedia.org/wiki/Analytics/EventLogging#Access_data_in_Hadoop
a while ago and asked on #wikimedia-analytics whether this approach
might speed up queries for (the previous version of) this schema; the
response was a bit ambiguous. Nevertheless I'm reall
Thanks Tilman,
It makes sense to reduce the sampling rate of the schema for
"Datensparsamkeit and faster queries". However, if you don't specifically
need MySQL, and are fine querying through Hive, we could continue storing
all events at the current 1% rate in Hadoop.
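
For a rough sense of what a lower sampling rate would mean for throughput, here is a minimal sketch. It assumes that event volume scales linearly with the sampling rate, and that the roughly 120 events per second reported elsewhere in this thread corresponds to the current 1% rate. The function names are made up for illustration and are not part of EventLogging.

```python
def projected_throughput(current_eps: float, current_rate: float, new_rate: float) -> float:
    """Estimate events per second after a sampling-rate change.

    Assumes throughput is proportional to the sampling rate.
    """
    if current_rate <= 0:
        raise ValueError("current_rate must be positive")
    if new_rate < 0:
        raise ValueError("new_rate must not be negative")
    return current_eps * new_rate / current_rate


def rate_for_target(current_eps: float, current_rate: float, target_eps: float) -> float:
    """Estimate the sampling rate needed to reach a target events-per-second figure."""
    if current_eps <= 0:
        raise ValueError("current_eps must be positive")
    return current_rate * target_eps / current_eps


if __name__ == "__main__":
    # Figures taken from this thread: ~120 events/s now, ~40 events/s before.
    # Pairing the 120 events/s with the 1% rate is an assumption.
    needed = rate_for_target(current_eps=120, current_rate=0.01, target_eps=40)
    print(f"Sampling rate for ~40 events/s: {needed:.4%}")
    print(f"Throughput at 0.1%: {projected_throughput(120, 0.01, 0.001):.1f} events/s")
```

Under those assumptions, getting back to the earlier volume would take a rate of about a third of a percent, and a rate of 0 stops the events entirely.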
On Mon, Jan 4, 2016 at 11:28
Hi Marcel,
yes, this is to be expected, because the schema is now logging more
kinds of events than before. However, we could reduce the sampling
rate considerably, as JonR and I had already envisaged
(https://phabricator.wikimedia.org/T120292#1854136 ; this got lost a
bit among the other schema c
BTW, the MobileWebSectionUsage schema has been sending a lot of events
since Dec 18, 2015.
It would normally send around 40 events per second, and it's sending around
120 events per second now. It's now the highest-throughput schema in EL by
far. Is that expected?
Sorry for using this same thread. If this n