Re: [Analytics] "automated" marker added to pageview data

2020-05-18 Thread Nuria Ruiz
Neil: Some of the rules used to identify automated traffic have been used by the community for a couple of years now. See for example [1] and [2]. For more information you can always ping us. Thanks, Nuria [1] https://tools.wmflabs.org/topviews/faq/#false_positive [2]

Re: [Analytics] "automated" marker added to pageview data

2020-05-13 Thread Robert West
Ah, cool. Thanks a lot for pointing this out, Francisco! It's great that the automated views are separated out now. Thanks! Bob On Thu, May 14, 2020 at 7:19 AM Francisco Dans wrote: > Robert: the pageview tool now also shows automated views, so you can check > that it is indeed traffic

Re: [Analytics] "automated" marker added to pageview data

2020-05-13 Thread Francisco Dans
Robert: the pageview tool now also shows automated views, so you can check that it is indeed traffic detected as unreported bots: https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org=all-access=automated=0=latest-90=Main_Page On Thu, May 14, 2020 at 7:14 AM Robert West wrote: > Ah,

Re: [Analytics] "automated" marker added to pageview data

2020-05-13 Thread Robert West
Ah, nice! I noticed that en:Main_Page traffic dropped by 40% as early as April 30, 5 days before Nuria's message. https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org=all-access=user=0=latest-90=Main_Page Just double-checking whether the drop is caused by the change in logging. Thanks!
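
[Editor's sketch] The double-check described above could also be scripted rather than read off the chart. This is a minimal sketch, assuming the public Wikimedia Pageviews REST API's per-article endpoint and that it accepts `automated` as an agent value alongside `user` and `spider`. The helper names `per_article_url` and `percent_drop` are hypothetical, and the counts in the usage note are made-up illustrative numbers, not real Main_Page data.

```python
from urllib.parse import quote

# Assumed base path of the Pageviews REST API per-article endpoint.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, agent, start, end,
                    access="all-access", granularity="daily"):
    """Build a per-article request URL (hypothetical helper).

    agent is assumed to be one of 'user', 'spider', 'automated' or
    'all-agents'; start and end are YYYYMMDD strings.
    """
    parts = [API_ROOT, project, access, agent,
             quote(article, safe=""), granularity, start, end]
    return "/".join(parts)


def percent_drop(before, after):
    """Relative drop from 'before' to 'after', as a percentage."""
    return 100.0 * (before - after) / before


# URLs one would fetch to compare the two agent types over the same window.
user_url = per_article_url("en.wikipedia.org", "Main_Page", "user",
                           "20200401", "20200513")
automated_url = per_article_url("en.wikipedia.org", "Main_Page", "automated",
                                "20200401", "20200513")
```

If the drop in 'user' views is caused by reclassification, the views that disappear from `user_url` should show up in the `automated_url` series from the same date. For example, `percent_drop(100, 60)` gives 40.0 for illustrative daily counts of 100 before and 60 after.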

Re: [Analytics] "automated" marker added to pageview data

2020-05-13 Thread Neil Shah-Quinn
Nuria, Thank you for this update! I'm very excited about this new system. I did notice that there's not much explanation of the particular rules or strategies that are used to identify automated traffic, or a link to the implementing code. I can imagine this might be intentional, to make it

[Analytics] "automated" marker added to pageview data

2020-05-04 Thread Nuria Ruiz
Hello: We have added the 'automated' marker to Wikimedia's pageview data. Up to now, pageview agents were classified as 'spider' (self-reported bots like 'google bot' or 'bing bot') and 'user'. We have known for a while that some requests classified as 'user' were, in fact, coming from automated