Thank you so much! In many countries it's

A couple of questions:
1. Are any of the results of this query private? Or can I talk about them
to people?
2. Is anything like this already published anywhere? If it isn't, it may be
nice to publish such a thing, similarly to Google Zeitgeist.


--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬

2018-07-09 13:19 GMT+03:00 Francisco Dans <fd...@wikimedia.org>:

> Hi Amir,
>
> As Tilman has suggested, your best bet is to query the pageview_hourly
> table. I was going to be lazy and give you a query to just find out the
> most viewed article for a given country, but then I made a few experiments
> and this is the query I came up with to generate a list of countries and
> their respective most viewed articles and view counts. It takes a few
> minutes to run for a single day, so I'm sure someone here could suggest a
> better approach.
>
> WITH articles_countries AS (
>>     SELECT country, page_title, sum(view_count) AS views
>>     FROM pageview_hourly
>>     WHERE year=2018 AND month=3 AND day=15
>>     GROUP BY country, page_title
>> )
>> SELECT s.country as country, s.page_title as page_title, s.views as views
>> FROM (
>>     SELECT max(named_struct('views', views, 'country', country,
>> 'page_title', page_title)) as s from articles_countries group by country
>> ) t;
>
>
> Cheers / see you in ZA,
> Fran
>
>
> On Mon, Jul 9, 2018 at 10:18 AM, Amir E. Aharoni <
> amir.ahar...@mail.huji.ac.il> wrote:
>
>> Hi,
>>
>> Is there a way to find what are the most popular articles per country?
>>
>> Finding the most popular articles per language is easy with the Pageviews
>> tool, but languages and countries are of course not the same.
>>
>> One thing I tried is going to Turnilo, webrequest_sampled_128, and
>> filtering by country. But here it gets troublesome:
>> * Splitting can be done by Uri host, which is *more or less* the project,
>> or by Uri path, which is *more or less* the article (but see below), and I
>> couldn't find a convenient way to combine them.
>> * Mobile (.m.) and desktop hosts are separate. It may actually sometimes
>> be useful to see differences (or lack thereof) between desktop and mobile,
>> but combining them is often useful, too. This can probably be done with
>> regular expressions, but this brings us to the biggest problem:
>> * Filtering by Uri path would be useful if it didn't have so many paths
>> for images, beacons, etc. Filtering using the regular expression
>> "\/wiki\/.+" may be the right thing functionally, but in practice it's very
>> slow or doesn't work at all.
>> * I don't know what exactly is logged in webrequest_sampled_128, but the
>> name hints that it doesn't include everything. A sample may be OK for
>> countries with a lot of traffic like U.S. or Spain, but for countries with
>> smaller traffic this may start being a problem.
>>
>> Any better ideas?
>>
>> Thanks!
>>
>> --
>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
>> http://aharoni.wordpress.com
>> ‪“We're living in pieces,
>> I want to live in peace.” – T. Moore‬
>>
>> _______________________________________________
>> Analytics mailing list
>> Analytics@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
>
>
> --
> *Francisco Dans*
> Software Engineer, Analytics Team
> Wikimedia Foundation
>
> _______________________________________________
> Analytics mailing list
> Analytics@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics

Reply via email to