I think as long as you put in a filter so that the minimum pageview count is,
say, 1000, you should be fine privacy-wise. I can't speak too much to your
second question.
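For instance, a minimal sketch of such a threshold on top of the
pageview_hourly table (the 1000 cutoff is illustrative, not an official
policy):

```sql
-- Sketch: per-country aggregation that drops low-volume rows,
-- since small per-country view counts are the main privacy concern.
SELECT country, page_title, SUM(view_count) AS views
FROM pageview_hourly
WHERE year = 2018 AND month = 3 AND day = 15
GROUP BY country, page_title
HAVING SUM(view_count) >= 1000;
```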

On Mon, Jul 9, 2018 at 1:59 PM, Amir E. Aharoni <
amir.ahar...@mail.huji.ac.il> wrote:

> Thank you so much! In many countries it's
>
> A couple of questions:
> 1. Are any of the results of this query private? Or can I talk about them
> to people?
> 2. Is anything like this already published anywhere? If it isn't, it may
> be nice to publish such a thing, similarly to Google Zeitgeist.
>
>
> --
> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
> http://aharoni.wordpress.com
> ‪“We're living in pieces,
> I want to live in peace.” – T. Moore‬
>
> 2018-07-09 13:19 GMT+03:00 Francisco Dans <fd...@wikimedia.org>:
>
>> Hi Amir,
>>
>> As Tilman has suggested, your best bet is to query the pageview_hourly
>> table. I was going to be lazy and just give you a query to find the most
>> viewed article for a given country, but then I ran a few experiments, and
>> this is the query I came up with to generate a list of countries with
>> their respective most viewed articles and view counts. It takes a few
>> minutes to run for a single day, so I'm sure someone here could suggest a
>> better approach.
>>
>> WITH articles_countries AS (
>>>     SELECT country, page_title, SUM(view_count) AS views
>>>     FROM pageview_hourly
>>>     WHERE year = 2018 AND month = 3 AND day = 15
>>>     GROUP BY country, page_title
>>> )
>>> SELECT s.country AS country, s.page_title AS page_title, s.views AS views
>>> FROM (
>>>     SELECT max(named_struct('views', views, 'country', country,
>>>         'page_title', page_title)) AS s
>>>     FROM articles_countries
>>>     GROUP BY country
>>> ) t;
>>
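>> In case it helps, a rough alternative sketch using a window function
>> instead of the max-over-struct trick (Hive supports row_number(); I
>> haven't benchmarked this, so it may or may not be faster):
>>
>> ```sql
>> -- Rank each country's articles by views and keep the top-ranked one.
>> SELECT country, page_title, views
>> FROM (
>>     SELECT country, page_title, SUM(view_count) AS views,
>>         row_number() OVER (
>>             PARTITION BY country
>>             ORDER BY SUM(view_count) DESC
>>         ) AS rn
>>     FROM pageview_hourly
>>     WHERE year = 2018 AND month = 3 AND day = 15
>>     GROUP BY country, page_title
>> ) ranked
>> WHERE rn = 1;
>> ```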
>>
>> Cheers / see you in ZA,
>> Fran
>>
>>
>> On Mon, Jul 9, 2018 at 10:18 AM, Amir E. Aharoni <
>> amir.ahar...@mail.huji.ac.il> wrote:
>>
>>> Hi,
>>>
>>> Is there a way to find what are the most popular articles per country?
>>>
>>> Finding the most popular articles per language is easy with the
>>> Pageviews tool, but languages and countries are of course not the same.
>>>
>>> One thing I tried is going to Turnilo, webrequest_sampled_128, and
>>> filtering by country. But here it gets troublesome:
>>> * Splitting can be done by Uri host, which is *more or less* the
>>> project, or by Uri path, which is *more or less* the article (but see
>>> below), and I couldn't find a convenient way to combine them.
>>> * Mobile (.m.) and desktop hosts are separate. It may actually sometimes
>>> be useful to see differences (or lack thereof) between desktop and mobile,
>>> but combining them is often useful, too. This can probably be done with
>>> regular expressions, but this brings us to the biggest problem:
>>> * Filtering by Uri path would be useful if it didn't have so many paths
>>> for images, beacons, etc. Filtering using the regular expression
>>> "\/wiki\/.+" may be the right thing functionally, but in practice it's very
>>> slow or doesn't work at all.
>>> * I don't know exactly what is logged in webrequest_sampled_128, but the
>>> name hints that it doesn't include everything. A sample may be fine for
>>> countries with a lot of traffic, like the U.S. or Spain, but for countries
>>> with less traffic this may become a problem.
>>>
>>> Any better ideas?
>>>
>>> Thanks!
>>>
>>> --
>>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
>>> http://aharoni.wordpress.com
>>> ‪“We're living in pieces,
>>> I want to live in peace.” – T. Moore‬
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>>
>>
>>
>> --
>> *Francisco Dans*
>> Software Engineer, Analytics Team
>> Wikimedia Foundation
>>
>>
>
>


-- 
*Francisco Dans*
Software Engineer, Analytics Team
Wikimedia Foundation