I think that as long as you add a filter that only keeps articles with at least, say, 1000 pageviews, you should be fine privacy-wise. I can't say much about your second question.
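
To make the thresholding idea concrete, here is a minimal sketch in plain Python rather than Hive. It mirrors the logic of the quoted query (sum views per country and article, then keep the top article per country) and adds the minimum-views cutoff. The function name, the MIN_VIEWS constant and the sample rows are all made up for illustration; only the column names come from the query, and the 1000 cutoff is just the rough figure suggested above, not an official privacy threshold.

```python
from collections import defaultdict

# Assumed cutoff, taken from the "maybe 1000" suggestion above.
MIN_VIEWS = 1000


def top_article_per_country(rows, min_views=MIN_VIEWS):
    """Return {country: (page_title, views)} for the most viewed article.

    rows is an iterable of (country, page_title, view_count) tuples,
    shaped like the columns selected from pageview_hourly.
    """
    # Equivalent of: sum(view_count) ... GROUP BY country, page_title
    totals = defaultdict(int)
    for country, page_title, view_count in rows:
        totals[(country, page_title)] += view_count

    best = {}
    for (country, page_title), views in totals.items():
        # Drop low-traffic articles before picking a winner.
        if views < min_views:
            continue
        if country not in best or views > best[country][1]:
            best[country] = (page_title, views)
    return best


# Invented sample data, not real pageview numbers.
sample_rows = [
    ("Spain", "Madrid", 700),
    ("Spain", "Madrid", 600),
    ("Spain", "Barcelona", 900),
    ("Andorra", "Andorra_la_Vella", 40),
]
result = top_article_per_country(sample_rows)
print(result)
```

With this sample, a country whose articles all fall below the cutoff simply does not appear in the result, which is the behaviour you would want before sharing the list.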

On Mon, Jul 9, 2018 at 1:59 PM, Amir E. Aharoni <amir.ahar...@mail.huji.ac.il> wrote:

> Thank you so much! In many countries it's
>
> A couple of questions:
> 1. Are any of the results of this query private? Or can I talk about them
> to people?
> 2. Is anything like this already published anywhere? If it isn't, it may
> be nice to publish such a thing, similarly to Google Zeitgeist.
>
> --
> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
> http://aharoni.wordpress.com
> “We're living in pieces,
> I want to live in peace.” – T. Moore
>
> 2018-07-09 13:19 GMT+03:00 Francisco Dans <fd...@wikimedia.org>:
>
>> Hi Amir,
>>
>> As Tilman has suggested, your best bet is to query the pageview_hourly
>> table. I was going to be lazy and give you a query to just find out the
>> most viewed article for a given country, but then I made a few experiments
>> and this is the query I came up with to generate a list of countries and
>> their respective most viewed articles and view counts. It takes a few
>> minutes to run for a single day, so I'm sure someone here could suggest a
>> better approach.
>>
>> WITH articles_countries AS (
>>   SELECT country, page_title, sum(view_count) AS views
>>   FROM pageview_hourly
>>   WHERE year=2018 AND month=3 AND day=15
>>   GROUP BY country, page_title
>> )
>> SELECT s.country as country, s.page_title as page_title, s.views as views
>> FROM (
>>   SELECT max(named_struct('views', views, 'country', country,
>>     'page_title', page_title)) as s from articles_countries group by country
>> ) t;
>>
>> Cheers / see you in ZA,
>> Fran
>>
>> On Mon, Jul 9, 2018 at 10:18 AM, Amir E. Aharoni <amir.ahar...@mail.huji.ac.il> wrote:
>>
>>> Hi,
>>>
>>> Is there a way to find what are the most popular articles per country?
>>>
>>> Finding the most popular articles per language is easy with the
>>> Pageviews tool, but languages and countries are of course not the same.
>>>
>>> One thing I tried is going to Turnilo, webrequest_sampled_128, and
>>> filtering by country. But here it gets troublesome:
>>> * Splitting can be done by Uri host, which is *more or less* the
>>> project, or by Uri path, which is *more or less* the article (but see
>>> below), and I couldn't find a convenient way to combine them.
>>> * Mobile (.m.) and desktop hosts are separate. It may actually sometimes
>>> be useful to see differences (or lack thereof) between desktop and mobile,
>>> but combining them is often useful, too. This can probably be done with
>>> regular expressions, but this brings us to the biggest problem:
>>> * Filtering by Uri path would be useful if it didn't have so many paths
>>> for images, beacons, etc. Filtering using the regular expression
>>> "\/wiki\/.+" may be the right thing functionally, but in practice it's very
>>> slow or doesn't work at all.
>>> * I don't know what exactly is logged in webrequest_sampled_128, but the
>>> name hints that it doesn't include everything. A sample may be OK for
>>> countries with a lot of traffic like U.S. or Spain, but for countries with
>>> smaller traffic this may start being a problem.
>>>
>>> Any better ideas?
>>>
>>> Thanks!
>>>
>>> --
>>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
>>> http://aharoni.wordpress.com
>>> “We're living in pieces,
>>> I want to live in peace.” – T. Moore
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>> --
>> *Francisco Dans*
>> Software Engineer, Analytics Team
>> Wikimedia Foundation

--
*Francisco Dans*
Software Engineer, Analytics Team
Wikimedia Foundation