As an aside, this may be a case where generators in the API are useful, e.g.
https://en.wikipedia.org/w/api.php?action=query&generator=redirects&titles=2019%E2%80%9320_coronavirus_outbreak&prop=pageviews&pvipmetric=pageviews&pvipdays=60

(Note: the results do not include the actual non-redirect article, and you have to pay close attention to the continue parameters.)

https://en.wikipedia.org/w/api.php?action=query&generator=redirects&titles=2019%E2%80%9320_coronavirus_outbreak&prop=pageviews&pvipmetric=pageviews&pvipdays=60&grdlimit=max&formatversion=2
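
For what it's worth, here is a rough Python sketch of following the continue parameters for the query above. It is only an illustration, not a tested tool: the function names, the User-Agent string and the merge-by-title approach are my own assumptions, and the response layout assumed is the formatversion=2 one (a list under query.pages, each page optionally carrying a "pageviews" map of date to count, where a count may be null). As noted above, the redirect target itself is not in the generator output, so its views would need a separate titles= query.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def http_get_json(params):
    """Fetch one API response as a dict (plain urllib, no retries or error handling)."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent; replace it with one that identifies your own tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "redirect-pageviews-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def redirect_pageviews(title, days=60, fetch=http_get_json):
    """Return {redirect title: {date: views}} for every redirect to `title`.

    `fetch` is injectable so the loop can be exercised without network access.
    """
    base = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "generator": "redirects",
        "titles": title,
        "grdlimit": "max",
        "prop": "pageviews",
        "pvipmetric": "pageviews",
        "pvipdays": str(days),
    }
    views = {}
    cont = {}
    while True:
        # Send back every key of the previous "continue" object unchanged.
        data = fetch({**base, **cont})
        for page in data.get("query", {}).get("pages", []):
            per_day = views.setdefault(page["title"], {})
            # A page can show up in several batches, with or without pageviews.
            for day, count in (page.get("pageviews") or {}).items():
                if count is not None:
                    per_day[day] = count
        if "continue" not in data:
            return views
        cont = data["continue"]


def total_views(views):
    """Sum the daily counts across all redirects."""
    return sum(sum(per_day.values()) for per_day in views.values())
```

Calling redirect_pageviews("2019–20 coronavirus outbreak") and passing the result to total_views() would give the redirect-only total for the last 60 days; add the target page's own count from a separate query to get the combined figure.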

On Mon, Feb 24, 2020 at 4:28 AM bawolff <bawolff...@gmail.com> wrote:

> Hi,
>
> When I tested the api it seemed to work with redirects (e.g.
> https://mediawiki.org/w/api.php?action=query&format=json&prop=pageviews&titles=MediaWiki%7CMain_Page&pvipmetric=pageviews&pvipdays=60&pvipcontinue=
> Where Main_Page redirects to the page MediaWiki )
>
> > Then we attempted to use the redirects of a page and using the old page
> > ids to grab the pageview data
>
> Just to be clear, when a page is moved, it keeps its page_id. So redirects
> may have historically had the page_id that the target page has now.
>
> If all else fails, you can look at the big dataset files at
> https://dumps.wikimedia.org/other/analytics/ . They should be available
> (in some form or another) going back to 2007, and I believe they are the
> source of the data that the api and all other tools return.
>
> --
> Brian
>
> On Mon, Feb 24, 2020 at 12:17 AM James Gardner via Wikitech-l <
> wikitech-l@lists.wikimedia.org> wrote:
>
>> Hi all,
>>
>> We are a group of undergraduates working on a project using the MediaWiki
>> API. While working on this project, we ran into a unique issue involving
>> pageviews. When trying to pull pageview data for a particular page, the
>> redirects of a page would not be counted along with the original
>> pageviews.
>> For example, the Hong Kong protests page only has direct views, and not
>> views from previous titles.
>>
>> We attempted to use the wmflabs.org tool, but it only shows data from a
>> certain date.
>> (Example link:
>> https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2019-07-01&end=2020-01-25&pages=2019%E2%80%9320_Hong_Kong_protests|China
>> )
>>
>> Then we attempted to use the redirects of a page and using the old page
>> ids to grab the pageview data, but there was no data returned. When we
>> attempted to grab data for a page that we knew would have a long past, but
>> the parameter of "pvipcontinue" did not appear (
>> https://www.mediawiki.org/w/api.php?action=help&modules=query%2Bpageviews
>> ).
>> (Example:
>> https://www.mediawiki.org/wiki/Special:ApiSandbox#action=query&format=json&prop=pageviews&titles=MediaWiki&pvipmetric=pageviews&pvipdays=60&pvipcontinue=
>> )
>>
>> In the end, we are trying to get an accurate count of view for a certain
>> page no matter the source.
>>
>> Any guidance or assistance is greatly appreciated.
>>
>> Thanks,
>> Jackie, James, Junyi, Kirby
>> _______________________________________________
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l