Hi Ben,

Pageview data is loaded daily into a Cassandra backend, usually between 1am
and 3am UTC, depending on our cluster resource availability.
Around those hours, you may see pageviews for some pages already loaded
while others are not there yet.

Another possible reason for the discrepancy you are seeing is caching: the
AQS API sits behind a Varnish cache, so repeating a query with the same
parameters can return an "old" result (at most 4 hours old).

As for your question, we don't have a public place where the latest loaded
day for AQS is shown - it would be a nice addition!
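
In the meantime, one possible workaround is to probe a high-traffic page and
check whether the day you need is already there. Below is a rough sketch, not
an official tool. It assumes the public per-article endpoint under
https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article, that daily
timestamps come back as YYYYMMDD00, and that a day with no data yet is
answered with a 404. The function names, the probe page and the User-Agent
string are made up for illustration.

```python
import json
from datetime import date
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed base URL of the public per-article pageviews endpoint.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_url(article, day, project="en.wikipedia"):
    """Build a one-day, daily-granularity per-article query URL."""
    stamp = day.strftime("%Y%m%d")
    title = quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/all-access/all-agents/{title}/daily/{stamp}/{stamp}"


def has_data_for(payload, day):
    """Return True if the decoded JSON response has an item for `day`."""
    stamp = day.strftime("%Y%m%d") + "00"  # assumed daily timestamp format
    return any(item.get("timestamp") == stamp for item in payload.get("items", []))


def is_day_loaded(article, day, project="en.wikipedia"):
    """Query AQS for a single probe page; a 404 is treated as 'not loaded yet'."""
    request = Request(
        build_url(article, day, project),
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "aqs-freshness-check (you@example.org)"},
    )
    try:
        with urlopen(request, timeout=10) as response:
            return has_data_for(json.load(response), day)
    except HTTPError as err:
        if err.code == 404:
            return False
        raise
```

For example, is_day_loaded("Main Page", date(2022, 4, 17)) would tell you
whether that day is available for the probe page. This only says something
about the page you probe: since pages can be loaded at different moments
during the load window, a positive answer is a hint rather than a guarantee
for every page, and because of the cache a repeated query can lag by up to
4 hours.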

Best
Joseph




On Mon, Apr 18, 2022 at 9:35 PM Ben Smith <b...@predata.com> wrote:

> Hello all,
>
> We use the Wikimedia AQS Pageviews REST API: [Analytics/AQS/Pageviews - 
> Wikitech](https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews). When 
> making requests for pageviews counts by article, we have noticed that not all 
> data for all pages will exist for the latest day at the same time. Some pages 
> appear to be updated later than others. Is there a place we can check (i.e. a 
> status page or dump files) to determine whether all pageview data is 
> accessible for the latest day via the AQS Pageviews REST API?
>
> Best,
> Ben
>
> _______________________________________________
> Analytics mailing list -- analytics@lists.wikimedia.org
> To unsubscribe send an email to analytics-le...@lists.wikimedia.org
>


-- 
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation
