AndrewTavis_WMDE added a comment.
@Manuel, I think we can sadly throw out the idea of creating an edits subset of webrequests :( The API actions that we'd need to collect to fully define what counts as an edit are listed at https://www.wikidata.org/w/api.php. We know at the very least that we'd want `uri_query LIKE '?action=edit%'` and `uri_query LIKE '?action=wbsetclaim%'`, but figuring out what else needs to be added seems prohibitive given the discrepancy between the following two counts:

    SELECT
        COUNT(*) AS total_edits
    FROM
        wmf.webrequest
    WHERE
        year = 2023
        AND month = 7
        AND day = 31
        AND uri_host IN ('www.wikidata.org', 'm.wikidata.org')
        AND (
            uri_query LIKE '?action=edit%'
            OR uri_query LIKE '?action=wbsetclaim%'
        )

... gives us `11,947`, and the following:

    SELECT
        COUNT(*) AS total_edits
    FROM
        wmf_raw.mediawiki_private_cu_changes
    WHERE
        wiki_db = 'wikidatawiki'
        AND month = '2023-07'
        AND '20230731' <= cuc_timestamp
        AND cuc_timestamp < '20230801'

... gives us `657,347`, with `11947/657347` being `1.817%`.

There definitely should be a combination of those actions that gets us a similar number, but this is something that we'd need to loop WMF into, and the easiest route would likely be to ask about getting a subset similar to `pageview_actor` as a table in the Data Lake.

TASK DETAIL
  https://phabricator.wikimedia.org/T336361

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, AndrewTavis_WMDE, Michael, Manuel, Aklapper, Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
_______________________________________________
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org