AndrewTavis_WMDE added a comment.

  @Manuel, sadly I think we can drop the idea of creating an edits subset of 
webrequests :( The API documentation at https://www.wikidata.org/w/api.php 
lists the various actions we'd need to collect to fully define what counts as 
an edit. At the very least we'd want `uri_query LIKE '?action=edit%'` and 
`uri_query LIKE '?action=wbsetclaim%'`, but working out what else needs to be 
added seems prohibitive given the discrepancy between the following two 
counts:
  
    
    SELECT
        COUNT(*) AS total_edits
        
    FROM 
        wmf.webrequest
        
    WHERE
        year = 2023
        AND month = 7
        AND day = 31
        AND uri_host IN ('www.wikidata.org', 'm.wikidata.org')
        AND (
            uri_query LIKE '?action=edit%'
            OR uri_query LIKE '?action=wbsetclaim%'
        )
  
  ... gives us `11,947`, and the following:
  
    SELECT
        COUNT(*) AS total_edits
    
    FROM 
        wmf_raw.mediawiki_private_cu_changes
    
    WHERE
        wiki_db = 'wikidatawiki'
        AND month = '2023-07'
        AND '20230731' <= cuc_timestamp
        AND cuc_timestamp < '20230801'
  
  ... gives us `657,347`, so the webrequest count covers only `11947/657347`, 
or about `1.817%`, of that total. There should definitely be some combination 
of actions that gets us a similar number, but we'd need to loop WMF in on 
this, and the easiest route would likely be to ask about getting an edits 
subset, similar to `pageview_actor`, as a table in the Data Lake.
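
  As a rough illustration of one way the two `LIKE '?action=...%'` patterns can 
undercount, here is a minimal Python sketch. The `uri_query` strings are made 
up for illustration (they are not real webrequest rows), and `extract_action` 
and `prefix_match` are hypothetical helper names. It only shows that a prefix 
match misses requests where `action` is not the first parameter, is a 
different edit-like action, or is not in the URL at all. It does not claim to 
explain the full gap.

```python
from collections import Counter
from urllib.parse import parse_qs

# Hypothetical uri_query values, made up for illustration only.
SAMPLE_QUERIES = [
    "?action=edit&title=Q42",
    "?action=wbsetclaim&format=json",
    "?format=json&action=wbsetclaim",  # action is not the first parameter
    "?title=Q42&action=edit",          # action is not the first parameter
    "?action=wbeditentity&id=Q42",     # an action outside the two patterns
    "",                                # assumption: parameters not in the URL
]


def extract_action(uri_query: str) -> str:
    """Return the value of the `action` parameter, or '' if it is absent."""
    params = parse_qs(uri_query.lstrip("?"))
    return params.get("action", [""])[0]


def prefix_match(uri_query: str) -> bool:
    """Mimic the two LIKE '?action=...%' predicates from the first query."""
    return uri_query.startswith(("?action=edit", "?action=wbsetclaim"))


# Count requests per action wherever the parameter sits in the query string.
action_counts = Counter(extract_action(q) for q in SAMPLE_QUERIES)

# Count requests the prefix-only patterns would have picked up.
prefix_hits = sum(prefix_match(q) for q in SAMPLE_QUERIES)

print(action_counts)
print(prefix_hits)
```

  Grouping real `uri_query` values by their extracted action in the same way 
might help show which actions would need to be added, though that would still 
need checking with WMF.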

TASK DETAIL
  https://phabricator.wikimedia.org/T336361

To: AndrewTavis_WMDE
Cc: JAllemandou, AndrewTavis_WMDE, Michael, Manuel, Aklapper, 
Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331