Manuel added a comment.

  My assumption is that we need to filter on `page_namespace = 0` and 
`page_is_redirect = False`.  Using the April snapshot, this gives us 
101,785,388, which is very close to the figure in the Grafana data 
<https://grafana.wikimedia.org/d/000000167/wikidata-datamodel?orgId=1&refresh=30m&viewPanel=3&inspect=3&inspectTab=data>
 (101,777,563 for 30 April 2023), a difference of 7,825. So I believe this 
number should be fine.
  
    import wmfdata as wmf

    # Count pages per namespace and redirect status in the April 2023 snapshot.
    query = """
    SELECT
        page_namespace AS namespace,
        page_is_redirect AS redirect,
        COUNT(*) AS count
    FROM
        wmf_raw.mediawiki_page
    WHERE
        wiki_db = 'wikidatawiki'
        AND snapshot = '2023-04'
        AND page_namespace IN (0, 1, 120, 146, 640)
    GROUP BY
        page_namespace,
        page_is_redirect
    """
    df = wmf.presto.run(commands=query)
    df  # display all rows; head() would stop after the first five
  
  gives me
  
        namespace  redirect      count
    0         120     False      10977
    1         146     False    1077831
    2           0      True    4031149
    3           0     False  101785388
    4           1      True         52
    5         146      True      13533
    6           1     False      36683
    7         640     False        367
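
As a rough cross-check of the comparison above, here is a minimal sketch that rebuilds the result table by hand (so it runs without cluster access) and compares the namespace 0, non-redirect count against the Grafana figure quoted earlier. The variable names `items`, `grafana`, `diff` and `rel_diff` are only illustrative, and the Grafana value is copied from the comment, not fetched.

```python
import pandas as pd

# Rows copied by hand from the query result above (snapshot 2023-04).
df = pd.DataFrame(
    {
        "namespace": [120, 146, 0, 0, 1, 146, 1, 640],
        "redirect": [False, False, True, False, True, True, False, False],
        "count": [10977, 1077831, 4031149, 101785388, 52, 13533, 36683, 367],
    }
)

# Non-redirect pages in namespace 0, i.e. the filter proposed above.
mask = (df["namespace"] == 0) & (~df["redirect"])
items = int(df.loc[mask, "count"].sum())

# Grafana figure for 30 April 2023 as quoted in the comment (assumed, not queried).
grafana = 101_777_563

diff = items - grafana
rel_diff = diff / grafana
print(f"snapshot={items} grafana={grafana} diff={diff} rel_diff={rel_diff:.4%}")
```

The relative difference comes out below 0.01%, which fits with the snapshot and the Grafana panel being taken at slightly different points in time, although that explanation is an assumption on my side.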

TASK DETAIL
  https://phabricator.wikimedia.org/T337245

To: AndrewTavis_WMDE, Manuel
Cc: Lydia_Pintscher, Aklapper, Manuel, AndrewTavis_WMDE, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331