matthiasmullie added a comment.

  I don't really know much about how `mediawiki_history` gets populated, but 
`wbcreateclaim` & `wbeditentity` (in addition to `wbsetclaim`) could also be 
used to create (or edit) statements IIRC, so this last set of results is likely 
incomplete.
  
  And I believe `wbc_entity_usage` currently *only* gets populated with 
M-entities fetched via Lua - any page where the M-entity is not used will not 
show up there (I believe - could be wrong). So that data too (in addition to 
containing other, non-MediaInfo-related entities) is likely incomplete.
  
  I believe that this query (based on @addshore's, but stricter about 
including only the latest revision of pages that have not been archived) is 
quite accurate (though it takes an awfully long time to complete).
  Did I overlook anything here? Is there any reason to believe this number is 
invalid?
  
    SELECT COUNT(DISTINCT page_id)
    # page excludes deleted pages (which are in archive)
    FROM page
    # joining on page_latest - we only care about most recent (not revdeleted) revision
    INNER JOIN revision ON rev_id = page_latest AND rev_deleted = 0
    INNER JOIN slots ON slot_revision_id = rev_id
    # mediainfo slot must contain actual content
    INNER JOIN content ON slot_content_id = content_id AND content_size > 122
    INNER JOIN slot_roles ON role_id = slot_role_id AND role_name = 'mediainfo';
  
    +-------------------------+
    | COUNT(DISTINCT page_id) |
    +-------------------------+
    |                 3004300 |
    +-------------------------+
    1 row in set (33 min 31.86 sec)
  
  Just passed 3M files!
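
  To sanity-check which pages the joins above keep and which they drop, here is a minimal sketch that runs the same query against a toy in-memory SQLite database. The table and column names come from the query; the cut-down schema, the sample rows, and the helper names `build_toy_db` and `count_pages_with_mediainfo` are assumptions for illustration only, and the 122-byte threshold is taken as-is from the query.

```python
import sqlite3

# Toy schema: only the columns the query touches (an assumption; the real
# MediaWiki tables have many more columns).
SCHEMA = """
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_latest INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_deleted INTEGER);
CREATE TABLE slots (slot_revision_id INTEGER, slot_role_id INTEGER, slot_content_id INTEGER);
CREATE TABLE content (content_id INTEGER PRIMARY KEY, content_size INTEGER);
CREATE TABLE slot_roles (role_id INTEGER PRIMARY KEY, role_name TEXT);
"""

# Same joins and filters as the query in the comment.
QUERY = """
SELECT COUNT(DISTINCT page_id)
FROM page
INNER JOIN revision ON rev_id = page_latest AND rev_deleted = 0
INNER JOIN slots ON slot_revision_id = rev_id
INNER JOIN content ON slot_content_id = content_id AND content_size > 122
INNER JOIN slot_roles ON role_id = slot_role_id AND role_name = 'mediainfo'
"""


def build_toy_db():
    """Create an in-memory database with four hypothetical pages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO slot_roles VALUES (?, ?)",
                     [(1, "main"), (2, "mediainfo")])
    # (page_id, page_latest)
    conn.executemany("INSERT INTO page VALUES (?, ?)",
                     [(1, 11), (2, 20), (3, 30), (4, 40)])
    # (rev_id, rev_page, rev_deleted)
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?)", [
        (10, 1, 0),  # page 1, older revision (not page_latest)
        (11, 1, 0),  # page 1, latest revision
        (20, 2, 0),  # page 2, latest revision
        (30, 3, 1),  # page 3, latest revision is revdeleted
        (40, 4, 0),  # page 4, latest revision
    ])
    # (content_id, content_size)
    conn.executemany("INSERT INTO content VALUES (?, ?)", [
        (100, 5000),  # wikitext in a main slot
        (101, 400),   # mediainfo content, page 1 older revision
        (102, 450),   # mediainfo content, page 1 latest revision
        (103, 122),   # mediainfo content exactly at the threshold
        (104, 300),   # mediainfo content on the revdeleted revision
    ])
    # (slot_revision_id, slot_role_id, slot_content_id)
    conn.executemany("INSERT INTO slots VALUES (?, ?, ?)", [
        (10, 1, 100), (10, 2, 101),
        (11, 1, 100), (11, 2, 102),
        (20, 1, 100), (20, 2, 103),
        (30, 1, 100), (30, 2, 104),
        (40, 1, 100),  # page 4 has a main slot only
    ])
    return conn


def count_pages_with_mediainfo(conn):
    """Run the query and return the single COUNT value."""
    return conn.execute(QUERY).fetchone()[0]


if __name__ == "__main__":
    print(count_pages_with_mediainfo(build_toy_db()))
```

  In this toy data only page 1 is counted: page 2's mediainfo content is not larger than 122 bytes, page 3's latest revision is revdeleted, and page 4 has no mediainfo slot at all.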

TASK DETAIL
  https://phabricator.wikimedia.org/T238878

