matthiasmullie added a comment.
I don't really know much about how `mediawiki_history` gets populated, but `wbcreateclaim` & `wbeditentity` (in addition to `wbsetclaim`) could also be used to create (or edit) statements IIRC, so this last set of results is likely incomplete.

And I believe `wbc_entity_usage` currently *only* gets populated with M-entities fetched via Lua - any page where the M-entity is not used will not show up there (I believe - could be wrong). So that data too (in addition to containing other, non-MediaInfo related entities) is likely incomplete.

I believe that this query (based on @addshore's, but stricter about including only the latest revision of pages that have not been archived) is quite accurate (it takes an awfully long time to complete, though). Did I overlook anything here - any reason to believe this number is invalid?

  SELECT COUNT(DISTINCT page_id)
  # page excludes deleted pages (which are in archive)
  FROM page
  # joining on page_latest - we only care about most recent (not revdeleted) revision
  INNER JOIN revision ON rev_id = page_latest AND rev_deleted = 0
  INNER JOIN slots ON slot_revision_id = rev_id
  # mediainfo slot must contain actual content
  INNER JOIN content ON slot_content_id = content_id AND content_size > 122
  INNER JOIN slot_roles ON role_id = slot_role_id AND role_name = 'mediainfo';

  +-------------------------+
  | COUNT(DISTINCT page_id) |
  +-------------------------+
  |                 3004300 |
  +-------------------------+
  1 row in set (33 min 31.86 sec)

Just passed 3M files!
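For a quick sanity check of the join logic without waiting half an hour, here is a minimal sketch that runs the same query against a toy in-memory SQLite database. Everything about the toy setup is an assumption for illustration: the tables carry only the columns the query touches, the rows are made up, the helper names `build_toy_db` and `count_mediainfo_pages` are hypothetical, and SQLite stands in for the production MariaDB (so the `#` comments become `--`). It only shows which pages the joins keep or drop; it says nothing about the real count.

```python
import sqlite3

# Same joins and filters as the query above; only the comment syntax differs.
QUERY = """
SELECT COUNT(DISTINCT page_id)
FROM page
-- only the latest, non-revdeleted revision of each page
INNER JOIN revision ON rev_id = page_latest AND rev_deleted = 0
INNER JOIN slots ON slot_revision_id = rev_id
-- mediainfo slot must contain actual content
INNER JOIN content ON slot_content_id = content_id AND content_size > 122
INNER JOIN slot_roles ON role_id = slot_role_id AND role_name = 'mediainfo'
"""


def build_toy_db():
    """Create a made-up, heavily reduced schema with five sample pages."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE page (page_id INTEGER, page_latest INTEGER);
        CREATE TABLE revision (rev_id INTEGER, rev_page INTEGER, rev_deleted INTEGER);
        CREATE TABLE slots (slot_revision_id INTEGER, slot_role_id INTEGER,
                            slot_content_id INTEGER);
        CREATE TABLE content (content_id INTEGER, content_size INTEGER);
        CREATE TABLE slot_roles (role_id INTEGER, role_name TEXT);

        INSERT INTO slot_roles VALUES (1, 'main'), (2, 'mediainfo');

        -- page 1: latest revision has a mediainfo slot with real content (kept)
        INSERT INTO page VALUES (1, 11);
        INSERT INTO revision VALUES (11, 1, 0);
        INSERT INTO slots VALUES (11, 1, 100), (11, 2, 101);
        INSERT INTO content VALUES (100, 900), (101, 500);

        -- page 2: mediainfo slot is exactly at the 122 threshold (dropped)
        INSERT INTO page VALUES (2, 21);
        INSERT INTO revision VALUES (21, 2, 0);
        INSERT INTO slots VALUES (21, 1, 200), (21, 2, 201);
        INSERT INTO content VALUES (200, 800), (201, 122);

        -- page 3: only an older revision has a mediainfo slot (dropped)
        INSERT INTO page VALUES (3, 32);
        INSERT INTO revision VALUES (31, 3, 0), (32, 3, 0);
        INSERT INTO slots VALUES (31, 1, 300), (31, 2, 301), (32, 1, 302);
        INSERT INTO content VALUES (300, 700), (301, 400), (302, 750);

        -- page 4: latest revision is revdeleted (dropped)
        INSERT INTO page VALUES (4, 41);
        INSERT INTO revision VALUES (41, 4, 1);
        INSERT INTO slots VALUES (41, 1, 400), (41, 2, 401);
        INSERT INTO content VALUES (400, 650), (401, 300);

        -- page 5: mediainfo slot just above the threshold (kept)
        INSERT INTO page VALUES (5, 51);
        INSERT INTO revision VALUES (51, 5, 0);
        INSERT INTO slots VALUES (51, 1, 500), (51, 2, 501);
        INSERT INTO content VALUES (500, 600), (501, 123);
    """)
    return db


def count_mediainfo_pages(db):
    """Run the query and return the single COUNT value."""
    return db.execute(QUERY).fetchone()[0]


if __name__ == "__main__":
    print(count_mediainfo_pages(build_toy_db()))
```

On this toy data only pages 1 and 5 survive all four joins, which matches the intent described above: deleted or revdeleted revisions, stale revisions, and mediainfo slots at or below the size threshold do not count.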
TASK DETAIL
  https://phabricator.wikimedia.org/T238878

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: matthiasmullie
Cc: nettrom_WMF, Ladsgroup, daniel, Mayakp.wiki, gsingers, matthiasmullie, Addshore, kzimmerman, mpopov, Ramsey-WMF, Abit, Nuria, 4748kitoko, darthmon_wmde, DannyS712, Nandana, JKSTNK, Akovalyov, Lahi, PDrouin-WMF, Gq86, E1presidente, Cparle, Anooprao, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, LawExplorer, Salgo60, Silverfish, _jensen, rosalieper, Scott_WUaS, Susannaanas, JAllemandou, Jane023, terrrydactyl, Wikidata-bugs, Base, aude, Ricordisamoa, Wesalius, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Mbch331, jeremyb