Nikki added a comment.
In T303677#9035100 <https://phabricator.wikimedia.org/T303677#9035100>, @tfmorris wrote:

> I'm surprised that this hasn't received any attention in 15 months. As an update to @Nikki's numbers <https://phabricator.wikimedia.org/T303677#7789434> there are now on the order of 2.5 **BILLION** of these bot generated descriptions. The top 5 alone represent over 2 billion triples. That's a huge waste of resources!

What exactly are you counting? (You don't seem to be counting the same thing as me, so the numbers can't be directly compared.)

I tried redoing my queries (and saved the URLs this time...):

| Item | Matching descriptions (March 2022) | Matching descriptions (August 2023) | Query |
| chemical compound (Q11173) <https://www.wikidata.org/wiki/Q11173> | 22,436,766 | 38,777,020 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/brd2m3> |
| encyclopedia article (Q13433827) <https://www.wikidata.org/wiki/Q13433827> | 9,877,236 | 10,056,470 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/XEKsXN> |
| galaxy (Q318) <https://www.wikidata.org/wiki/Q318> | 14,615,397 | 16,149,120 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/5KVd9G> |
| protein (Q8054) <https://www.wikidata.org/wiki/Q8054> | 1,116,867 | 1,155,777 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/FayO41> |
| scholarly article (Q13442814) <https://www.wikidata.org/wiki/Q13442814> | 778,351,557 | 813,567,636 | query <https://query.wikidata.org/#select%20%28uri%28concat%28%0A%20%20%22https%3A%2F%2Fquery.wikidata.org%2F%23%22%2C%0A%20%20encode_for_uri%28concat%28%22select%20%28sum%28%3Fc%29%20as%20%3Ft%29%20%7B%5Cn%23%22%2C%20group_concat%28%3Fq%3B%20separator%3D%22%22%29%2C%20%22%5Cn%7D%22%29%29%0A%29%29%20as%20%3Furl%29%20%7B%0A%20%20wd%3AQ13442814%20rdfs%3Alabel%20%3Fdesc.%0A%20%20bind%28concat%28%22union%5Cn%7B%20select%20%28count%28%2a%29%20as%20%3Fc%29%20%7B%20%5B%5D%20schema%3Adescription%20%5C%22%22%2C%20%3Fdesc%2C%20%22%5C%22%40%22%2C%20lang%28%3Fdesc%29%2C%20%22%7D%20%7D%20%22%29%20as%20%3Fq%29.%0A%7D%0A> |
| star (Q523) <https://www.wikidata.org/wiki/Q523> | 943,976 | 1,179,311 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/ivh7Ur> |
| Unicode character (Q29654788) <https://www.wikidata.org/wiki/Q29654788> | 594,869 | 1,264,561 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/7GFJx0> |
| Wikimedia category (Q4167836) <https://www.wikidata.org/wiki/Q4167836> | 495,506,461 | 471,340,460 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/RVvfGG> |
| Wikimedia disambiguation page (Q4167410) <https://www.wikidata.org/wiki/Q4167410> | 77,473,195 | 78,644,158 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/zLPPzr> |
| Wikimedia list article (Q13406463) <https://www.wikidata.org/wiki/Q13406463> | 17,270,013 | 17,383,921 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/oqeKbt> |
| Wikimedia template (Q11266439) <https://www.wikidata.org/wiki/Q11266439> | 67,869,856 | 66,668,772 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/gYEAZZ> |
| Wikinews article (Q17633526) <https://www.wikidata.org/wiki/Q17633526> | 12,994,976 | 12,854,826 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/Z9XGJ1> |
| family name (Q101352) <https://www.wikidata.org/wiki/Q101352> | | 48,959,524 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/PQ0Ler> |
| given name (Q202444) <https://www.wikidata.org/wiki/Q202444> | | 207,184 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/lINUJP> |
| female given name (Q11879590) <https://www.wikidata.org/wiki/Q11879590> | | 1,634,583 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/Htj0HC> |
| male given name (Q12308941) <https://www.wikidata.org/wiki/Q12308941> | | 2,793,746 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/fNTPaY> |
| unisex given name (Q3409032) <https://www.wikidata.org/wiki/Q3409032> | | 58,843 | QLever <https://qlever.cs.uni-freiburg.de/wikidata/qVbzM> |

Even QLever can't count the scholarly article descriptions, so I had to write a query to generate a query that counts each label separately.

The number of descriptions matching the labels of the original 12 items went up by 30 million, which still rounds to 1.5 billion. Categories went down 24 million (lots of merges?), but chemical compound went up 16 million and scholarly article went up 35 million.

TASK DETAIL
  https://phabricator.wikimedia.org/T303677

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Nikki
Cc: tfmorris, AndrewTavis_WMDE, Fuzheado, valerio.bozzolan, Lectrician1, waldyrious, Michael, DVrandecic, Bugreporter, Manuel, Nikki, Epidosis, Mahir256, Aklapper, Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331