Mahir256 created this task. Mahir256 added a project: Wikidata. Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION

T91981 <https://phabricator.wikimedia.org/T91981> was closed without comment in late 2020 (was it out of staleness?), even though the only objections raised against it were made over the course of a few days in August 2015. At that time Blazegraph was still maintained and there were between 20 and 21 million items in Wikidata (and possibly a sense of optimism in the air regarding how descriptions on individual items would turn out). Now there are more than 97 million items, due primarily to the imports of scientific articles (with astronomical objects coming later, to boot), and we routinely speak of a potential Blazegraph failure <https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Blazegraph_failure_playbook> and the need to seek alternatives to that software.

One way that we might forestall a Blazegraph failure without disturbing people is to reduce the number of excess triples that actually need to be separately stored, and one place from which such triples might be taken out is the set of descriptions. Like it or not, there are certain classes of items that simply **//will not//** get descriptions more imaginative, customized, or detailed than the ones which have been added to them over time in different languages. Yet there are users whose entire existence on Wikidata, judging from their edit history, seems to consist of adding and maintaining these repetitive/unimaginative/etc. descriptions, and who need to run many batches of edits just to correct a single letter across millions of items.

An automatic description generation mechanism based on language and item class (following a P31/P279+ path, possibly involving a few other selected properties), whose outputs may be adjusted in exactly one place rather than in millions of items separately, would at least free these users from their labors, and would allow us to remove the excess of triples for their corresponding non-automatic but equally repetitive/unimaginative/etc.
counterparts.

Some classes of items that would dearly benefit from such a thing //immediately// include items for

1. scientific articles (33,000,000+),
2. Wikimedia categories (5,000,000+),
3. Wikimedia templates (~1,000,000),
4. stars (~3,000,000),
5. galaxies (~2,000,000),
6. Unicode characters (~150,000),
7. researchers (200,000+).

This is already nearly //half// the total number of items on Wikidata at the moment, and //there are likely more item classes// that are missing from this list, and //there are likely more items in the noted classes// that will add to the above numbers.

**//Note to developers and other maintainers: It is vigorously beseeched//** that this task not be closed as a duplicate of the previous task, since circumstances have significantly changed over the last six and a half years.

TASK DETAIL
https://phabricator.wikimedia.org/T303677

To: Mahir256
Cc: Mahir256, Aklapper, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
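
As a rough illustration of the mechanism proposed in the task description, the following minimal sketch looks up a description by language and item class, walking from an item's P31 values up through P279 ancestors until a class with a template is found. Everything here is an assumption made for illustration: the `TEMPLATES` and `SUBCLASS_OF` tables, the function name `automatic_description`, the placeholder class ID `Q_REVIEW`, and the description wordings are hypothetical, and the lookup also accepts a direct P31 match (that is, P31/P279* rather than strictly P31/P279+). It is not how Wikibase or the query service works today.

```python
from collections import deque

# Hypothetical template table: (class QID, language code) -> description.
# This table is the "exactly one place" where wording would be adjusted.
TEMPLATES = {
    ("Q13442814", "en"): "scholarly article",
    ("Q13442814", "de"): "wissenschaftlicher Artikel",
    ("Q4167836", "en"): "Wikimedia category",
    ("Q4167836", "de"): "Wikimedia-Kategorie",
}

# Toy subclass-of (P279) edges: child class -> parent classes.
# "Q_REVIEW" is a made-up placeholder, not a real Wikidata ID.
SUBCLASS_OF = {
    "Q_REVIEW": ["Q13442814"],
}


def automatic_description(instance_of, lang,
                          templates=TEMPLATES, subclass_of=SUBCLASS_OF):
    """Return the template of the nearest class reachable from the item's
    P31 values via zero or more P279 steps, or None if no class matches."""
    queue = deque(instance_of)
    seen = set(instance_of)
    while queue:
        cls = queue.popleft()
        text = templates.get((cls, lang))
        if text is not None:
            return text
        # No template for this class: continue with its parent classes.
        for parent in subclass_of.get(cls, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None


if __name__ == "__main__":
    print(automatic_description(["Q13442814"], "en"))
    print(automatic_description(["Q_REVIEW"], "de"))
    print(automatic_description(["Q5"], "en"))
```

Under this sketch, correcting a single letter in a description means editing one entry in `TEMPLATES` rather than running batches of edits across millions of items, and items covered by a template would need no stored description triple of their own.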