dcausse added a comment.
I believe this problem is similar to what was reported in https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#Updater_issue_?

My understanding of this problem is as follows. The Wikibase RDF for sitelinks uses the URL of the link as a subject:

  <https://en.wikipedia.org/wiki/Eric_Brewer> a schema:Article ;
      schema:about wd:Q1342539 ;
      schema:inLanguage "en" ;
      schema:isPartOf <https://en.wikipedia.org/> ;
      schema:name "Eric Brewer"@en ;
      wikibase:badge wd:Q17437796 .

When the sitelinks of an entity are altered, only the **link** between the Wikidata entity and the link subject is removed. In the example above that is only the triple `<https://en.wikipedia.org/wiki/Eric_Brewer> schema:about wd:Q1342539`, meaning that the following data:

  <https://en.wikipedia.org/wiki/Eric_Brewer> a schema:Article ;
      schema:inLanguage "en" ;
      schema:isPartOf <https://en.wikipedia.org/> ;
      schema:name "Eric Brewer"@en ;
      wikibase:badge wd:Q17437796 .

will remain in Blazegraph as orphaned values. Orphaned values are a known problem discussed in T302189 <https://phabricator.wikimedia.org/T302189> (TL;DR: removing orphaned values in real time might be costly). The additional problem here is that this orphaned sitelink will get re-attached to another entity if the same sitelink is used in another Wikidata entity.

I can see three options here:

1. Change the Wikibase RDF model so that it is less likely that orphaned sitelinks are reused (use reification and never promote the sitelink to a subject, similar to what's done for complex values and references). This is a breaking change that is unlikely to be worthwhile.
2. Consider this problem not common enough to fix, rely on more regular data reloads, and accept it as a known limitation of the update process.
3. Attempt to clean up orphaned sitelinks in real time. This might not be entirely trivial to do but seems doable; the main issue will be performance: what if the updater performance degrades too much?
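To make the mechanism concrete, here is a minimal sketch in Python that models the triple store as a plain set of (subject, predicate, object) tuples. It is only a toy model under stated assumptions, not the WDQS updater's code: the helper names (`sitelink_triples`, `remove_link_only`, `remove_orphaned_sitelinks`) and the placeholder entity `wd:Q_OTHER` are hypothetical, and the stale badge is just one illustrative symptom of an orphaned sitelink being re-attached.

```python
# Toy model of the orphaned-sitelink problem. The "store" is a set of
# (subject, predicate, object) string tuples; nothing here talks to Blazegraph.
# All helper names and wd:Q_OTHER are hypothetical, for illustration only.

SITELINK = "<https://en.wikipedia.org/wiki/Eric_Brewer>"


def sitelink_triples(entity, badge=None):
    """Triples for one sitelink of `entity`, shaped like the example above."""
    triples = {
        (SITELINK, "a", "schema:Article"),
        (SITELINK, "schema:about", entity),
        (SITELINK, "schema:inLanguage", '"en"'),
        (SITELINK, "schema:isPartOf", "<https://en.wikipedia.org/>"),
        (SITELINK, "schema:name", '"Eric Brewer"@en'),
    }
    if badge is not None:
        triples.add((SITELINK, "wikibase:badge", badge))
    return triples


def remove_link_only(store, entity):
    """Behaviour described in the comment: only schema:about is deleted."""
    return {t for t in store if not (t[1] == "schema:about" and t[2] == entity)}


def remove_orphaned_sitelinks(store):
    """Option 3 sketch: drop all triples of articles that have no schema:about."""
    articles = {s for (s, p, o) in store if p == "a" and o == "schema:Article"}
    attached = {s for (s, p, o) in store if p == "schema:about"}
    orphans = articles - attached
    return {t for t in store if t[0] not in orphans}


if __name__ == "__main__":
    badge = (SITELINK, "wikibase:badge", "wd:Q17437796")

    # The sitelink is removed from Q1342539: its other triples stay behind.
    store = sitelink_triples("wd:Q1342539", badge="wd:Q17437796")
    store = remove_link_only(store, "wd:Q1342539")

    # Without cleanup, adding the same sitelink to another entity re-attaches
    # the orphaned data, including the badge the new entity never declared.
    reattached = store | sitelink_triples("wd:Q_OTHER")
    print("badge leaked without cleanup:", badge in reattached)

    # With cleanup in between, the new entity only gets its own triples.
    fresh = remove_orphaned_sitelinks(store) | sitelink_triples("wd:Q_OTHER")
    print("badge leaked with cleanup:", badge in fresh)
```

In this toy version the cleanup scans the whole store, which hints at the performance concern in option 3: a real implementation would have to find the orphans for each update without that kind of full pass.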
TASK DETAIL
  https://phabricator.wikimedia.org/T323239