dcausse added a comment.

  I believe this problem is similar to what was reported in 
https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#Updater_issue_?
  
  My understanding of this problem is as follows:
  
  The Wikibase RDF model for sitelinks uses the URL of the link as the subject:
  
    <https://en.wikipedia.org/wiki/Eric_Brewer> a schema:Article ;
        schema:about wd:Q1342539 ;
        schema:inLanguage "en" ;
        schema:isPartOf <https://en.wikipedia.org/> ;
        schema:name "Eric Brewer"@en ;
        wikibase:badge wd:Q17437796 .
  
  When the sitelinks of an entity are altered, only the **link** between the 
Wikidata entity and the sitelink subject is removed. In the example above, only 
the triple `<https://en.wikipedia.org/wiki/Eric_Brewer> schema:about wd:Q1342539` 
is deleted, meaning that the following data:
  
    <https://en.wikipedia.org/wiki/Eric_Brewer> a schema:Article ;
        schema:inLanguage "en" ;
        schema:isPartOf <https://en.wikipedia.org/> ;
        schema:name "Eric Brewer"@en ;
        wikibase:badge wd:Q17437796 .
  
  will remain in Blazegraph as orphaned values. Orphaned values are a known 
problem, discussed in T302189 <https://phabricator.wikimedia.org/T302189> 
(TL;DR: removing orphaned values in real time might be costly).
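
  To make "orphaned" concrete, here is a minimal Python sketch, not the 
updater's code. It models the store as a plain set of (subject, predicate, 
object) tuples, and the helper name `orphaned_sitelinks` is made up for this 
illustration. A sitelink subject counts as orphaned when it is typed 
schema:Article but no schema:about triple points it at an entity.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
# Illustration only; this is not how Blazegraph or the updater stores data.
ARTICLE = "<https://en.wikipedia.org/wiki/Eric_Brewer>"

# State after the schema:about triple has been removed (see the data above).
store = {
    (ARTICLE, "rdf:type", "schema:Article"),
    (ARTICLE, "schema:inLanguage", '"en"'),
    (ARTICLE, "schema:isPartOf", "<https://en.wikipedia.org/>"),
    (ARTICLE, "schema:name", '"Eric Brewer"@en'),
    (ARTICLE, "wikibase:badge", "wd:Q17437796"),
}


def orphaned_sitelinks(triples):
    """Return article subjects that no longer have any schema:about triple."""
    articles = {s for (s, p, o) in triples
                if p == "rdf:type" and o == "schema:Article"}
    attached = {s for (s, p, o) in triples if p == "schema:about"}
    return articles - attached


orphans = orphaned_sitelinks(store)
```

  On a real store this check is a scan over all sitelink subjects, which 
hints at why doing it on every update could be expensive.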
  
  The additional problem here is that this orphaned sitelink will get 
re-attached to another entity if the same sitelink is later used by another 
Wikidata entity.
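
  The re-attachment can be sketched with the same kind of toy model in 
Python. Everything here is an assumption made for illustration: the set-based 
store, the helper `sitelink_triples`, and the placeholder `wd:Q_OTHER` 
standing in for the second entity. I also assume the second entity adds the 
sitelink without a badge, so that the leftover badge is visibly stale.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
ARTICLE = "<https://en.wikipedia.org/wiki/Eric_Brewer>"
OLD_ENTITY = "wd:Q1342539"
NEW_ENTITY = "wd:Q_OTHER"  # hypothetical placeholder, not a real item ID


def sitelink_triples(entity, badges=()):
    """Triples describing the sitelink for `entity`, as in the example above."""
    triples = {
        (ARTICLE, "rdf:type", "schema:Article"),
        (ARTICLE, "schema:about", entity),
        (ARTICLE, "schema:inLanguage", '"en"'),
        (ARTICLE, "schema:isPartOf", "<https://en.wikipedia.org/>"),
        (ARTICLE, "schema:name", '"Eric Brewer"@en'),
    }
    triples |= {(ARTICLE, "wikibase:badge", badge) for badge in badges}
    return triples


store = set()

# 1. The first entity holds the sitelink, with a badge.
store |= sitelink_triples(OLD_ENTITY, badges=["wd:Q17437796"])

# 2. The sitelink is removed from it: only the schema:about triple is deleted.
store.discard((ARTICLE, "schema:about", OLD_ENTITY))

# 3. Another entity adds the same sitelink, this time without a badge.
store |= sitelink_triples(NEW_ENTITY)

# The leftover badge is now reachable through the new entity's sitelink.
stale_badge = (ARTICLE, "wikibase:badge", "wd:Q17437796")
about = {o for (s, p, o) in store if s == ARTICLE and p == "schema:about"}
```

  At the end, `about` contains only the new entity while `stale_badge` is 
still in the store, so a query on the new entity's sitelink would return a 
badge that entity never had.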
  
  I could see three options here:
  
  1. Change the Wikibase RDF model so that orphaned sitelinks are less likely 
to be reused (use reification and never promote the sitelink to a subject, 
similar to what is done for complex values and references). This is a 
breaking change that is unlikely to be worthwhile.
  2. Consider that this problem is not common enough to fix, rely on more 
regular data reloads, and accept it as a known limitation of the update process.
  3. Attempt to clean up orphaned sitelinks in real time. This might not be 
entirely trivial but seems doable; the main concern is performance: what if 
the updater slows down too much?
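
  As a rough idea of what option 3 could look like, here is a toy Python 
sketch. It is not a proposal for the actual updater implementation: the 
set-based store and the function name `remove_sitelink` are assumptions made 
for illustration. When a sitelink is removed, the sketch also drops every 
remaining triple on that subject if no other entity is still attached to it.

```python
def remove_sitelink(store, article, entity):
    """Detach `article` from `entity` and clean up if it becomes orphaned.

    `store` is a set of (subject, predicate, object) tuples, modified in place.
    Returns the number of orphaned triples that were removed.
    """
    store.discard((article, "schema:about", entity))

    # Extra lookup per removed sitelink: is anything else still attached?
    still_attached = any(
        s == article and p == "schema:about" for (s, p, o) in store
    )
    if still_attached:
        return 0

    orphaned = {t for t in store if t[0] == article}
    store.difference_update(orphaned)
    return len(orphaned)


ARTICLE = "<https://en.wikipedia.org/wiki/Eric_Brewer>"
store = {
    (ARTICLE, "rdf:type", "schema:Article"),
    (ARTICLE, "schema:about", "wd:Q1342539"),
    (ARTICLE, "schema:inLanguage", '"en"'),
    (ARTICLE, "schema:isPartOf", "<https://en.wikipedia.org/>"),
    (ARTICLE, "schema:name", '"Eric Brewer"@en'),
    (ARTICLE, "wikibase:badge", "wd:Q17437796"),
    ("wd:Q1342539", "rdf:type", "wikibase:Item"),  # unrelated triple, kept
}

removed = remove_sitelink(store, ARTICLE, "wd:Q1342539")
```

  The `still_attached` check and the subject scan are the extra work per 
removed sitelink; on the real store they would be additional queries, which 
is where the performance concern comes from.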

TASK DETAIL
  https://phabricator.wikimedia.org/T323239
