WMDE-leszek created this task.
WMDE-leszek added projects: Wikidata, Wikidata Tainted References.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  **Context**: The Tainted References feature on Wikidata is intended to make 
mismatched statement value/reference pairs more prominent to Wikidata editors.
  
  **Definitions:**
  
  - tainted reference: mismatching statement value and reference pair
  - edit triggering a tainted reference: edit changing exclusively a value of 
the statement
  - edit cleaning the tainted reference: one of the following
    - edit changing the reference of the statement on which tainted reference 
has been previously triggered.
    - edit removing the reference of the statement on which tainted reference 
has been previously triggered.
    - edit reverting the edit triggering a tainted reference
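
The definitions above could be expressed roughly as follows. This is only a sketch: it assumes statements are available as Wikibase-style JSON dicts with `mainsnak`, `qualifiers`, `rank` and `references` keys, and the function name and return labels are hypothetical. It also reads "changing exclusively a value" as "mainsnak changed, everything else unchanged". Reverts are not detected here, since that requires the revision before the triggering edit.

```python
def classify_statement_edit(before, after):
    """Classify one statement change (labels are hypothetical).

    `before` and `after` are the JSON dicts of the same statement in the
    parent revision and in the edited revision.
    """
    refs_before = before.get("references") or []
    refs_after = after.get("references") or []

    # Assumption: a statement without references cannot become tainted.
    if not refs_before:
        return "other"

    value_changed = before.get("mainsnak") != after.get("mainsnak")
    others_unchanged = refs_before == refs_after and all(
        before.get(key) == after.get(key) for key in ("qualifiers", "rank")
    )

    if value_changed and others_unchanged:
        return "trigger"            # only the value changed
    if not refs_after:
        return "reference_removed"  # candidate cleaning edit
    if refs_before != refs_after:
        return "reference_changed"  # candidate cleaning edit
    return "other"
```

A "reference_changed" or "reference_removed" result only counts as cleaning if a tainted reference was previously triggered on that statement.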
  
  **Goal**: Fewer mismatching value/reference pairs exist.
  
  We want to measure how many tainted references are **triggered**, and how 
many of these are being **cleaned**.
  To have comparable figures, we need baseline values for the period 
before enabling the new feature (a baseline does not exist yet).
  
  In the first iteration we only need to look at the next edit by the same 
author, making the data simpler, but we might want to extend this later.
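
A minimal sketch of this first-iteration count, assuming edits have already been classified and are ordered chronologically. The `Edit` record, the `kind` labels and the function name are hypothetical. "Next edit by the same author" is read here as the author's next edit to the same statement.

```python
from collections import namedtuple

# Hypothetical record: one classified edit to one statement.
Edit = namedtuple("Edit", "statement_id user kind")

# Hypothetical labels for the three kinds of cleaning edit.
CLEANING_KINDS = {"reference_changed", "reference_removed", "revert"}


def count_triggered_and_cleaned(edits):
    """Return (triggered, cleaned) for a chronologically ordered edit list."""
    triggered = 0
    cleaned = 0
    for index, edit in enumerate(edits):
        if edit.kind != "trigger":
            continue
        triggered += 1
        # Only look at the same author's next edit to the same statement.
        follow_up = next(
            (
                later
                for later in edits[index + 1:]
                if later.statement_id == edit.statement_id
                and later.user == edit.user
            ),
            None,
        )
        if follow_up is not None and follow_up.kind in CLEANING_KINDS:
            cleaned += 1
    return triggered, cleaned
```

Running the same function over edits from before and after the feature is enabled would give the two sets of figures to compare.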
  
  **Goal:** Triggered mismatches do get cleaned up and don’t pile up.
  
  We want to measure how many of the tainted references that have been 
**triggered** are eventually **cleaned**, and how long it takes.
  Again, we would need to compare with a baseline. This metric is related 
to the previous one (at least conceptually; technically they might be measured 
completely separately).
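
One possible way to summarize this metric, assuming each triggered tainted reference has already been paired with the timestamp of its cleaning edit (or `None` if it has not been cleaned). The function name and input shape are hypothetical, and the median is just one choice of summary.

```python
from datetime import datetime, timedelta
from statistics import median


def cleanup_summary(pairs):
    """Summarize (triggered_at, cleaned_at) pairs.

    `cleaned_at` is None for tainted references that were never cleaned.
    Returns (share_cleaned, median_time_to_clean); the median is None
    when nothing has been cleaned.
    """
    durations = [
        cleaned_at - triggered_at
        for triggered_at, cleaned_at in pairs
        if cleaned_at is not None
    ]
    share_cleaned = len(durations) / len(pairs) if pairs else 0.0
    median_duration = median(durations) if durations else None
    return share_cleaned, median_duration
```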
  
  **Technical considerations**
  
  - Wikibase does not help much with identifying triggering and cleaning edits
  - Edits (MediaWiki revisions) changing a statement in any way (without much 
detail on what has changed: value, reference, qualifier, or a combination of 
these) could be filtered by considering only revisions with the `comment` field 
containing a value of the format `/* wbsetclaim-update:N||N */ 
[[Property:PNNN]]: XYZ`, where N, NNN, and XYZ are actual numbers/values.
  - Further reasoning about what the edit changed might only be possible by 
inspecting the change made by the edit (revision), i.e. comparing the JSON 
object representation of the item before and after the edit
  - For identifying revisions (edits) changing the same statement (e.g. to be 
able to recognize whether the tainted reference has been cleaned), relying on 
the statement's unique ID might help. It will still likely involve analyzing 
the JSON structure of the item data, as the identifier of the statement is not 
exposed in the `comment` or any other field.
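
The filtering and lookup steps above might look roughly like this. The regular expression follows the comment format quoted in the list. The helper names are hypothetical, and the item layout (a `claims` mapping from property ID to a list of statements, each with an `id`) is assumed to match the Wikibase item JSON.

```python
import re

# Matches e.g. "/* wbsetclaim-update:2||1 */ [[Property:P569]]: 11 March 1952"
SETCLAIM_UPDATE = re.compile(
    r"^/\* wbsetclaim-update:(\d+)\|\|(\d+) \*/ \[\[Property:(P\d+)\]\]: (.*)$",
    re.DOTALL,
)


def property_from_comment(comment):
    """Return the property ID if the comment is a statement update, else None."""
    match = SETCLAIM_UPDATE.match(comment)
    return match.group(3) if match else None


def find_statement(item, statement_id):
    """Find a statement by its unique ID in an item's JSON, or return None."""
    for statements in item.get("claims", {}).values():
        for statement in statements:
            if statement.get("id") == statement_id:
                return statement
    return None
```

Looking up the same statement ID in the item JSON of a revision and of its parent revision would give the "before" and "after" objects to compare.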

TASK DETAIL
  https://phabricator.wikimedia.org/T240466

To: WMDE-leszek
Cc: Aklapper, Addshore, Jan_Dittrich, hoo, rosalieper, noarave, Tarrow, 
Lydia_Pintscher, GoranSMilovanovic, WMDE-leszek, Sarai-WMDE, darthmon_wmde, 
DannyS712, Nandana, Lahi, Gq86, QZanden, LawExplorer, _jensen, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331