AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata, Wikidata Analytics (Kanban).
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Purpose
  -------
  
  In T362849: [Analytics] Segments of Wikidata's data over time <https://phabricator.wikimedia.org/T362849> we need to calculate historical segments of Wikidata's items based on their relation to sitelinks.
  
  Purpose from that ticket:
  
  > As Wikidata Product Managers, we would like to understand how different segments of Wikidata's data developed over time, so we can inform our projections.
  
  This task covers generating the historical data needed to achieve this.
  
  Scope
  -----
  
  From T362849 <https://phabricator.wikimedia.org/T362849>:
  
  > How did the number of Items of the following types develop over time?
  >
  >   A) Items that contain a sitelink to one of the Wikimedia projects (e.g. about a notable person)
  >   B) Items that are needed to build A (used in A Items for example in a statement or reference; e.g. the non-notable father of that notable person)
  >   C) All other Items
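
As an illustration of the three segments, here is a minimal sketch of how Items might be classified for a single snapshot. It assumes a simplified in-memory representation (the `Item` type, its fields, and `segment_items` are hypothetical names, not part of any existing job), that only direct use in a statement or reference of an A Item counts towards B, and that an Item with a sitelink always counts as A even if other A Items use it.

```python
from typing import Dict, FrozenSet, NamedTuple, Set


class Item(NamedTuple):
    """Hypothetical, simplified view of one Wikidata Item in a snapshot."""
    has_sitelink: bool               # True if the Item has at least one sitelink
    referenced_items: FrozenSet[str]  # QIDs used in its statements or references


def segment_items(items: Dict[str, Item]) -> Dict[str, Set[str]]:
    """Split Items into segments A, B and C as described in the scope."""
    # A: Items that contain a sitelink.
    a = {qid for qid, item in items.items() if item.has_sitelink}

    # Collect every QID that an A Item uses directly.
    used_by_a: Set[str] = set()
    for qid in a:
        used_by_a |= items[qid].referenced_items

    # B: Items used by A Items that are not themselves in A
    # (assumption: QIDs missing from the snapshot are ignored).
    b = (used_by_a & items.keys()) - a

    # C: everything else.
    c = set(items.keys()) - a - b
    return {"A": a, "B": b, "C": c}


example = {
    "Q1": Item(True, frozenset({"Q2"})),   # notable person with a sitelink
    "Q2": Item(False, frozenset()),        # father, only used by Q1
    "Q3": Item(False, frozenset({"Q2"})),  # unrelated Item without a sitelink
}
segments = segment_items(example)
```

In this example `Q1` falls into A, `Q2` into B and `Q3` into C. Whether indirect use (Items used by B Items) should also count towards B is not stated in the scope and would need to be decided separately.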
  
  
  
  - To do this, T363451: Add job to create Wikidata partition to wmf.mediawiki_wikitext_history <https://phabricator.wikimedia.org/T363451> was created to recreate the Wikidata partition of wmf.mediawiki_wikitext_history <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Mediawiki_wikitext_history>
  - Once T363451 is complete, work can begin on using this partition to generate all data from when Wikidata was created up to the most recent weekly data generated by the DAG created in T362849 <https://phabricator.wikimedia.org/T362849>
  
  Desired Output
  --------------
  
  - Weekly stats of the number of Items in categories A, B and C
  
  Acceptance criteria:
  
  [ ] Weekly historical breakdowns of populations A, B and C
    - These would be in the Data Lake and the published datasets
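
A rough sketch of the shape the weekly output could take, assuming each Item has already been assigned a segment for each snapshot and that there is at most one snapshot per week. The function names and the `(snapshot_date, segment)` input format are illustrative assumptions, not the format used by the actual DAG.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Dict, Iterable, Tuple


def week_start(day: date) -> date:
    """Return the Monday of the week that contains `day`."""
    return day - timedelta(days=day.weekday())


def weekly_segment_counts(
    snapshots: Iterable[Tuple[date, str]],
) -> Dict[Tuple[date, str], int]:
    """Count Items per (week, segment).

    `snapshots` yields one (snapshot_date, segment) pair per Item per
    snapshot; this assumes at most one snapshot per week, otherwise
    Items would be counted more than once.
    """
    counts: Counter = Counter()
    for snapshot_date, segment in snapshots:
        counts[(week_start(snapshot_date), segment)] += 1
    return dict(counts)


rows = [
    (date(2024, 4, 24), "A"),
    (date(2024, 4, 24), "A"),
    (date(2024, 4, 24), "B"),
    (date(2024, 4, 29), "C"),
]
weekly = weekly_segment_counts(rows)
```

Here `weekly` holds two A Items and one B Item for the week starting 2024-04-22, and one C Item for the week starting 2024-04-29.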
  
  ---
  
  **Information below this point is filled out by the Wikidata Analytics team.**
  
  General Planning
  ----------------
  
  Information is filled out by the analytics product manager.
  
  Assignee Planning
  -----------------
  
  Information is filled out by the assignee of this task.
  
  Estimation
  ----------
  
  Estimate:
  Actual:
  
  Sub Tasks
  ---------
  
  Full breakdown of the steps to complete this task:
  
  [ ] Step
  
  Data to be used
  ---------------
  
  See Analytics/Data_Lake 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake> for the breakdown of 
the data lake databases and tables.
  
  The following tables will be referenced in this task:
  
  - wmf.mediawiki_wikitext_history 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Mediawiki_wikitext_history>
  
  Notes and Questions
  -------------------
  
  Things that came up during the completion of this task, questions to be 
answered and follow up tasks:
  
  - Note

TASK DETAIL
  https://phabricator.wikimedia.org/T363583


To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331