Hi Dan, Thanks for sharing! Can you (or somebody else) tell me where the ticket for "*7. [ ] Design UI to organize dashboards built around new data*" is? I'd be interested and I may be able to help.
Jan

2016-12-03 17:38 GMT+01:00 Dan Andreescu <dandree...@wikimedia.org>:

> We're starting to wrap up the calendar year, here's what we've
> accomplished so far with Wikistats. We're really excited to have some data
> in our production Hive database for people to play with. We worked really
> hard to clean up and present an intuitive interface to all of mediawiki
> history. The results are captured in the tables mentioned below, which
> we'll cover more in an upcoming tech talk. Documentation for the project is
> here <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake>.
>
> Our goals so far and progress breakdown:
>
> 1. [done] Build pipeline to process and analyze *pageview* data
> 2. [done] Load pageview data into an *API*
> 3. [    ] *Sanitize* pageview data with more dimensions for public
>           consumption
> 4. [beta] Build pipeline to process and analyze *editing* data
> 5. [beta] Load editing data into an *API*
> 6. [    ] *Sanitize* editing data for public consumption
> 7. [    ] *Design* UI to organize dashboards built around new data
> 8. [    ] Build enough *dashboards* to replace the main functionality
>           of stats.wikipedia.org
> 9. [    ] Officially Replace stats.wikipedia.org with *(maybe)
>           analytics.wikipedia.org <http://analytics.wikipedia.org/>*
> ***. [  ] Bonus: *replace dumps generation* based on the new data
>           pipelines
>
> 4 & 5. Since our last update, we've finished the pipeline that imports
> data from mediawiki databases, cleans it up as best as possible, reshapes
> it in an analytics-friendly way, and makes it easily queryable. I'm marking
> these goals as "beta" because we're still tweaking the algorithm for
> performance and productionizing the jobs. This will be completed early
> next quarter, but in the meantime we have data for people to play with
> internally. Sadly we haven't sanitized it yet so we can't publish it. For
> those with internal access:
>
> * https://pivot.wikimedia.org/#edit-history-test is the full history
>   across all wikis. It's a bit hard to understand how to slice and dice, so
>   we will host a tech talk and present it at the January metrics meeting if
>   we can.
>
> * In hive, you can access this data in the wmf database, the tables are:
>   - wmf.mediawiki_history: denormalized full history with this schema
>     <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Mediawiki_history>
>   - wmf.mediawiki_page_history: the sequence of states of each wiki page
>     (schema
>     <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Mediawiki_page_history>)
>   - wmf.mediawiki_user_history: the sequence of states of each user
>     account (schema
>     <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Mediawiki_user_history>)
>
> 6. Sanitizing has not moved forward, as we need DBA time and they've been
> overloaded. We will attempt to restart this effort in Q3.
>
> 7. We have begun the design process, we'll share more about this as we go.
>
> Our goals and planning for next quarter support us finishing 4, 5, 7, and
> 8, so basically putting a UI on top of the data pipeline we have in place,
> and updating it weekly. We also hope to have good progress on 6, but
> that depends on collaboration with the DBA team and is harder than we
> originally imagined.
>
> And remember, voice your opinions about important reports in the current
> Wikistats here:
> https://www.mediawiki.org/wiki/Analytics/Wikistats/DumpReports/Future_per_report
> (thank you so so much to the many people who already chimed in).
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

--
Jan Dittrich
UX Design/ User Research

Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
http://wikimedia.de

Imagine a world, in which every single human being can freely share in the
sum of all knowledge. That's our commitment.

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.