Hi Dan,

Thanks for sharing!
Can you (or somebody else) tell me where the ticket for "*7. [ ] Design UI
to organize dashboards built around new data*" is?
I'd be interested and I may be able to help.

Jan

2016-12-03 17:38 GMT+01:00 Dan Andreescu <dandree...@wikimedia.org>:

> We're starting to wrap up the calendar year, here's what we've
> accomplished so far with Wikistats.  We're really excited to have some data
> in our production Hive database for people to play with.  We worked really
> hard to clean up and present an intuitive interface to all of mediawiki
> history.  The results are captured in the tables mentioned below, which
> we'll cover more in an upcoming tech talk.  Documentation for the project is
> here <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake>.
>
> Our goals so far and progress breakdown:
>
> 1. [done] Build pipeline to process and analyze *pageview* data
> 2. [done] Load pageview data into an *API*
> 3. [        ] *Sanitize* pageview data with more dimensions for public
> consumption
> 4. [ beta] Build pipeline to process and analyze *editing* data
> 5. [ beta] Load editing data into an *API*
> 6. [        ] *Sanitize* editing data for public consumption
> 7. [        ] *Design* UI to organize dashboards built around new data
> 8. [        ] Build enough *dashboards* to replace the main functionality
> of stats.wikipedia.org
> 9. [        ] Officially replace stats.wikipedia.org with *(maybe)
> analytics.wikipedia.org* <http://analytics.wikipedia.org/>
> ***. [         ] Bonus: *replace dumps generation* based on the new data
> pipelines
>
> 4 & 5.  Since our last update, we've finished the pipeline that imports
> data from mediawiki databases, cleans it up as best as possible, reshapes
> it in an analytics-friendly way, and makes it easily queryable.  I'm marking
> these goals as "beta" because we're still tweaking the algorithm for
> performance and productionizing the jobs.  This will be completed early
> next quarter, but in the meantime we have data for people to play with
> internally.  Sadly we haven't sanitized it yet so we can't publish it.  For
> those with internal access:
>
> * https://pivot.wikimedia.org/#edit-history-test is the full history
> across all wikis.  It's a bit hard to understand how to slice and dice, so
> we will host a tech talk and present it at the January metrics meeting if
> we can.
>
> * In Hive, you can access this data in the wmf database.  The tables are:
>     - wmf.mediawiki_history: denormalized full history with this schema
> <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Mediawiki_history>
>     - wmf.mediawiki_page_history: the sequence of states of each wiki page
> (schema
> <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Mediawiki_page_history>
> )
>     - wmf.mediawiki_user_history: the sequence of states of each user
> account (schema
> <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Mediawiki_user_history>
> )
>
> 6.  Sanitizing has not moved forward, as we need DBA time and they've been
> overloaded.  We will attempt to restart this effort in Q3.
>
> 7.  We have begun the design process and will share more about this as we go.
>
> Our goals and planning for next quarter support us finishing 4, 5, 7, and
> 8, so basically putting a UI on top of the data pipeline we have in place,
> and updating it weekly.  We also hope to have good progress on 6, but
> that depends on collaboration with the DBA team and is harder than we
> originally imagined.
>
> And remember, voice your opinions about important reports in the current
> Wikistats here:
> https://www.mediawiki.org/wiki/Analytics/Wikistats/DumpReports/Future_per_report
> (thank you so so much to the many people who already chimed in).
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
>


-- 
Jan Dittrich
UX Design/ User Research

Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
http://wikimedia.de

Imagine a world, in which every single human being can freely share in the
sum of all knowledge. That's our commitment.

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.