Bearloga has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/368462 )

Change subject: metrics::maps: Add maplink+mapframe prevalence metrics
......................................................................

metrics::maps: Add maplink+mapframe prevalence metrics

Bug: T170022
Depends-On: I25573e2d552ef7388c83fbbefca6ceab94adacc8
Change-Id: I9a4fc59793d1a1f606781edc17dde04d435e8c8d
---
M CHANGELOG.md
M README.md
M docs/README.md
M modules/metrics/maps/config.yaml
A modules/metrics/maps/mapframe_prevalence
A modules/metrics/maps/maplink_prevalence
A modules/metrics/maps/prevalence.R
A modules/metrics/maps/prevalence.yaml
M test.R
9 files changed, 458 insertions(+), 4 deletions(-)

Approvals:
  Bearloga: Verified; Looks good to me, approved

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 555e9b3..6a3cece 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,9 @@
 # Change Log (Patch Notes)
 All notable changes to this project will be documented in this file.
 
+## 2017/08/01
+- Added maplink and mapframe prevalence tracking across wikis ([T170022](https://phabricator.wikimedia.org/T170022))
+
 ## 2017/07/27
 - Prepared for Puppetized runs ([T170494](https://phabricator.wikimedia.org/T170494))
 
diff --git a/README.md b/README.md
index a2584ad..47b646f 100644
--- a/README.md
+++ b/README.md
@@ -49,7 +49,7 @@
   c("devtools", "testthat", "Rcpp",
     "tidyverse", "data.table", "plyr",
     "optparse", "yaml", "data.tree",
-    "ISOcodes", "knitr",
+    "ISOcodes", "knitr", "glue",
     # For wmf:
     "urltools", "ggthemes", "pwr",
     # For polloi's datavis functions:
@@ -171,11 +171,15 @@
   - [x] GeoFeatures ([T112311](https://phabricator.wikimedia.org/T112311))
     - [x] [Actions per tool](modules/metrics/maps/actions_per_tool.sql)
     - [x] [Users per feature](modules/metrics/maps/users_per_feature.sql)
-  - [x] Kartographer usage
+  - [x] Kartotherian usage
     - [x] [Users by country](modules/metrics/maps/users_by_country) ([T119448](https://phabricator.wikimedia.org/T119448))
     - [x] Tile requests ([T113832](https://phabricator.wikimedia.org/T113832))
       - [x] [No automata](modules/metrics/maps/tile_aggregates_no_automata)
       - [x] [With automata](modules/metrics/maps/tile_aggregates_with_automata)
+  - [ ] Maps prevalence on wikis ([T170022](https://phabricator.wikimedia.org/T170022))
+    - [ ] [Maplinks](modules/metrics/maps/maplink_prevalence)
+    - [ ] [Mapframes](modules/metrics/maps/mapframe_prevalence)
+  - Kartographer usage (planned)
   - KPIs (planned)
 - [x] External Traffic ([configuration](modules/metrics/external_traffic/config.yaml))
   - [x] [Referer data](modules/metrics/external_traffic/referer_data) ([T116295](https://phabricator.wikimedia.org/T116295), [Change 247601](https://gerrit.wikimedia.org/r/#/c/247601/))
diff --git a/docs/README.md b/docs/README.md
index 4f595f0..f1ea48c 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -8,7 +8,7 @@
 infrastructure. These datasets provide the metrics that are used by
 [Discovery's Dashboards](https://discovery.wmflabs.org/)
 
-Last updated on 26 June 2017
+Last updated on 01 August 2017
 
 Daily Metrics
 -------------
@@ -34,6 +34,10 @@
   level, etc.
 - **tile\_aggregates\_no\_automata.tsv**: Tile counts by style, zoom
   level, etc., excluding those made by bots/tools
+- **mapframe\_prevalence.tsv**: Proportion of articles on a wiki that
+  have a mapframe
+- **maplink\_prevalence.tsv**: Proportion of articles on a wiki that
+  have a maplink
 
 portal/
 -------
diff --git a/modules/metrics/maps/config.yaml b/modules/metrics/maps/config.yaml
index d0dae6e..22e1f41 100644
--- a/modules/metrics/maps/config.yaml
+++ b/modules/metrics/maps/config.yaml
@@ -37,3 +37,15 @@
     starts: 2015-12-10
     funnel: true
     type: script
+  mapframe_prevalence:
+    description: Proportion of articles on a wiki that have a mapframe
+    granularity: days
+    starts: 2017-08-01 # this will need to be set to when patch goes live, we can't backfill this data
+    funnel: true
+    type: script
+  maplink_prevalence:
+    description: Proportion of articles on a wiki that have a maplink
+    granularity: days
+    starts: 2017-08-01 # this will need to be set to when patch goes live, we can't backfill this data
+    funnel: true
+    type: script
diff --git a/modules/metrics/maps/mapframe_prevalence b/modules/metrics/maps/mapframe_prevalence
new file mode 100755
index 0000000..a5fcc75
--- /dev/null
+++ b/modules/metrics/maps/mapframe_prevalence
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+Rscript modules/metrics/maps/prevalence.R -d $1 -o mapframe
diff --git a/modules/metrics/maps/maplink_prevalence b/modules/metrics/maps/maplink_prevalence
new file mode 100755
index 0000000..1853a30
--- /dev/null
+++ b/modules/metrics/maps/maplink_prevalence
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+Rscript modules/metrics/maps/prevalence.R -d $1 -o maplink
diff --git a/modules/metrics/maps/prevalence.R b/modules/metrics/maps/prevalence.R
new file mode 100644
index 0000000..0be1f27
--- /dev/null
+++ b/modules/metrics/maps/prevalence.R
@@ -0,0 +1,85 @@
+#!/usr/bin/env Rscript
+
+source("config.R")
+.libPaths(r_library)
+suppressPackageStartupMessages({
+  library("optparse")
+  library("glue")
+})
+
+option_list <- list(
+  make_option(c("-d", "--date"), default = NA, action = "store", type = "character",
+              help = "Warning: this metric cannot be backfilled."),
+  make_option(c("-o", "--output"), default = NA, action = "store", type = "character",
+              help = "Available:
+                  * maplink
+                  * mapframe")
+)
+
+# Get command line options, if help option encountered print help and exit,
+# otherwise if options not found on command line then set defaults:
+opt <- parse_args(OptionParser(option_list = option_list))
+
+if (is.na(opt$date) || !(opt$output %in% c("mapframe", "maplink")) ) {
+  quit(save = "no", status = 1)
+}
+
+enabled <- yaml::yaml.load_file("modules/metrics/maps/prevalence.yaml")
+
+prevalence_query <- function(type, wiki) {
+  prop_name <- ifelse(type == "maplink", "kartographer_links", "kartographer_frames")
+  ns <- ifelse(wiki == "commonswiki", 6, 0)
+  query <- glue("SELECT
+  COUNT(*) AS total_articles,
+  SUM(IF({type}s > 0, 1, 0)) AS {type}_articles,
+  SUM(COALESCE({type}s, 0)) AS total_{type}s
+FROM (
+  SELECT
+    page.page_id,
+    pp_value AS {type}s
+  FROM (
+    SELECT pp_page, pp_value
+    FROM page_props
+    WHERE pp_propname = '{prop_name}' AND pp_value > 0
+  ) AS filtered_props
+  RIGHT JOIN page ON page.page_id = filtered_props.pp_page AND page.page_namespace = {ns}
+) joined_tables;")
+  return(query)
+}
+
+if (opt$output == "mapframe") {
+  wikis <- c(
+    enabled$mapframe$wikipedias,
+    enabled$mapframe$miscellaneous,
+    setdiff(enabled$maplink$wikivyoages, enabled$mapframe$wikivoyages)
+  )
+} else {
+  wikis <- unname(unlist(enabled$maplink))
+}
+
+# We can keep the ID format since the full name of each wiki
+# won't be shown on the dashboard, just daily aggregates but
+# we still want to keep a daily raw per-wiki breakdown.
+names(wikis) <- wikis
+
+# Fetch data from MySQL database:
+results <- dplyr::bind_rows(lapply(wikis, function(wiki) {
+  result <- tryCatch(
+    suppressMessages(wmf::mysql_read(
+      prevalence_query(type = opt$output, wiki),
+      wiki
+    )),
+    error = function(e) {
+      return(data.frame())
+    }
+  )
+  return(result)
+}), .id = "wiki")
+
+output <- cbind(
+  date = as.Date(opt$date, "%Y%m%d"),
+  results[, union("wiki", colnames(results))]
+)
+
+write.table(output, file = "", append = FALSE, sep = "\t", row.names = FALSE, quote = FALSE)
+
diff --git a/modules/metrics/maps/prevalence.yaml b/modules/metrics/maps/prevalence.yaml
new file mode 100644
index 0000000..c67d65a
--- /dev/null
+++ b/modules/metrics/maps/prevalence.yaml
@@ -0,0 +1,340 @@
+mapframe:
+  wikipedias: # enabled for the following:
+    - ruwiki
+    - cawiki
+    - hewiki
+    - mkwiki
+    - frwiki
+    - fiwiki
+    - nowiki
+    - svwiki
+    - cswiki # as of July 2017 (T171805)
+    - uawiki # as of August 2017 (T171805)
+    - euwiki # as of August 2017 (T171805)
+    - ptwiki # as of August 2017 (T171805)
+  miscellaneous: # enabled for the following:
+    - mediawikiwiki
+    - metawiki
+    - commonswiki
+  wikivoyages: # enabled for all *except* the following:
+    - hewikivoyage # https://phabricator.wikimedia.org/T170976#3471701
+maplink:
+  miscellaneous: # enabled for the following:
+    - mediawikiwiki
+    - metawiki
+    - commonswiki
+  wikivyoages: # enabled for the following:
+    - dewikivoyage
+    - elwikivoyage
+    - enwikivoyage
+    - eswikivoyage
+    - fawikivoyage
+    - fiwikivoyage
+    - frwikivoyage
+    - hewikivoyage
+    - itwikivoyage
+    - nlwikivoyage
+    - plwikivoyage
+    - ptwikivoyage
+    - rowikivoyage
+    - ruwikivoyage
+    - svwikivoyage
+    - ukwikivoyage
+    - viwikivoyage
+    - zhwikivoyage
+  wikipedias: # enabled for the following:
+    - aawiki
+    - abwiki
+    - acewiki
+    - adywiki
+    - afwiki
+    - akwiki
+    - alswiki
+    - amwiki
+    - angwiki
+    - anwiki
+    - arcwiki
+    - arwiki
+    - arzwiki
+    - astwiki
+    - aswiki
+    - avwiki
+    - aywiki
+    - azbwiki
+    - azwiki
+    - barwiki
+    - bat_smgwiki
+    - bawiki
+    - bclwiki
+    - bewiki
+    - bgwiki
+    - bhwiki
+    - biwiki
+    - bjnwiki
+    - bmwiki
+    - bnwiki
+    - bowiki
+    - bpywiki
+    - brwiki
+    - bswiki
+    - bugwiki
+    - bxrwiki
+    - cawiki
+    - cbk_zamwiki
+    - cdowiki
+    - cebwiki
+    - cewiki
+    - chowiki
+    - chrwiki
+    - chwiki
+    - chywiki
+    - ckbwiki
+    - cowiki
+    - crhwiki
+    - crwiki
+    - csbwiki
+    - cswiki
+    - cuwiki
+    - cvwiki
+    - cywiki
+    - dawiki
+    - dewiki
+    - dinwiki
+    - diqwiki
+    - dsbwiki
+    - dtywiki
+    - dvwiki
+    - dzwiki
+    - eewiki
+    - elwiki
+    - emlwiki
+    - enwiki
+    - eowiki
+    - eswiki
+    - etwiki
+    - euwiki
+    - extwiki
+    - fawiki
+    - ffwiki
+    - fiu_vrowiki
+    - fiwiki
+    - fjwiki
+    - fowiki
+    - frpwiki
+    - frrwiki
+    - frwiki
+    - furwiki
+    - fywiki
+    - gagwiki
+    - ganwiki
+    - gawiki
+    - gdwiki
+    - glkwiki
+    - glwiki
+    - gnwiki
+    - gomwiki
+    - gotwiki
+    - guwiki
+    - gvwiki
+    - hakwiki
+    - hawiki
+    - hawwiki
+    - hewiki
+    - hifwiki
+    - hiwiki
+    - howiki
+    - hrwiki
+    - hsbwiki
+    - htwiki
+    - huwiki
+    - hywiki
+    - hzwiki
+    - iawiki
+    - idwiki
+    - iewiki
+    - igwiki
+    - iiwiki
+    - ikwiki
+    - ilowiki
+    - iowiki
+    - iswiki
+    - itwiki
+    - iuwiki
+    - jamwiki
+    - jawiki
+    - jbowiki
+    - jvwiki
+    - kaawiki
+    - kabwiki
+    - kawiki
+    - kbdwiki
+    - kbpwiki
+    - kgwiki
+    - kiwiki
+    - kjwiki
+    - kkwiki
+    - klwiki
+    - kmwiki
+    - knwiki
+    - koiwiki
+    - kowiki
+    - krcwiki
+    - krwiki
+    - kshwiki
+    - kswiki
+    - kuwiki
+    - kvwiki
+    - kwwiki
+    - kywiki
+    - ladwiki
+    - lawiki
+    - lbewiki
+    - lbwiki
+    - lezwiki
+    - lgwiki
+    - lijwiki
+    - liwiki
+    - lmowiki
+    - lnwiki
+    - lowiki
+    - lrcwiki
+    - ltgwiki
+    - ltwiki
+    - lvwiki
+    - maiwiki
+    - map_bmswiki
+    - mdfwiki
+    - mgwiki
+    - mhrwiki
+    - mhwiki
+    - minwiki
+    - miwiki
+    - mkwiki
+    - mlwiki
+    - mnwiki
+    - mowiki
+    - mrjwiki
+    - mrwiki
+    - mswiki
+    - mtwiki
+    - muswiki
+    - mwlwiki
+    - myvwiki
+    - mywiki
+    - mznwiki
+    - nahwiki
+    - napwiki
+    - nawiki
+    - nds_nlwiki
+    - ndswiki
+    - newiki
+    - newwiki
+    - ngwiki
+    - nlwiki
+    - nnwiki
+    - novwiki
+    - nowiki
+    - nrmwiki
+    - nsowiki
+    - nvwiki
+    - nywiki
+    - ocwiki
+    - olowiki
+    - omwiki
+    - orwiki
+    - oswiki
+    - pagwiki
+    - pamwiki
+    - papwiki
+    - pawiki
+    - pcdwiki
+    - pdcwiki
+    - pflwiki
+    - pihwiki
+    - piwiki
+    - plwiki
+    - pmswiki
+    - pnbwiki
+    - pntwiki
+    - pswiki
+    - ptwiki
+    - quwiki
+    - rmwiki
+    - rmywiki
+    - rnwiki
+    - roa_rupwiki
+    - roa_tarawiki
+    - rowiki
+    - ruewiki
+    - ruwiki
+    - rwwiki
+    - sahwiki
+    - sawiki
+    - scnwiki
+    - scowiki
+    - scwiki
+    - sdwiki
+    - sewiki
+    - sgwiki
+    - shwiki
+    - siwiki
+    - skwiki
+    - slwiki
+    - smwiki
+    - snwiki
+    - sowiki
+    - sqwiki
+    - srnwiki
+    - srwiki
+    - sswiki
+    - stqwiki
+    - stwiki
+    - suwiki
+    - svwiki
+    - swwiki
+    - szlwiki
+    - tawiki
+    - tcywiki
+    - tetwiki
+    - tewiki
+    - tgwiki
+    - thwiki
+    - tiwiki
+    - tkwiki
+    - tlwiki
+    - tnwiki
+    - towiki
+    - tpiwiki
+    - trwiki
+    - tswiki
+    - ttwiki
+    - tumwiki
+    - twwiki
+    - tyvwiki
+    - tywiki
+    - udmwiki
+    - ugwiki
+    - ukwiki
+    - urwiki
+    - uzwiki
+    - vecwiki
+    - vepwiki
+    - vewiki
+    - viwiki
+    - vlswiki
+    - vowiki
+    - warwiki
+    - wawiki
+    - wowiki
+    - wuuwiki
+    - xalwiki
+    - xhwiki
+    - xmfwiki
+    - yiwiki
+    - yowiki
+    - zawiki
+    - zeawiki
+    - zh_classicalwiki
+    - zh_yuewiki
+    - zhwiki
+    - zuwiki
diff --git a/test.R b/test.R
index adc8312..05267b5 100644
--- a/test.R
+++ b/test.R
@@ -9,7 +9,7 @@
   "devtools", "testthat", "Rcpp",
   "tidyverse", "data.table", "plyr",
   "optparse", "yaml", "data.tree",
-  "knitr",
+  "knitr", "glue",
   # For forecasting modules:
   "bsts", "forecast", "prophet",
   # For querying, etc.:

-- 
To view, visit https://gerrit.wikimedia.org/r/368462

Gerrit-MessageType: merged
Gerrit-Change-Id: I9a4fc59793d1a1f606781edc17dde04d435e8c8d
Gerrit-PatchSet: 2
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga <[email protected]>
Gerrit-Reviewer: Bearloga <[email protected]>
Gerrit-Reviewer: Chelsyx <[email protected]>
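
Note on the output format: docs/README.md describes mapframe_prevalence.tsv and maplink_prevalence.tsv as proportions, while prevalence.R as merged writes raw per-wiki counts (date, wiki, total_articles, {type}_articles, total_{type}s), so the proportion presumably has to be derived downstream. The following is only a minimal sketch of that step, assuming the TSV has a header row and the column order above. The function name prevalence_proportions and the sample row are hypothetical and are not part of this change.

```shell
#!/bin/bash

# Hypothetical helper (not part of the merged change): read the TSV written by
# prevalence.R on stdin and print date, wiki and the share of articles that
# carry at least one maplink/mapframe.
# Assumed column order: date, wiki, total_articles, <type>_articles, total_<type>s
prevalence_proportions() {
  awk -F'\t' -v OFS='\t' '
    NR == 1 { print "date", "wiki", "proportion"; next }   # replace the header row
    $3 > 0  { printf "%s\t%s\t%.4f\n", $1, $2, $4 / $3 }   # skip wikis with no articles
  '
}

# Made-up sample row, for illustration only:
printf 'date\twiki\ttotal_articles\tmapframe_articles\ttotal_mapframes\n2017-08-01\ttestwiki\t200\t50\t75\n' \
  | prevalence_proportions
```

Rows with total_articles equal to 0 are skipped to avoid a division by zero. A wiki whose query failed is already absent from the input, because the script returns an empty data frame for it.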
