[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2024-04-08 Thread Manuel
Manuel closed subtask T341330: [Analytics] Airflow implementation of unique ips accessing Wikidatas REST API metrics as Resolved. TASK DETAIL https://phabricator.wikimedia.org/T334558 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To:

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2024-03-14 Thread Manuel
Manuel changed the status of subtask T341330: [Analytics] Airflow implementation of unique ips accessing Wikidatas REST API metrics from Stalled to Open.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-10-17 Thread Manuel
Manuel changed the status of subtask T341330: [Analytics] Airflow implementation of unique ips accessing Wikidatas REST API metrics from Open to Stalled.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-24 Thread Manuel
Manuel moved this task from Needs product input to Needs product sign-off on the Wikidata Analytics (Kanban) board. Manuel closed this task as "Resolved". Manuel added a comment. > The above was still to be done Ah, I see! Yes, I agree, we have talked about this and we are trying out 30

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-24 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. > [ ] Let's discuss what we can do to make the metric more robust and reliable (e.g. exclude browser user agents) The above was still to be done, but I'd say it's finished and this is good to close :)

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-24 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T334558 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: AndrewTavis_WMDE Cc: mforns, xcollazo, Ottomata, lbowmaker, WMDE-leszek, AndrewTavis_WMDE, Michael,

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-24 Thread Manuel
Manuel added a comment. Are there still open questions that need my input or are we done and I only need to sign off (and close) the task?

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. Sure, I'll head over to T341330 and edit the task so it's aligned with what we discussed :)

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-18 Thread Manuel
Manuel added a comment. Thank you! About T341330 : - We last aligned on creating running 30 days only for unique IPs and not for user agents (as they are not robust enough for continuous monitoring). - That means that we will need the

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. It's now the following: > filtering out `user_agent` values that match bot generated versions Let me know if you'd like me to post the query here as well :)

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. Hey @Manuel, `126` and `122` are a result of what it is that we aligned on, yes :) I just didn't update the description properly. Will do so now.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-13 Thread Manuel
Manuel added a comment. @AndrewTavis_WMDE: Thx for the update! We last aligned on completely removing all user agents that fit the malicious pattern. Are 126 for June and 122 for May the result of this? I am asking because the task's description says "filtering out **ips** with user_agent

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-13 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. @Manuel, the task has now been updated with all the metrics. Took a bit for all the queries to run :) Let me know if you have further thoughts on the final subtask: > [ ] Let's discuss what we can do to make the metric more robust and reliable (e.g.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-13 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-13 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-13 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-13 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-12 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. For the initial report where we just want to count certain `user_agent` values as one via their ip: - Match it with regular expressions within the SQL as the initial try - Check Python packages that help with `user_agent` values
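
A minimal Python sketch of the idea in the comment above, counting certain `user_agent` values as one via their ip. The thread proposes doing the matching with regular expressions in SQL and does not show the actual pattern, so the regex, the function name, and the sample rows below are all hypothetical and only illustrate the counting logic:

```python
import re

# Hypothetical pattern for script-generated user agents.
# The actual expression used in the task's query is not shown in the thread.
GENERATED_UA = re.compile(r"^python-requests/\d+(\.\d+)*$")


def count_unique_clients(rows):
    """Count distinct clients from (user_agent, ip) pairs.

    A user agent that matches GENERATED_UA is identified by its ip, so
    several generated agents from one address count once. Any other
    user agent is identified by its own value.
    """
    keys = set()
    for user_agent, ip in rows:
        if GENERATED_UA.match(user_agent):
            keys.add(("ip", ip))
        else:
            keys.add(("user_agent", user_agent))
    return len(keys)


# Made-up sample data using documentation-reserved addresses.
sample_rows = [
    ("python-requests/2.28.1", "192.0.2.1"),
    ("python-requests/2.31.0", "192.0.2.1"),
    ("python-requests/2.31.0", "192.0.2.2"),
    ("MyWikidataTool/1.0 (contact@example.org)", "192.0.2.3"),
    ("MyWikidataTool/1.0 (contact@example.org)", "192.0.2.4"),
]

print(count_unique_clients(sample_rows))
```

In this sketch the two generated agents sharing one address collapse into a single client, while the named tool is counted once regardless of how many addresses it is seen from.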

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-12 Thread Manuel
Manuel added a comment. > In talking a bit with folks at the Data Platform Engineering Office Hours the general sentiment was that WMF is currently hashing user_agent and ip for "unique" users, but that there's work to be done here. A hash of User Agent and IP would make no sense in our

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-11 Thread Manuel
Manuel added a comment. > Is the purpose of this metric to define unique users of the API? The purpose of the metric is to approximate unique developers that are using the new API, yes. > If at the end we're reporting unique users and we have the caveat that it's undershooting by

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. In talking a bit with folks at the Data Platform Engineering Office Hours the general sentiment was that WMF is currently hashing `user_agent` and `ip` for "unique" users, but that there's work to be done here. There's a fair amount of entropy in this
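
As a rough illustration of the approach described above, hashing `user_agent` and `ip` to approximate "unique" users, the sketch below compares that count with a plain unique-ip count. The hash function (SHA-256), the function names, and the sample rows are assumptions made for the example; the thread does not say how WMF computes the hash:

```python
import hashlib


def client_fingerprint(user_agent, ip):
    """Hash one (user_agent, ip) pair into a single identifier.

    The NUL separator keeps different pairs from producing the same
    concatenated string. SHA-256 is an assumption for illustration.
    """
    return hashlib.sha256(f"{ip}\x00{user_agent}".encode("utf-8")).hexdigest()


def count_by_fingerprint(rows):
    """Count distinct hashed (user_agent, ip) pairs."""
    return len({client_fingerprint(user_agent, ip) for user_agent, ip in rows})


def count_by_ip(rows):
    """Count distinct ip values, ignoring the user agent."""
    return len({ip for _, ip in rows})


# Made-up sample data: one address sends three different generated agents.
sample_rows = [
    ("python-requests/2.28.1", "192.0.2.1"),
    ("python-requests/2.30.0", "192.0.2.1"),
    ("python-requests/2.31.0", "192.0.2.1"),
    ("MyWikidataTool/1.0 (contact@example.org)", "192.0.2.2"),
]

print(count_by_fingerprint(sample_rows), count_by_ip(sample_rows))
```

With data like this, a client that varies its user agent from a single address inflates the hashed count but not the ip count, which is the kind of difference the discussion is weighing.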

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. I'm still a bit confused. The above comment starts with the following: >> A selection of user_agent/ip pairs where each value can only occur once within the pairs > I don't think so: It's only the IP value that should only occur once. So a

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-11 Thread Manuel
Manuel added a comment. > A selection of user_agent/ip pairs where each value can only occur once within the pairs I don't think so: It's only the IP value that should only occur once. So a more robust metric than the original would likely not rely on user agents at all (e.g. unique

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. @Manuel, something else to explore would be to see if we could figure out a metric that links `user_agent` and `ip`. I'm a bit confused why we'd go through this and then still we have tons of unique individuals within Python's requests count. A breakdown:

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. Checks that we could do to fix issues with users creating multiple `user_agent` would be: [ ] Switch this metric to counting unique `ip` values [ ] Count unique `user_agent` values, but only if they have a unique `ip` - If they have a non-unique

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread Manuel
Manuel updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread Manuel
Manuel triaged this task as "High" priority.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-07 Thread Manuel
Manuel renamed this task from " [Analytics] Airflow implementation of unique user-agents accessing Wikidata's REST API " to "[Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023".

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-07 Thread Manuel
Manuel added a subtask: T341330: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023.

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023

2023-07-06 Thread Manuel
Manuel renamed this task from " [Analytics] Unique user-agents accessing Wikidata's REST API" to " [Analytics] Unique user-agents accessing Wikidata's REST API for Q2/2023".