[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2024-01-29 Thread Manuel
Manuel edited projects, added Wikidata Analytics; removed Wikidata Analytics 
(Kanban).

TASK DETAIL
  https://phabricator.wikimedia.org/T336361

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Manuel
Cc: JAllemandou, AndrewTavis_WMDE, Michael, Manuel, Aklapper, 
Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2024-01-29 Thread Manuel
Manuel claimed this task.



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-11-08 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-10-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Your suggestion makes more sense now, @Michael :) If we're looking for 
pageviews by users, then we can assume that we don't want `api.php` and 
`load.php` in `uri_path`, as views should be requests for an actual page. 
When considering traffic, and specifically automated traffic for edits, these 
would instead be included and we need to derive a different filtering method. 
Appreciate the further explanation!



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-10-06 Thread Michael
Michael added a comment.


  > in following @Michael's suggestion for looking into all paths that end in 
api.php and load.php we're getting the following:
  > [...]
  > Ultimately neither number went in the right direction as far as `-` as a 
referrer, so that really looks like it's not a suitable method of deriving bot 
traffic.
  
  I'm confused. That suggestion from me wasn't intended as a method to filter 
out bot traffic, but rather I was wondering if you were looking for 
//pageviews//, because then it would make sense to exclude requests that are 
not for a page. If looking at //traffic// is what this is about, then requests 
to `api.php` and `load.php` should maybe be included. Whereas //edits// 
probably need a much narrower filter that is different yet again.  Though I do 
not actually know your objective or have much context at all here.
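A minimal sketch of the distinction drawn here, assuming the only signal used is whether `uri_path` ends in `api.php` or `load.php`. The function names and the `objective` labels are hypothetical and only illustrate that pageviews and traffic call for different filters. No filter for edits is sketched, since the thread does not define one.

```python
# Endpoints named in the thread; treating them as "not a page" is an assumption.
NON_PAGE_SUFFIXES = ("api.php", "load.php")


def is_page_request(uri_path: str) -> bool:
    """Return True when the path does not end in api.php or load.php."""
    return not uri_path.endswith(NON_PAGE_SUFFIXES)


def select_requests(uri_paths, objective):
    """Filter paths for a hypothetical objective: 'pageviews' or 'traffic'."""
    if objective == "pageviews":
        # Pageviews: keep only requests that are for a page.
        return [p for p in uri_paths if is_page_request(p)]
    if objective == "traffic":
        # Traffic: api.php and load.php requests stay in.
        return list(uri_paths)
    raise ValueError(f"no filter sketched for objective: {objective!r}")


paths = ["/wiki/Q42", "/w/api.php", "/w/load.php", "/wiki/Special:Random"]
print(select_requests(paths, "pageviews"))
print(select_requests(paths, "traffic"))
```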



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-10-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @Manuel, in following @Michael's suggestion for looking into all paths that 
end in `api.php` and `load.php` we're getting the following:
  
  URI Path ends in API or load
  ---
  
  **Population is**:
  
  All requests to `www.wikidata.org` and `m.wikidata.org` between 2023-08-01 
and 2023-08-31 (inclusive) with `agent_type = 'user'`, 
`uri_path LIKE '%api.php'` or `uri_path LIKE '%load.php'`, and 
`user_agent_map.os_family` not equal to `'Android'`, `'iOS'` or `'KaiOS'` 
(note we're not including mobile here).
  
  **Note**: top 20 only
  
  | referer | total_requests | percent_total |
  | --- | --- | --- |
  | - | 387513656 | 68.6412 |
  | https://fr.wikipedia.org/ | 13414099 | 2.3761 |
  | https://es.wikipedia.org/ | 12762207 | 2.2606 |
  | https://commons.wikimedia.org/ | 9964420 | 1.765 |
  | https://en.wikipedia.org/ | 8978283 | 1.5903 |
  | https://it.wikipedia.org/ | 5654932 | 1.0017 |
  | https://pl.wikipedia.org/ | 3752052 | 0.6646 |
  | https://cs.wikipedia.org/ | 2627787 | 0.4655 |
  | https://sv.wikipedia.org/ | 2301542 | 0.4077 |
  | https://www.wikidata.org/wiki/Wikidata:Main_Page | 2273591 | 0.4027 |
  | https://mix-n-match.toolforge.org/ | 2261654 | 0.4006 |
  | https://query.wikidata.org/ | 2199928 | 0.3897 |
  | https://id.wikipedia.org/ | 1907086 | 0.3378 |
  | https://vi.wikipedia.org/ | 1828874 | 0.324 |
  | https://he.wikipedia.org/ | 1419992 | 0.2515 |
  | https://ru.wikipedia.org/ | 1167455 | 0.2068 |
  | https://www.citypopulation.de/ | rMW105864959e1b | 0.1875 |
  | https://de.wikipedia.org/ | 973956 | 0.1725 |
  | https://www.wikidata.org/wiki/Special:ListDatatypes | 865829 | 0.1534 |
  | https://www.openstreetmap.org/ | 809150 | 0.1433 |
  
  
  
  URI Path doesn't end in API or load
  ---
  
  **Population is**:
  
  All requests to `www.wikidata.org` and `m.wikidata.org` between 2023-08-01 
and 2023-08-31 (inclusive) with `agent_type = 'user'`, 
`uri_path NOT LIKE '%api.php'` and `uri_path NOT LIKE '%load.php'`, and 
`user_agent_map.os_family` not equal to `'Android'`, `'iOS'` or `'KaiOS'` 
(note we're not including mobile here).
  
  **Note**: top 20 only
  
  | referer | total_requests | percent_total |
  | --- | --- | --- |
  | - | 155738614 | 66.9387 |
  | https://en.wikipedia.org/ | 2087114 | 0.8971 |
  | https://query.wikidata.org/ | 1851174 | 0.7957 |
  | https://de.wikipedia.org/ | 1438751 | 0.6184 |
  | https://www.google.com/ | 1044597 | 0.449 |
  | https://commons.wikimedia.org/ | 959052 | 0.4122 |
  | https://ru.wikipedia.org/ | 692874 | 0.2978 |
  | https://www.wikidata.org/wiki/Wikidata:Main_Page | 661535 | 0.2843 |
  | https://www.wikidata.org/wiki/Special:ListDatatypes | 586789 | 0.2522 |
  | https://www.wikidata.org/wiki/Help:Sources | 554779 | 0.2385 |
  | https://www.wikidata.org/wiki/Special:Random | 504680 | 0.2169 |
  | https://www.wikidata.org/ | 381398 | 0.1639 |
  | https://fr.wikipedia.org/ | 361425 | 0.1553 |
  | https://ca.wikipedia.org/ | 307653 | 0.1322 |
  | https://es.wikipedia.org/ | 262596 | 0.1129 |
  | https://he.wikipedia.org/   

[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-10-05 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-10-05 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Summarizing the above, @Manuel:
  
  - We're generally able to get a good understanding of mobile users by 
looking at `user_agent_map.os_family = 'Android'`, `'iOS'` or `'KaiOS'`
    - If we don't have `user_agent_map` as a column, we can create it with a UDF
    - Within this there might still be spider or automated traffic that we 
can't filter out (see below)
  - Treating everything that does not meet the mobile 
`user_agent_map.os_family` criteria as desktop users is not viable, as we're 
getting lots of spider and automated traffic
    - Spider traffic can specifically be derived through a UDF, but automated 
traffic cannot
  - With the above we can leverage the `wmf.pageview_actor` and 
`wmf_raw.mediawiki_private_cu_changes` tables for views and edits respectively
    - We're able to subset for mobile given the above method, but as of now 
can't make what I would deem to be a suitable estimate for desktop traffic
    - Estimates for mobile will further not be able to explicitly remove 
automated traffic, so they should be seen as overestimates
  - We should explore how to add automated subsets as an option for tables 
that we'll often be using
  
  As discussed, I'm moving this into review, and we can maybe make a follow-up 
ticket!
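
As a rough sketch of the mobile subset described in this summary, assuming `os_family` values have already been extracted from `user_agent_map`. The helper names and the sample values (`Windows`, `Linux`, `Other`) are illustrative only and not taken from the task.

```python
# OS families the thread uses to identify mobile users.
MOBILE_OS_FAMILIES = frozenset({"Android", "iOS", "KaiOS"})


def is_mobile(os_family: str) -> bool:
    """True when the OS family is one of the three mobile families."""
    return os_family in MOBILE_OS_FAMILIES


def split_by_mobile(os_families):
    """Return (mobile, remainder) lists.

    The remainder is not a desktop estimate: per the summary it still mixes
    desktop users with spider and automated traffic.
    """
    mobile = [f for f in os_families if is_mobile(f)]
    remainder = [f for f in os_families if not is_mobile(f)]
    return mobile, remainder


sample = ["Android", "Windows", "iOS", "Other", "KaiOS", "Linux"]
mobile, remainder = split_by_mobile(sample)
print(len(mobile), len(remainder))
```

Even the mobile list should be read as an overestimate, since automated traffic cannot be removed from it explicitly.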



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-10-05 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-10-05 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-09-25 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @Manuel, as mentioned on Mattermost, there currently doesn't seem to be a 
good way of deriving `agent_type` for tables that don't have it. We can get 
`spider` through a `UDF`, but `automated` isn't possible at the moment. This 
makes the final division between desktop and API users pretty difficult. One 
idea I had was checking `uri_path = '/w/api.php'`. Some breakdowns for that 
follow, with the queries being generally the same as those found directly 
above.
  
  URI Path is API
  ---
  
  **Population is**:
  
  All requests to `www.wikidata.org` and `m.wikidata.org` between 2023-07-01 
and 2023-07-31 (inclusive) with `agent_type = 'user'`, 
`uri_path = '/w/api.php'`, and `user_agent_map.os_family` not equal to 
`'Android'`, `'iOS'` or `'KaiOS'` (note we're not including mobile here).
  
  **Note**: top 20 only, and pardon some weird formatting caused by some 
values matching Phabricator entities
  
  | referer | total_requests | percent_total |
  | --- | --- | --- |
  | - | 376499170 | 73.161 |
  | https://fr.wikipedia.org/ | 13406816 | 2.6052 |
  | https://es.wikipedia.org/ | 11754926 | 2.2842 |
  | https://commons.wikimedia.org/ | 9527339 | 1.8513 |
  | https://en.wikipedia.org/ | 8976046 | 1.7442 |
  | https://it.wikipedia.org/ | 5590715 | 1.0864 |
  | https://pl.wikipedia.org/ | 3883528 | 0.7546 |
  | https://google.com | 2974300 | 0.578 |
  | https://cs.wikipedia.org/ | 2425767 | 0.4714 |
  | https://query.wikidata.org/ | 2235925 | 0.4345 |
  | https://sv.wikipedia.org/ | rOPUP221836422752 | 0.4311 |
  | https://vi.wikipedia.org/ | 1701025 | 0.3305 |
  | https://id.wikipedia.org/ | 1662960 | 0.3231 |
  | https://he.wikipedia.org/ | 1499706 | 0.2914 |
  | https://ru.wikipedia.org/ | 1288532 | 0.2504 |
  | https://www.citypopulation.de/ | 1069687 | 0.2079 |
  | https://mix-n-match.toolforge.org/ | 1027218 | 0.1996 |
  | https://de.wikipedia.org/ | 907057 | 0.1763 |
  | https://www.wikidata.org/wiki/Wikidata:Main_Page | 823336 | 0.16 |
  | https://www.openstreetmap.org/ | 758953 | 0.1475 |
  
  
  
  URI Path is NOT API
  ---
  
  **Population is**:
  
  All requests to `www.wikidata.org` and `m.wikidata.org` between 2023-07-01 
and 2023-07-31 (inclusive) with `agent_type = 'user'`, 
`uri_path != '/w/api.php'`, and `user_agent_map.os_family` not equal to 
`'Android'`, `'iOS'` or `'KaiOS'` (note we're not including mobile here).
  
  **Note**: top 20 only
  
  | referer | total_requests | percent_total |
  | --- | --- | --- |
  | - | 130860881 | 54.2892 |
  | https://en.wikipedia.org/ | 2468775 | 1.0242 |
  | https://www.wikidata.org/wiki/Wikidata:Main_Page | 2101343 | 0.8718 |
  | https://query.wikidata.org/ | 1678340 | 0.6963 |
  | https://de.wikipedia.org/ | 1328203 | 0.551 |
  | https://www.wikidata.org/wiki/Special:ListDatatypes | 371 | 0.4611 |
  | https://www.google.com/ | 1100472 | 0.4565 |
  | https://www.wikidata.org/wiki/Help:Sources | 956761 | 0.3969 |
  | https://commons.wikimedia.org/ | 913448 | 0.379 |
  | https://ru.wikipedia.org/ | 797254 | 0.3308 |
  | https://www.wikidata.org/ | 412737 | 0.1712 |
  | https://fr.wikipedia.org/ | 360681 | 0.1496 |
  | https://www.wikidata.org/wiki/Sp

[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-09-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Some general notes on this: as we're working from `wmf.pageview_actor` and 
`wmf_raw.mediawiki_private_cu_changes`, there might be a way to leverage the 
expanded `agent_type` field, since at least the former has `automated` as an 
option within `agent_type` :) So for views we can make a more distinct 
division into mobile, desktop and API users by including `agent_type`. For 
edits it's a bit more difficult, but maybe there's a way to add `agent_type` 
via a UDF on another field.
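
One possible reading of this three-way division for views, sketched under stated assumptions: treating `automated` as the "API" segment and dropping `spider` rows are interpretations made here for illustration, not rules confirmed in the thread. The `segment` function and its labels are hypothetical.

```python
from collections import Counter
from typing import Optional

MOBILE_OS_FAMILIES = frozenset({"Android", "iOS", "KaiOS"})


def segment(agent_type: str, os_family: str) -> Optional[str]:
    """Map one view to 'mobile', 'desktop' or 'api'; None means excluded."""
    if agent_type == "spider":
        return None  # assumption: self-identified bots are left out entirely
    if agent_type == "automated":
        return "api"  # assumption: automated traffic stands in for API users
    return "mobile" if os_family in MOBILE_OS_FAMILIES else "desktop"


# Illustrative (agent_type, os_family) pairs, not real data.
views = [
    ("user", "Android"),
    ("user", "Windows"),
    ("automated", "Other"),
    ("spider", "Other"),
    ("user", "iOS"),
]
counts = Counter(s for s in (segment(a, o) for a, o in views) if s is not None)
print(dict(counts))
```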



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-09-20 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Here's the above `referer` breakdown for mobile for reference. The big 
differences are that we have dramatically fewer `-` requests, which supports 
the idea that `-` requests come from APIs, and that we have a lot of 
extension requests:
  
  | referer | total_requests | percent_total |
  | --- | --- | --- |
  | - | 51458890 | 28.4922 |
  | https://m.wikidata.org/w/load.php?lang=en&modules=ext.wikidata-org.badges... | 21326320 | 11.8081 |
  | https://fr.m.wikipedia.org/ | 8464411 | 4.6866 |
  | https://commons.m.wikimedia.org/ | 4501551 | 2.4925 |
  | https://www.google.com/ | 3691719 | 2.0441 |
  | https://m.wikidata.org/w/load.php?lang=en&modules=ext.kartographer.styles... | 2731198 | 1.5122 |
  | https://pl.m.wikipedia.org/ | 1897012 | 1.0504 |
  | https://cs.m.wikipedia.org/ | 1832498 | 1.0146 |
  | https://m.wikidata.org/w/load.php?lang=es&modules=ext.wikidata-org.badges... | 1533461 | 0.8491 |
  | https://m.wikidata.org/wiki/Wikidata:Main_Page | 1375528 | 0.7616 |
  | https://sv.m.wikipedia.org/ | 1296833 | 0.718 |
  | https://he.m.wikipedia.org/ | 1282667 | 0.7102 |
  | https://en.wikipedia.org/ | 797882 | 0.4418 |
  | https://es.m.wikipedia.org/ | 689222 | 0.3816 |
  | https://ar.m.wikipedia.org/ | 551705 | 0.3055 |
  | https://el.m.wikipedia.org/ | 541799 | 0.3 |
  | https://m.wikidata.org/w/load.php?lang=en&modules=ext.discussionTools... | 539356 | 0.2986 |
  | https://m.wikidata.org/w/load.php?lang=fr&modules=ext.wikidata-org.badges... | 523101 | 0.2896 |
  | https://christunveiled.org/ | 517855 | 0.2867 |
  | https://www.citypopulation.de/ | 493944 | 0.2735 |
  
  I'll look more into what these `ext` requests are if needed, @Manuel :) Info 
on this:
  
  **Population is**:
  
  All requests to `www.wikidata.org` and `m.wikidata.org` between 2023-07-01 
and 2023-07-31 (inclusive) with `agent_type = 'user'` and 
`user_agent_map.os_family` equal to `'Android'`, `'iOS'` or `'KaiOS'` (note 
we're only including mobile here).
  
  Queries:
  
    SELECT
        referer AS referer

    FROM
        wmf.webrequest

    WHERE
        year = 2023
        AND month = 7
        AND '2023-07-01' <= dt
        AND dt < '2023-08-01'
        AND uri_host IN ('www.wikidata.org', 'm.wikidata.org')
        AND agent_type = 'user'
        AND (
            user_agent_map.os_family = 'Android'
            OR user_agent_map.os_family = 'iOS'
            OR user_agent_map.os_family = 'KaiOS'
        )
  
  Aggregation from the above:
  
    SELECT
        referer AS referer,
        COUNT(*) AS total_requests,
        ROUND(COUNT(*) / CAST(SUM(COUNT(*)) OVER () AS float) * 100, 4) AS percent_total

    FROM
        df_webrequest_mobile_referer

    GROUP BY
        referer

    ORDER BY
        total_requests DESC
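
For readers less familiar with the `SUM(COUNT(*)) OVER ()` pattern, here is a small pure-Python sketch of the same aggregation: count requests per `referer`, then express each count as a percentage of all requests. The function name and the sample data are illustrative only.

```python
from collections import Counter


def referer_breakdown(referers):
    """Return (referer, total_requests, percent_total) rows, largest first."""
    counts = Counter(referers)
    total = sum(counts.values())  # plays the role of SUM(COUNT(*)) OVER ()
    return [
        (referer, n, round(n / total * 100, 4))
        for referer, n in counts.most_common()
    ]


# Illustrative referers, not real data.
sample = ["-", "-", "-", "https://www.google.com/"]
for row in referer_breakdown(sample):
    print(row)
```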



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-09-20 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @Manuel, re the question of what kind of `referer` values we have for 
"desktop" requests, the following queries were used to get the results below.
  
  **Population for the following is**:
  
  All requests to `www.wikidata.org` and `m.wikidata.org` between 2023-07-01 
and 2023-07-31 (inclusive) with `agent_type = 'user'` and 
`user_agent_map.os_family` not equal to `'Android'`, `'iOS'` or `'KaiOS'` 
(note we're excluding mobile as we discussed).
  
  Base query:
  
    SELECT
        referer AS referer

    FROM
        wmf.webrequest

    WHERE
        -- Partition and date filters matching the stated July 2023 population
        year = 2023
        AND month = 7
        AND '2023-07-01' <= dt
        AND dt < '2023-08-01'
        AND uri_host IN ('www.wikidata.org', 'm.wikidata.org')
        AND agent_type = 'user'
        AND user_agent_map.os_family != 'Android'
        AND user_agent_map.os_family != 'iOS'
        AND user_agent_map.os_family != 'KaiOS'
  
  Aggregation query from the above:
  
    SELECT
        referer AS referer,
        COUNT(*) AS total_requests,
        ROUND(COUNT(*) / CAST(SUM(COUNT(*)) OVER () AS float) * 100, 4) AS percent_total

    FROM
        df_webrequest_desktop_referer

    GROUP BY
        referer

    ORDER BY
        total_requests DESC
  
  Leads to:
  
  | referer | total_requests | percent_total |
  | --- | --- | --- |
  | - | 507360051 | 67.1412 |
  | https://fr.wikipedia.org/ | 13767497 | 1.8219 |
  | https://es.wikipedia.org/ | 12022598 | 1.591 |
  | https://en.wikipedia.org/ | 11444821 | 1.5145 |
  | https://commons.wikimedia.org/ | 10440787 | 1.3817 |
  | https://it.wikipedia.org/ | 5870598 | 0.7769 |
  | https://pl.wikipedia.org/ | 4156941 | 0.5501 |
  | https://query.wikidata.org/ | 3914265 | 0.518 |
  | https://google.com | 2974942 | 0.3937 |
  | https://www.wikidata.org/wiki/Wikidata:Main_Page | 2924679 | 0.387 |
  | https://cs.wikipedia.org/ | 2644586 | 0.35 |
  | https://sv.wikipedia.org/ | 2273419 | 0.3009 |
  | https://de.wikipedia.org/ | 2235260 | 0.2958 |
  | https://ru.wikipedia.org/ | 2085786 | 0.276 |
  | https://vi.wikipedia.org/ | 1798759 | 0.238 |
  | https://he.wikipedia.org/ | 1740702 | 0.2304 |
  | https://id.wikipedia.org/ | 1715304 | 0.227 |
  | https://www.wikidata.org/wiki/Special:ListDatatypes | 1327742 | 0.1757 |
  | https://www.wikidata.org/wiki/Help:Sources | 1125794 | 0.149 |
  | https://www.google.com/ | 1115638 | 0.1476 |
  
  If we assume that API requests have blank referrers, then this would 
indicate that an overwhelming majority of the requests are from APIs, which 
doesn't sound wrong to me per se. What does jump out is that over a month we 
have more "desktop" requests from the French and Spanish Wikipedias than from 
the English one 🤔
  
  The same process for `referer_class` leads to the following (so we have an 
overview of this):
  
  | referer_class | total_requests | percent_total |
  | --- | --- | --- |
  | none | 507360051 | 67.1412 |
  | internal | 232955981 | 30.8281 |
  | external | 10332151 | 1.3673 |
  | external (search engine) | 4641025 | 0.6142 |
  | unknown | 328505 | 0.0435 |
  | external (media sites) | 43230 | 0.0057 |



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-09-20 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-09-20 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-09-07 Thread JAllemandou
JAllemandou added a comment.


  > However, my assumption is that when only filtering for agent_type != 
'spider' the population will still include a lot of non-UI hits.
  
  The `agent_type` field can currently take 3 values: `spider`, `automated` and 
`user`. `spider` is used when user agents identify themselves as bots, 
`automated` is used when we heuristically classify the traffic as 
automatically generated (big volume), and the rest falls under `user`. There 
is indeed still some non-user traffic being flagged as `user`.
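
To make the order of these three labels concrete, here is a toy sketch. It is not the actual classification logic: the substring check for "bot", the `requests_per_hour` input and the `volume_threshold` default are all invented for illustration, and the real heuristics are not described in this thread.

```python
def label_agent_type(user_agent: str, requests_per_hour: int,
                     volume_threshold: int = 10_000) -> str:
    """Toy labelling in the order described: spider, then automated, then user."""
    # Hypothetical stand-in for "the user agent identifies itself as a bot".
    if "bot" in user_agent.lower():
        return "spider"
    # Hypothetical stand-in for the volume-based heuristic.
    if requests_per_hour > volume_threshold:
        return "automated"
    # Everything else, including non-human traffic that neither rule catches.
    return "user"


print(label_agent_type("ExampleBot/1.0", 5))
print(label_agent_type("Mozilla/5.0", 50_000))
print(label_agent_type("Mozilla/5.0", 12))
```

The last branch is why `user` still contains some non-user traffic: anything that neither declares itself nor exceeds the heuristic ends up there.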



[Wikidata-bugs] [Maniphest] T336361: [Analytics] Identify access from mobile vs. desktop devices

2023-09-01 Thread Manuel
Manuel renamed this task from "[Analytics] Identify access via mobile devices " 
to "[Analytics] Identify access from mobile vs. desktop devices".
