AndrewTavis_WMDE added a comment.

  @Manuel, something else to explore would be whether we can come up with a 
metric that links `user_agent` and `ip`. I'm a bit confused about why, after 
going through this, we'd still have tons of unique individuals within the 
Python requests count. A breakdown:
  
  - User has unique `user_agent` and `ip`
    - All's fine - select the `user_agent`/`ip` pair
  - User has unique `user_agent` and multiple `ip` values
    - We assume that the `ip` value was changed by their provider and 
select one `user_agent`/`ip` pair
  - User has a unique `ip` but is using multiple `user_agent` values
    - We select one of the `user_agent`/`ip` pairs, assuming that it's unique 
and that the others are generated repeats
  
  This basically boils down to:
  
  - A selection of `user_agent`/`ip` pairs in which each `user_agent` and each 
`ip` occurs at most once
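
  As a rough illustration of that selection, here is a minimal Python sketch. 
It assumes the input is simply a list of (`user_agent`, `ip`) tuples; the 
function name `select_unique_pairs` and the sample values are hypothetical and 
not taken from any existing pipeline. It uses a greedy pass, so the result 
depends on input order and is not guaranteed to be the largest possible 
selection.

```python
def select_unique_pairs(pairs):
    """Greedily keep (user_agent, ip) pairs so that no user_agent
    and no ip appears in more than one kept pair."""
    seen_agents = set()
    seen_ips = set()
    selected = []
    for user_agent, ip in pairs:
        # Skip the pair if either value is already used by a kept pair.
        if user_agent in seen_agents or ip in seen_ips:
            continue
        seen_agents.add(user_agent)
        seen_ips.add(ip)
        selected.append((user_agent, ip))
    return selected


# Hypothetical sample data covering the three cases in the breakdown,
# plus one user who changes both values between requests.
sample_pairs = [
    ("ua-a", "192.0.2.1"),  # unique user_agent and unique ip
    ("ua-b", "192.0.2.2"),  # one user_agent ...
    ("ua-b", "192.0.2.3"),  # ... seen from a second ip
    ("ua-c", "192.0.2.4"),  # one ip ...
    ("ua-d", "192.0.2.4"),  # ... seen with a second user_agent
    ("ua-e", "192.0.2.5"),  # same person as the next row, but both
    ("ua-f", "192.0.2.6"),  # values changed, so they count twice
]

estimated_users = len(select_unique_pairs(sample_pairs))
```

  With this sample the estimate is 5 for 4 actual users: the one extra comes 
from the user whose `user_agent` and `ip` both changed.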
  
  With the above we'd count a few extra users, namely those who are both 
generating `user_agent` values and getting their `ip` switched, but this might 
be better than dramatically undershooting our total user base. Yes, it changes 
the metric, but the goal is an estimate of the usage that the Python requests 
`user_agent` is masking.

TASK DETAIL
  https://phabricator.wikimedia.org/T334558

To: AndrewTavis_WMDE
Cc: mforns, xcollazo, Ottomata, lbowmaker, WMDE-leszek, AndrewTavis_WMDE, 
Michael, Manuel, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331