It turns out this is a little more complicated than it appeared at first:
usercontribs and list=users have different concepts of "invalid".  If you ask
for usercontribs on "1.2.3.4", it's valid.  If you pass in "1.2.3.0/24", you
get baduser.  But for "1.2.3.4", list=users returns:

{
    "batchcomplete": "",
    "query": {
        "users": [
            {
                "name": "1.2.3.4",
                "invalid": ""
            }
        ]
    }
}

which I guess makes sense in that context, since it can't map an IP address to
a userid.  I can work around this, but I'm mentioning it for the sake of some
poor developer searching the archives N years from now trying to figure it out :-)
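
For that future reader, here is a minimal Python sketch of one possible workaround.  It is not the code from my tool.  The helper names (batches, build_params, is_plain_ip, classify) are made up for illustration, and it assumes the response shapes shown in this thread.  It splits the names into batches of 50, builds the list=users parameters for each batch, and sorts each returned entry into ok / missing / ip / invalid.  An entry flagged "invalid" whose name parses as a single IP address goes into "ip", since usercontribs accepts those.  The HTTP request itself is left out; send the parameters with whatever client you like.

```python
import ipaddress
from typing import Dict, Iterable, Iterator, List

BATCH_SIZE = 50  # 500 if the account has the apihighlimits right


def batches(names: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` names."""
    names = list(names)
    for i in range(0, len(names), size):
        yield names[i:i + size]


def build_params(batch: List[str]) -> Dict[str, str]:
    """Query parameters for one list=users request (names joined with '|')."""
    return {
        "action": "query",
        "list": "users",
        "format": "json",
        "formatversion": "2",
        "ususers": "|".join(batch),
    }


def is_plain_ip(name: str) -> bool:
    """True for a single IPv4/IPv6 address, False for ranges and other text."""
    try:
        ipaddress.ip_address(name)
    except ValueError:
        return False
    return True


def classify(response: dict) -> Dict[str, List[str]]:
    """Sort the entries of a decoded list=users response into buckets.

    Only the presence of the "invalid" / "missing" keys is checked, so both
    the "" style and the true style of flag value are handled.
    """
    result: Dict[str, List[str]] = {"ok": [], "missing": [], "ip": [], "invalid": []}
    for user in response.get("query", {}).get("users", []):
        name = user.get("name", "")
        if "invalid" in user:
            # Assumption: plain IPs are fine for usercontribs, so keep them apart.
            result["ip" if is_plain_ip(name) else "invalid"].append(name)
        elif "missing" in user:
            result["missing"].append(name)
        else:
            result["ok"].append(name)
    return result
```

The names that come back may be normalized (in Bryan's example "bd808" comes back as "Bd808"), so match results to inputs with that in mind.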


> On Aug 19, 2021, at 6:21 PM, Bryan Davis <bd...@wikimedia.org> wrote:
> 
> On Thu, Aug 19, 2021 at 4:04 PM Roy Smith <r...@panix.com> wrote:
>> 
>> I've got a tool which parses sockpuppet investigation (SPI) pages and does 
>> some analysis.  One of the steps is I need to validate that all of the 
>> usernames found in the SPI report are valid.  I do that by sequentially 
>> calling usercontribs on each name with uclimit=1 and seeing if I get a 
>> baduser error.
>> 
>> This works, but it's slow because I need to make 1 API call for each user.  
>> For a big SPI case, the time to do this swamps everything else.  Is there a 
>> more efficient way to do this?  Some API call where I can give it a bunch of 
>> usernames in a batch and have it tell me which ones are invalid?  
>> Alternatively, is there a regex I could apply on the client side to test if 
>> a username is valid?
>> 
>> The most common type of invalid name I see is when somebody puts down an 
>> IP range (e.g. 1.2.4.0/24) as a username.  Testing for that client-side would 
>> be trivial, but it might miss some others.
> 
> You can do lookups in batches of 50 (500 if you have the
> "apihighlimits" right which is commonly granted by the "Bots" group on
> movement wikis) with
> <https://en.wikipedia.org/w/api.php?action=help&modules=query%2Busers>.
> 
> Here's a quick example:
> <https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=users&format=json&utf8=1&formatversion=2&ususers=Bryan%20Davis%7CBryanDavis%7CBDavis%20(WMF)%7Cbd808>
> 
> The results will look something like:
> ```
> {
>    "batchcomplete": true,
>    "query": {
>        "users": [
>            {
>                "name": "Bryan Davis",
>                "missing": true
>            },
>            {
>                "userid": 2619078,
>                "name": "BryanDavis"
>            },
>            {
>                "userid": 19474624,
>                "name": "BDavis (WMF)"
>            },
>            {
>                "userid": 24257381,
>                "name": "Bd808"
>            }
>        ]
>    }
> }
> ```
> 
> Bryan
> -- 
> Bryan Davis              Technical Engagement      Wikimedia Foundation
> Principal Software Engineer                               Boise, ID USA
> [[m:User:BDavis_(WMF)]]                                      irc: bd808
> _______________________________________________
> Cloud mailing list -- cloud@lists.wikimedia.org
> List information: 
> https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
> 
