It turns out this is a little more complicated than it appeared at first: usercontribs and list=users have different concepts of "invalid".  If you ask for usercontribs on "1.2.3.4", it's valid.  If you pass in "1.2.3.0/24", you get baduser.  But list=users, given "1.2.3.4", returns:

{
    "batchcomplete": "",
    "query": {
        "users": [
            {
                "name": "1.2.3.4",
                "invalid": ""
            }
        ]
    }
}

which I guess makes sense in that context since it can't map it to a userid.  I can work around this, but mentioning it for the sake of some poor developer searching the archives N years from now trying to figure it out :-)

> On Aug 19, 2021, at 6:21 PM, Bryan Davis <bd...@wikimedia.org> wrote:
>
> On Thu, Aug 19, 2021 at 4:04 PM Roy Smith <r...@panix.com> wrote:
>>
>> I've got a tool which parses sockpuppet investigation (SPI) pages and does
>> some analysis.  One of the steps is I need to validate that all of the
>> usernames found in the SPI report are valid.  I do that by sequentially
>> calling usercontribs on each name with uclimit=1 and seeing if I get a
>> baduser error.
>>
>> This works, but it's slow because I need to make 1 API call for each user.
>> For a big SPI case, the time to do this swamps everything else.  Is there a
>> more efficient way to do this?  Some API call where I can give it a bunch of
>> usernames in a batch and have it tell me which ones are invalid?
>> Alternatively, is there a regex I could apply on the client side to test if
>> a username is valid?
>>
>> The most common type of invalid name I see is when somebody puts down an
>> iprange (i.e. 1.2.4.0/24) as a username.  Testing for that client-side would
>> be trivial, but it might miss some others.
>
> You can do lookups in batches of 50 (500 if you have the
> "apihighlimits" right which is commonly granted by the "Bots" group on
> movement wikis) with
> <https://en.wikipedia.org/w/api.php?action=help&modules=query%2Busers>.
>
> Here's a quick example:
> <https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=users&format=json&utf8=1&formatversion=2&ususers=Bryan%20Davis%7CBryanDavis%7CBDavis%20(WMF)%7Cbd808>
>
> The results will look something like:
> ```
> {
>     "batchcomplete": true,
>     "query": {
>         "users": [
>             {
>                 "name": "Bryan Davis",
>                 "missing": true
>             },
>             {
>                 "userid": 2619078,
>                 "name": "BryanDavis"
>             },
>             {
>                 "userid": 19474624,
>                 "name": "BDavis (WMF)"
>             },
>             {
>                 "userid": 24257381,
>                 "name": "Bd808"
>             }
>         ]
>     }
> }
> ```
>
> Bryan
> --
> Bryan Davis              Technical Engagement      Wikimedia Foundation
> Principal Software Engineer                               Boise, ID USA
> [[m:User:BDavis_(WMF)]]                                      irc: bd808
> _______________________________________________
> Cloud mailing list -- cloud@lists.wikimedia.org
> List information:
> https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
>
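
For anyone who does land here from the archives, here is a rough Python sketch of the batched check discussed in this thread. It only builds the request parameters and interprets a decoded list=users response; actually sending the request is left to whatever HTTP client you use. The function names (chunked, users_query_params, is_single_ip, classify_users) and the "ok"/"ip"/"missing"/"invalid" labels are made up for illustration, and treating a bare IP that list=users flags as "invalid" as still usable for usercontribs is just one possible workaround, not something the API itself prescribes. It tests for the presence of the "missing" and "invalid" keys rather than their values, since the two responses shown in this thread carry "" in one case and true in the other.

```python
import ipaddress
from itertools import islice


def chunked(names, size=50):
    """Yield successive lists of at most `size` names (50 is the limit quoted above)."""
    it = iter(names)
    while batch := list(islice(it, size)):
        yield batch


def users_query_params(batch):
    """Build the query parameters for one action=query&list=users request."""
    return {
        "action": "query",
        "list": "users",
        "format": "json",
        "formatversion": "2",
        "ususers": "|".join(batch),
    }


def is_single_ip(name):
    """True for a bare IPv4/IPv6 address; False for ranges and ordinary names."""
    try:
        ipaddress.ip_address(name)
    except ValueError:
        return False
    return True


def classify_users(response):
    """Map each name in a decoded list=users response to a label.

    Assumed labels: 'ok', 'ip', 'missing', 'invalid'. Only key presence is
    checked, so both response styles shown in this thread are handled.
    """
    result = {}
    for user in response["query"]["users"]:
        name = user["name"]
        if "invalid" in user:
            # Workaround (assumption): a single IP is still a valid
            # usercontribs target even though list=users calls it invalid.
            result[name] = "ip" if is_single_ip(name) else "invalid"
        elif "missing" in user:
            result[name] = "missing"
        else:
            result[name] = "ok"
    return result
```

You would loop over chunked(all_names), send users_query_params(batch) to the wiki's api.php for each batch, and merge the classify_users() results. If your account has apihighlimits, passing size=500 to chunked should cut the number of round trips further.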