Claudia asked:
> Q: being in none of the special Wikipedia roles, which of these ideas would I
> be able to try out by myself?
All the metadata is available through the Wikipedia API, and the Pywikipediabot framework makes a lot of it easily accessible, so if you know how to program in Python, it's doable :)

Cheers,
Morten

On 8 May 2012 10:40, <koltzenb...@w4w.net> wrote:
> hi Bináris, Merlijn, Alchimista, and Morten,
>
> thank you very much
> does anyone of you remember hearing a very new type of song, and being
> fascinated for sure but not quite
> trusting your ears?
>
> btw, on his talk page yesterday, JAn came up with an idea that sounds like
> "new song" to me, too:
> http://cs.wikipedia.org/w/index.php?title=Diskuse_s_wikipedistou:JAn_Dudík&diff=8497947&oldid=8497773
>
> Morten said
>> Hope some of this helps, let me know if there's any questions.
>
> I guess there are, Morten, thanks :-)
>
> Q: being in none of the special Wikipedia roles, which of these ideas would I
> be able to try out by myself?
>
> btw, thanks for asking @Morten,
>
> cheers,
> Claudia
>
> On Tue, 8 May 2012 10:01:23 -0500, Morten Wang wrote
>> I did some data gathering last fall that is more or less the same as
>> Claudia is asking about. Looking up the bot flag, or checking the
>> username is often regarded as a reasonable way of filtering out the
>> bots. I chose to apply both, if there's no bot flag we look for a
>> typical bot signature in the username (regex: "bot$| ", username
>> either ends with bot or a part of it does), and used a
>> case-insensitive match since some users have usernames like "FoObOt".
>>
>> Checking the edit history to find when interwiki links were first
>> added can be time-consuming if the page had lots of activity. I
>> therefore chose to use a binary search, halving the distance between
>> two test points until either the actual edit is found, or we're down
>> to so few edits that all can be efficiently grabbed through the API
>> (e.g. using Pywikibot's PreloadingGenerator). Otherwise you might be
>> examining thousands of edits for no reason.
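
An inline note on the binary search described just above: here is a minimal sketch of the idea, not the code that was actually used for the data gathering. It assumes revisions are numbered from 0 (oldest) to n-1 (newest), that `get_text(i)` is a hypothetical callback returning the wikitext of revision i (in practice it would wrap an API call), and that interwiki links, once added, were never removed again. The regex for spotting an interlanguage link is a rough heuristic of my own, and `batch_size` stands in for "so few edits that all can be grabbed at once".

```python
import re

# Rough heuristic (an assumption): an interlanguage link looks like [[de:Beispiel]].
INTERWIKI_RE = re.compile(r"\[\[[a-z]{2,3}(?:-[a-z-]+)?:[^\]\|]+\]\]")


def has_interwiki(text):
    """Return True if the wikitext appears to contain an interlanguage link."""
    return INTERWIKI_RE.search(text) is not None


def first_revision_with_links(n_revisions, get_text, batch_size=50):
    """Return the index of the oldest revision containing interwiki links.

    Returns None if even the newest revision has none. get_text(i) is a
    hypothetical callback that fetches the text of revision i.
    """
    if n_revisions == 0 or not has_interwiki(get_text(n_revisions - 1)):
        return None

    # Invariant: revision `hi` has links, and the answer lies in [lo, hi].
    lo, hi = 0, n_revisions - 1
    while hi - lo > batch_size:
        mid = (lo + hi) // 2
        if has_interwiki(get_text(mid)):
            hi = mid        # links already present: look at older revisions
        else:
            lo = mid + 1    # no links yet: look at newer revisions

    # Few enough revisions left: check them all, oldest first.
    for i in range(lo, hi + 1):
        if has_interwiki(get_text(i)):
            return i


# Usage with made-up data: links first appear at revision 700 of 1000.
texts = ["plain text"] * 700 + ["plain text [[de:Beispiel]]"] * 300
fetched = []


def get_text(i):
    fetched.append(i)
    return texts[i]


first = first_revision_with_links(len(texts), get_text)
```

If links were removed and re-added at some point, the monotonicity assumption breaks and the result is only one of the transitions, so the hit should be checked by hand for pages with a messy history.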
>>
>> Having Toolserver access simplifies the process a lot since all the
>> metadata is more easily accessible, but the revision text will still
>> have to be grabbed from the API.
>>
>> Hope some of this helps, let me know if there's any questions.
>>
>> Cheers,
>> Morten
>>
>> On 8 May 2012 08:39, Bináris <wikipo...@gmail.com> wrote:
>> > 2012/5/8 Merlijn van Deen <valhall...@arctus.nl>
>> >>
>> >>
>> >> This is not completely true - the bot flag is also a property of the
>> >> user account. You can query e.g.
>> >>
>> >> http://nl.wikipedia.org/w/index.php?title=Speciaal:Gebruikerslijst&offset=&limit=500&group=bot&uselang=en
>> >>
>> >
>> > Yes, that's true. And if you want to be quite accurate, you must also
>> > determine the date of acquiring the bot flag from bureau logs and compare
>> > it to the page history. :-)
>> >
>> > --
>> > Bináris
>> >
>> > _______________________________________________
>> > Pywikipedia-l mailing list
>> > Pywikipedia-l@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>> >
>
>
> thanks & cheers,
> Claudia
> koltzenb...@w4w.net
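
PS, on the bot filtering discussed in the thread: a minimal sketch of combining the bot flag with a username check, including the point about when the flag was granted. The pattern below is my own stand-in (the regex quoted above did not survive intact in the archive), and the function names and parameters are hypothetical. It assumes the user's groups have already been looked up, for example through the MediaWiki API's `list=users` query with `usprop=groups`, and that the date the flag was granted, if known, comes from the user rights log.

```python
import re
from datetime import datetime

# Stand-in pattern (an assumption): "bot" not followed by another letter,
# matched case-insensitively so that names like "FoObOt" are caught.
BOT_NAME_RE = re.compile(r"bot(?![a-z])", re.IGNORECASE)


def is_probable_bot_name(username):
    """Heuristic: the username, or a part of it, ends with 'bot'."""
    return BOT_NAME_RE.search(username) is not None


def is_bot_edit(username, groups, edit_time=None, flagged_since=None):
    """Guess whether an edit was made by a bot.

    groups:        set of group names the account belongs to today
    edit_time:     datetime of the edit, if known
    flagged_since: datetime the bot flag was granted, if known
    """
    if "bot" in groups:
        # Without dates we can only trust the current flag.
        if edit_time is None or flagged_since is None:
            return True
        if edit_time >= flagged_since:
            return True
    # No flag (or not flagged yet at the time): fall back to the username.
    return is_probable_bot_name(username)


# Usage with made-up accounts:
print(is_bot_edit("SieBot", set()))
print(is_bot_edit("Claudia", {"sysop"}))
```

The username check will misfire on human names such as "Talbot", so it is best treated as a first pass to be spot-checked rather than as a definitive classification.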