Claudia asked:
> Q: being in none of the special Wikipedia roles, which of these ideas would I 
> be able to try out by myself?

All the metadata is available through the Wikipedia API, and the
Pywikipediabot framework makes a lot of it easily accessible, so if
you know how to program in Python, it's doable :)
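
For illustration, here is a rough, self-contained sketch of the two steps described in the quoted message below: filtering bots by flag or username, and binary-searching a page history for the edit that first added interwiki links. It is not the code that was actually used. The names looks_like_bot, first_revision_with_links and has_links are hypothetical, and the pattern "bot\b" is only an approximation of "the username, or a part of it, ends with bot". In practice has_links(i) would fetch the text of revision i through the API. The search assumes that once the links appear they stay in every later revision.

```python
import re

# Assumed pattern: the username, or a word inside it, ends in "bot".
BOT_NAME = re.compile(r"bot\b", re.IGNORECASE)


def looks_like_bot(username, has_bot_flag=False):
    """Trust the bot flag first; otherwise fall back to the username."""
    return has_bot_flag or BOT_NAME.search(username) is not None


def first_revision_with_links(n_revisions, has_links, small=10):
    """Index (oldest = 0) of the first revision where has_links(i) is true.

    Assumes has_links is False before the links were added and True
    from then on. Returns None if the newest revision has no links.
    """
    small = max(small, 1)  # guard against a non-terminating loop
    if n_revisions == 0 or not has_links(n_revisions - 1):
        return None
    lo, hi = 0, n_revisions - 1  # invariant: has_links(hi) is True
    while hi - lo >= small:
        mid = (lo + hi) // 2
        if has_links(mid):
            hi = mid          # links already present: look earlier
        else:
            lo = mid + 1      # links not yet present: look later
    # Few revisions left: check them in order (one batched fetch in practice).
    for i in range(lo, hi + 1):
        if has_links(i):
            return i
    return None
```

With 5000 revisions this inspects roughly twenty of them rather than all 5000. If links were removed and later re-added, the monotonicity assumption breaks and the result may not be the very first addition.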


Cheers,
Morten

On 8 May 2012 10:40,  <koltzenb...@w4w.net> wrote:
> hi Bináris, Merlijn, Alchimista, and Morten,
>
> thank you very much
> does anyone of you remember hearing a very new type of song, and being
> fascinated for sure but not quite trusting your ears?
>
> btw, on his talk page yesterday, JAn came up with an idea that sounds like 
> "new song" to me, too:
> http://cs.wikipedia.org/w/index.php?title=Diskuse_s_wikipedistou:JAn_Dudík&diff=8497947&oldid=8497773
>
> Morten said
>> Hope some of this helps, let me know if there are any questions.
>
> I guess there are, Morten, thanks :-)
>
> Q: being in none of the special Wikipedia roles, which of these ideas would I 
> be able to try out by myself?
>
> btw, thanks for asking @Morten,
>
> cheers,
> Claudia
>
> On Tue, 8 May 2012 10:01:23 -0500, Morten Wang wrote
>> I did some data gathering last fall that is more or less the same as
>> what Claudia is asking about.  Looking up the bot flag or checking the
>> username is often regarded as a reasonable way of filtering out the
>> bots.  I chose to apply both: if there's no bot flag, we look for a
>> typical bot signature in the username (regex: "bot$| ", username
>> either ends with bot or a part of it does), using a
>> case-insensitive match since some users have usernames like "FoObOt".
>>
>> Checking the edit history to find when interwiki links were first
>> added can be time-consuming if the page had lots of activity. I
>> therefore chose to use a binary search, halving the distance between
>> two test points until either the actual edit is found, or we're down
>> to so few edits that all can be efficiently grabbed through the API
>> (e.g. using Pywikibot's PreloadingGenerator). Otherwise you might be
>> examining thousands of edits for no reason.
>>
>> Having Toolserver access simplifies the process a lot since all the
>> metadata is more easily accessible, but the revision text will still
>> have to be grabbed from the API.
>>
>> Hope some of this helps, let me know if there are any questions.
>>
>> Cheers,
>> Morten
>>
>> On 8 May 2012 08:39, Bináris <wikipo...@gmail.com> wrote:
>> > 2012/5/8 Merlijn van Deen <valhall...@arctus.nl>
>> >>
>> >>
>> >> This is not completely true - the bot flag is also a property of the
>> >> user account. You can query e.g.
>> >>
>> >> http://nl.wikipedia.org/w/index.php?title=Speciaal:Gebruikerslijst&offset=&limit=500&group=bot&uselang=en
>> >>
>> >
>> > Yes, that's true. And if you want to be quite accurate, you must also
>> > determine the date of acquiring the bot flag from bureau logs and
>> > compare it to the page history. :-)
>> >
>> > --
>> > Bináris
>> >
>> > _______________________________________________
>> > Pywikipedia-l mailing list
>> > Pywikipedia-l@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>> >
>>
>
>
> thanks & cheers,
> Claudia
> koltzenb...@w4w.net
>
>

