Thanks again for all the feedback. Because my time is limited, I went ahead
and submitted a task to the Query Service.

https://jira.toolserver.org/browse/DBQ-140

I don't know where this goes from here, but if anyone has any suggestions,
please share them.

Thanks,
Jim

On Mon, May 2, 2011 at 9:12 AM, Jim Hutchinson <[email protected]> wrote:

>
>
> On Fri, Apr 29, 2011 at 7:58 AM, Manish Goregaokar
> <[email protected]> wrote:
>
>>
>>>  1. Select 200 random articles.
>>>  2. Get the top contributors for each of them.
>>>  3. Get the edit counts for those contributors.
>>>
>>
>> I think he has the list/s of 200 articles, and does not want random ones.
>> Plus, he doesn't want the editcounts, he wants their top edited articles,
>> with the editcount per article.
>>
>> My personal opinion is that this HAS to be done via PHP (though I can't
>> comment on server load).
>> Use php-mysql to determine the list of top contributors per given article,
>> then loop for each contributor, and give *his* top edited articles...
>> Shouldn't be hard, though you might want to clarify what you mean by "top".
>> (Top 3? More than X edits? More than X% edits per day/week/month/beginning
>> of time? More than X% edits of the top editor?).
>>
>>
> Thanks again for the info. Yes, this is basically correct. I am looking to
> collect this info based on 100 articles from the Wikipedia science series.
> If the data proves relatively easy to collect, I would like to collect data
> on all articles in the science series, which is around 200 articles. Top
> contributors for me are those with 10 or more edits in the sampled article
> from the science series. For the sake of clarity, here is a short sample of
> the data I'm looking for.
>
> From the "science" article http://en.wikipedia.org/wiki/Science
>
> Clicking "view history" and then "contributors" gives a ranked list of all
> contributors in order of most edits.
>
>
> http://toolserver.org/~daniel/WikiSense/Contributors.php?wikilang=en&wikifam=.wikipedia.org&grouped=on&page=Science
>
> The top three editors (let's call them A, B, and C) currently have 445, 73,
> and 70 edits respectively. Clicking on contributor "A" to see their user
> page and then the "user contributions" from the tool box shows all their
> edits. For example, he/she has several edits to the articles "intelligent
> design" and "southern poverty law center", etc., and user "B" has edits to
> "rock formations" and "human evolution". I would like to count the frequency
> of all these edits across the top users for the sampled (e.g. science)
> articles, sorted by the article title.
>
> I don't know what the best way to arrange the data would be, but below is a
> Google Doc Spreadsheet that sort of shows what I think it would look like.
>
> http://goo.gl/VIWd6
>
> If the Query Service seems the best approach (is this done using the
> php-mysql referenced above or is it a different process?) then I will go
> ahead and create a task on https://jira.toolserver.org/browse/DBQ. If this
> is not the best or correct way to go, any guidance is appreciated.
>
> Thanks.
>
> --
> Jim
>
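
The two steps discussed in the quoted thread (find the contributors with 10 or
more edits to a sampled article, then count each of those contributors' edits
per article) can be sketched roughly as below. This is only an illustration in
Python, not the query that would actually run on the Toolserver: the function
names, the (user, article title) input format, and the sample data are all
assumptions made up for the example, and a real run would read revisions from
the database instead of a hard-coded list.

```python
from collections import Counter, defaultdict

MIN_EDITS = 10  # threshold from the thread: 10 or more edits in the sampled article


def top_contributors(revisions, article, min_edits=MIN_EDITS):
    """Return {user: edit_count} for users with at least min_edits edits to article."""
    counts = Counter(user for user, title in revisions if title == article)
    return {user: n for user, n in counts.items() if n >= min_edits}


def edits_by_article(revisions, users):
    """Return {user: Counter({title: edit_count})} over every article each user edited."""
    per_user = defaultdict(Counter)
    for user, title in revisions:
        if user in users:
            per_user[user][title] += 1
    return dict(per_user)


# Hypothetical sample data: one (user, article_title) pair per edit.
revisions = (
    [("A", "Science")] * 12
    + [("A", "Intelligent design")] * 3
    + [("B", "Science")] * 10
    + [("B", "Human evolution")] * 4
    + [("C", "Science")] * 2
    + [("C", "Physics")] * 7
)

# Step 1: contributors with 10+ edits to the sampled article.
top = top_contributors(revisions, "Science")

# Step 2: for those contributors only, count edits per article.
report = edits_by_article(revisions, top)

# Print one row per (user, article), sorted by user and then by article title.
for user in sorted(report):
    for title, n in sorted(report[user].items()):
        print(f"{user}\t{title}\t{n}")
```

Repeating step 1 for each sampled article and merging the resulting reports
would give something close to the spreadsheet layout linked above, with one
row per article title and one column per top contributor.
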
_______________________________________________
Toolserver-l mailing list ([email protected])
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: 
https://wiki.toolserver.org/view/Mailing_list_etiquette
