Bugs item #2619054, was opened at 2009-02-20 03:04
Message generated for change (Comment added) made by russblau
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2619054&group_id=93107

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: rewrite
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: NicDumZ — Nicolas Dumazet (nicdumz)
Assigned to: Russell Blau (russblau)
Summary: clarify between limit, number, batch and step parameters

Initial Comment:
I saw some strange behavior from replace.py -weblink: that I couldn't quite 
diagnose at first: some pages were not processed.

First of all, those detailed logs are a great gift. They are a bit messy to 
understand at first, but thanks to those I found the bug and fixed it in r6386 
( http://svn.wikimedia.org/viewvc/pywikipedia?view=rev&revision=6386 ).

I believe that this parameter confusion is a very bad habit inherited from the 
old framework. (The only reason we have those bugs here is that we merged 
pagegenerators from trunk.) We need to agree on common generator parameters 
that have a global meaning, and stick to them.

I personally think that -limit might be a bit confusing (is it an API limit, a 
limit enforced by the local application on a huge fetched set, etc.?), while 
-number seems a bit clearer. But that's a personal opinion =)
What about -number for "number of items to retrieve", and -step, or -maxstep, 
for the maximum number of items to retrieve at once? 
Actually, I don't mind what the names are; we just need to agree on something 
meaningful enough, and document it in the file headings.

On a side note, replace.py -fix:yu-tld -weblink:*.yu is currently running on 
fr.wp. No issues spotted so far. =)

----------------------------------------------------------------------

>Comment By: Russell Blau (russblau)
Date: 2009-02-20 10:00

Message:
A good point.  A query can have two different types of limits: the limit on
the number of pages/links/whatever retrieved from the API in a single
request (defaults to "max"), and the limit on the total number of items to
be retrieved from a repeated query.  We should do this in a way that is (a)
internally consistent among all generators, and (b) as much as possible,
backwards-compatible with the old pagegenerators module (but this is
secondary to getting something that works).
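
To make the two kinds of limit concrete, here is a minimal sketch of a
generator that takes both. The names query, fetch_batch, step and total are
hypothetical and chosen only for illustration; they are not the framework's
actual API, and fetch_batch merely simulates an API request against a
made-up set of 25 items.

```python
request_log = []  # records (start, size) of every simulated request


def fetch_batch(start, size):
    """Hypothetical stand-in for one API request.

    Returns up to `size` items beginning at offset `start` from a
    pretend result set of 25 items.
    """
    request_log.append((start, size))
    data = range(1, 26)
    return list(data[start:start + size])


def query(step=10, total=None):
    """Yield items, asking for at most `step` per request and stopping
    once `total` items have been yielded (None means no overall limit).
    """
    start = 0
    yielded = 0
    while True:
        size = step
        if total is not None:
            remaining = total - yielded
            if remaining <= 0:
                return
            # Never request more than is still needed overall.
            size = min(step, remaining)
        batch = fetch_batch(start, size)
        if not batch:
            return  # the simulated result set is exhausted
        for item in batch:
            yield item
            yielded += 1
        start += len(batch)
```

With step=10 and total=12, this sketch issues one request for 10 items and
a second for the remaining 2, which is the distinction between the
per-request limit and the overall limit described above.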

----------------------------------------------------------------------


_______________________________________________
Pywikipedia-l mailing list
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
