Mpaa added a comment.
In https://phabricator.wikimedia.org/T122047#1895830, @jayvdb wrote:
> The problem is that many `pagegenerators` classes/generators explicitly
> instantiate a `pywikibot.Page`, rather than using an `api.PageGenerator`,
> including at least the following:
>
> -
Mpaa added a comment.
Just for my understanding,
In https://phabricator.wikimedia.org/T122047#1895830, @jayvdb wrote:
> > We have code pending which can make proofreadpage and flow to be pywikibot
> > extension packages.
What does this mean/imply?
gerritbot added a subscriber: gerritbot.
gerritbot added a comment.
Change 250221 had a related patch set uploaded (by Mpaa):
pagegenerators.py: allow filtering by quality level
https://gerrit.wikimedia.org/r/250221
TASK DETAIL
https://phabricator.wikimedia.org/T122047
EMAIL PREFERENCES
Mpaa added a comment.
"What would be very cool is if pagegenerators had a proper filter version of
-cat, so that only pages emitted by another generator are checked for the
category, instead of fetching the entire category."
This should be tracked as a separate task; it is not addressed in this
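For reference, the "filter version of -cat" idea quoted above could look roughly like the sketch below. It is only an illustration: `StubPage` and `category_filter` are hypothetical names, not part of pagegenerators, and the stub's `categories()` yields plain title strings where a real page would yield category objects.

```python
from typing import Iterable, Iterator, List


class StubPage:
    """Hypothetical stand-in for a page object (not a pywikibot class)."""

    def __init__(self, title: str, category_titles: List[str]) -> None:
        self.title = title
        self._category_titles = list(category_titles)

    def categories(self) -> Iterator[str]:
        # Assumption: yields category titles as plain strings.
        return iter(self._category_titles)


def category_filter(pages: Iterable[StubPage],
                    category_title: str) -> Iterator[StubPage]:
    """Yield only the pages that are members of category_title.

    Each page emitted by the upstream generator is checked on its own,
    so the full member list of the category is never fetched.
    """
    for page in pages:
        if any(cat == category_title for cat in page.categories()):
            yield page


upstream = [
    StubPage("Page:Some title/1", ["Category:Validated"]),
    StubPage("Page:Some title/2", ["Category:Proofread"]),
    StubPage("Page:Some title/3", ["Category:Validated", "Category:Other"]),
]
matching_titles = [p.title
                   for p in category_filter(upstream, "Category:Validated")]
```

The cost then scales with the number of pages the upstream generator emits rather than with the size of the category.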
jayvdb added a comment.
In https://phabricator.wikimedia.org/T122047#1899384, @Mpaa wrote:
> In https://phabricator.wikimedia.org/T122047#1895830, @jayvdb wrote:
>
> > The problem is that many `pagegenerators` classes/generators explicitly
> > instantiate a `pywikibot.Page`, rather than using
Billinghurst added a subscriber: Billinghurst.
Billinghurst added a comment.
I also think this is quite pertinent: when we want to review 'common typos',
it would help to be able to restrict the run to -ql:1.
jayvdb added a comment.
I absolutely agree that we need something better than `-intersect` for this,
and have no objection to the proposed `-ql` filter argument, though I would
like to see the components kept in proofread page modules as much as possible.
Upcasting to `ProofreadPage` in
Mpaa added a comment.
One typical use case for filtering quality levels is when one wants to run bots
on specific pages of a work.
E.g. to work only on Validated pages:
python pwb.py listpages -prefixindex:"Page:Some title" -ql:4
Intersecting with a category of 300,000 pages would be overkill.
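As a rough illustration of why a filter is cheaper than an intersection, here is a minimal sketch of a quality-level filter that wraps another generator. `StubProofreadPage` and `quality_filter` are hypothetical names used only for this example, and the `quality_level` attribute is an assumption about what a proofread page object would expose; this is not the code in the patch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class StubProofreadPage:
    """Hypothetical stand-in for a proofread page object."""
    title: str
    quality_level: int  # assumption: 0..4, with 4 meaning Validated


def quality_filter(pages: Iterable[StubProofreadPage],
                   levels: Iterable[int]) -> Iterator[StubProofreadPage]:
    """Yield only the pages whose quality level is in levels.

    Pages are checked lazily as the upstream generator emits them,
    so no category listing is needed.
    """
    wanted = set(levels)
    for page in pages:
        if page.quality_level in wanted:
            yield page


# Stand-in for what -prefixindex:"Page:Some title" would emit.
prefix_pages = [
    StubProofreadPage("Page:Some title/1", 3),
    StubProofreadPage("Page:Some title/2", 4),
    StubProofreadPage("Page:Some title/3", 1),
]
validated_titles = [p.title for p in quality_filter(prefix_pages, [4])]
```

Only the pages of the one work are inspected, which is the point of preferring a filter over intersecting with the whole Validated category.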