Carlb added a comment.

  OK, so what happened?
  
  - The script retrieved [[pt:*]] and found a bunch of interwikis on that page: 
[[en:*]] [[`~:*]] [[de:Asterix]] [[fr:Astérix]] [[it:Asterix (fumetto)]] 
[[ja:*]] [[nl:Asterix]] [[pl:Asterix]]
  - As [[pt:*]] already has a Wikibase item, it tried to follow each of the 
interwikis on the page to see if they could be merged to the existing item
  - self.try_to_merge(item) calls self.get_items() to retrieve the Wikibase 
Q-item number for every one of those other pages. Presumably, if it comes back 
with more than one Q-item number, that's a conflicting link (as appeared in the 
"Weird Al" Yankovic page example a few lines earlier), so the script will skip 
those. That seems to be the only reason it's retrieving all those items.
  - get_items() finds no repository at all on fr:uncyc (which is true, because 
it's an externally-hosted project). It should simply treat that as there being 
no Q-item linked from the French page, but it doesn't; the error goes 
unhandled and the script exits.
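
  The conflict check described in the bullets above can be sketched without 
pywikibot; the helper name, page names and item IDs below are made up for 
illustration, not taken from the script:

```python
# Sketch of the check try_to_merge relies on: collect the Q-item for
# every interwiki target into a set; if the set ends up holding more
# than one distinct item, the interwikis conflict and the page is
# skipped. Page names and item IDs here are illustrative only.

def items_for_interwikis(interwiki_to_item):
    """Collect the distinct Wikibase items linked via interwikis."""
    wd_data = set()
    for page, item in interwiki_to_item.items():
        if item is not None:       # pages with no item are simply ignored
            wd_data.add(item)
    return wd_data

# All interwikis agree on one item -> safe to merge.
consistent = {'de:Asterix': 'Q1234', 'nl:Asterix': 'Q1234'}
# Two different items -> conflicting links, so the script skips the page.
conflicting = {'de:Asterix': 'Q1234', 'pl:Asterix': 'Q9876'}

assert len(items_for_interwikis(consistent)) == 1
assert len(items_for_interwikis(conflicting)) > 1   # conflict detected
```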
  
  So now what? If scripts/interwikidata.py lines 156-169 look like this:
  
    def get_items(self):
        """Return all items of pages linked through the interwiki."""
        wd_data = set()
        for iw_page in self.iwlangs.values():
            if not iw_page.exists():
                warning('Interwiki {} does not exist, skipping...'
                        .format(iw_page.title(as_link=True)))
                continue
            try:
                wd_data.add(pywikibot.ItemPage.fromPage(iw_page))
            except pywikibot.NoPage:
                output('Interwiki {} does not have an item'
                       .format(iw_page.title(as_link=True)))
        return wd_data
  
  then there's a handler for NoPage, but none for an externally-hosted 
project having no direct access to the repo.
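
  This is why one bad interwiki kills the whole run: the try only catches 
NoPage, so any other exception propagates out of get_items() and aborts the 
script. The difference can be sketched with stand-in exception classes (these 
are simplified placeholders, not pywikibot's real classes):

```python
# Stand-ins for pywikibot's exceptions, for illustration only.
class NoPage(Exception): pass
class WikiBaseError(Exception): pass

def fetch_item(page):
    """Pretend item lookup: fr sits on a wiki with no repo access."""
    if page.startswith('fr:'):
        raise WikiBaseError('no repository')   # externally-hosted wiki
    if page.startswith('ja:'):
        raise NoPage('no item')                # page has no Wikibase item
    return 'Q1234'

def get_items(pages, handle_wikibase_error):
    wd_data = set()
    for page in pages:
        try:
            wd_data.add(fetch_item(page))
        except NoPage:
            pass          # already handled: page has no item, skip it
        except WikiBaseError:
            if not handle_wikibase_error:
                raise     # original behaviour: error escapes, run aborts
            pass          # patched behaviour: treat like a missing item
    return wd_data

pages = ['de:Asterix', 'ja:Asterix', 'fr:Astérix', 'nl:Asterix']

# Patched version skips the French page and keeps going.
assert get_items(pages, handle_wikibase_error=True) == {'Q1234'}

# Unpatched version blows up partway through the loop.
try:
    get_items(pages, handle_wikibase_error=False)
except WikiBaseError:
    print('aborted, as the unpatched script does')
```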
  
  Change that routine to this and the script will run:
  
    def get_items(self):
        """Return all items of pages linked through the interwiki."""
        wd_data = set()
        print('get_items: ', self.iwlangs, ' : ', self.iwlangs.values())
        for iw_page in self.iwlangs.values():
            if not iw_page.exists():
                warning('Interwiki {} does not exist, skipping...'
                        .format(iw_page.title(as_link=True)))
                continue
            try:
                print ('- wd_data ',wd_data)
                print ('- adding ',pywikibot.ItemPage.fromPage(iw_page))
                wd_data.add(pywikibot.ItemPage.fromPage(iw_page))
            except pywikibot.NoPage:
                output('Interwiki {} does not have an item'
                       .format(iw_page.title(as_link=True)))
            except pywikibot.WikiBaseError:
                output('Site {} has no Wikibase repository'
                       .format(iw_page.title(as_link=True)))
        print ('wd_data: ',wd_data)
        return wd_data
  
  since a WikiBaseError (which is raised when a wiki has no repo access) will 
then be treated the same way as the page being missing or containing no 
Wikibase link.

TASK DETAIL
  https://phabricator.wikimedia.org/T221556
