Carlb created this task.
Carlb added projects: Pywikibot, MediaWiki-extensions-WikibaseRepository.
Restricted Application added subscribers: pywikibot-bugs-list, Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
Wikibase and the associated Pywikibot scripts tend to make a lot of 
assumptions about the way a wiki family is structured, ranging from the 
WMF-style naming convention ( T172076 
<https://phabricator.wikimedia.org/T172076> ) to the database name matching the 
GlobalID ( T221550 <https://phabricator.wikimedia.org/T221550> ), which is 
assumed to be an ISO language code ('xx') plus a group name (invariably 
'wiki*' or 'wiktionary').
  
  In some cases, even where the Wikibase repository allows something (such as 
making outbound interlanguage links to an externally-hosted site), the 
Pywikibot scripts apply their own restrictions.
  
  That's not a huge issue for the WMF wiki families (where every inter-language 
link points to a wiki in the same cluster with SQL access to/from the same 
repository), but it is an issue for third-party projects (such as the 
Uncyclopedia family) which allow individual languages to host themselves 
wherever they like.
  
  If the majority of the languages are on one server cluster (for instance, 
*.uncyclopedia.info) but one language is on an independent server, the core 
Wikibase repo will find the API for the independent project from the sites 
table ('kouncyc', 'https://uncyclopedia.kr/wiki/$1', 
'https://uncyclopedia.kr/w/$1') and use that API to determine if an article 
exists on the independent wiki whenever a user manually adds an interlanguage 
link to the Wikidata repo. (That won't force the externally-hosted wiki to link 
back to us, but it does centralise the creation of outbound links from the 
cluster - which is convenient.)
  
  Pywikibot, on the other hand, is less forgiving. When interwikidata.py finds 
interwiki links on a page, it does this:
  
    def get_items(self):
        """Return all items of pages linked through the interwiki."""
        wd_data = set()
        for iw_page in self.iwlangs.values():
            if not iw_page.exists():
                warning('Interwiki {} does not exist, skipping...'
                        .format(iw_page.title(as_link=True)))
                continue
            try:
                wd_data.add(pywikibot.ItemPage.fromPage(iw_page))
            except pywikibot.NoPage:
                output('Interwiki {} does not have an item'
                       .format(iw_page.title(as_link=True)))
        return wd_data
  
  which causes pywikibot.ItemPage.fromPage() - a call into page.py - to 
interrogate the API of every linked language site and ask each for its 
repository URL.
  
  Wikipedia will likely give sane answers, while an independently-hosted 
Uncyclopedia will more likely answer a request for the repository URI like 
this:
  
    Pywikibot:  Please, please good people.  I am in haste.  Who lives in that 
castle? 
    kouncyc:  No one lives there.  We don't have a lord.  We're an 
anarcho-syndicalist commune...
  
  descending through:
  
    Pywikibot:  Be quiet!  I order you to be quiet!  I am your king!
    kouncyc:  Well, I didn't vote for you.
    Pywikibot:  You don't vote for kings.
  
  and, when the "take me to your leader" demands for the identity of a central 
repository for the externally-hosted project inevitably fail, ending with:
  
    Pywikibot: I am going to throw you so far...
    
        if not page.site.has_data_repository:
            raise pywikibot.WikiBaseError('{0} has no data repository'
                                          ''.format(page.site))
    
    kouncyc:  Help, help, I'm being oppressed! See the violence inherent in the 
system...
    Pywikibot:  Bloody peasant!
    kouncyc: Dead giveaway, that...
  
  Such are the hazards of giving Uncyclopedians a Python script to run. The 
outcome is a comedy of errors. It's just not particularly useful.
  
  There is no code in interwikidata.py to recover from finding the one 
independent wiki that has "no lord" (no central repository access), so instead 
of merely adding [[ko:]] as an outbound-only link, the whole thing ends very 
abruptly. There is no handler for this sort of condition:
  
    except pywikibot.WikiBaseError:
        output('Site {} has no Wikibase repository'
               .format(iw_page.title(as_link=True)))
  
  What would be the best way to change this script so that, when it finds the 
one externally-hosted independent which doesn't have access to the repository, 
it merely creates the outbound link (one-way) from the projects which are on 
the cluster with the repository and continues in some sane manner instead of 
revolting?
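  
  One possible shape for such a fix, as a standalone sketch: catch the 
"no data repository" error per page and treat that site's link as outbound-only 
rather than aborting the run. The classes below (WikiBaseError, NoPage, Page, 
and its item_from_page() method) are minimal stand-ins for their pywikibot 
counterparts so the control flow can be demonstrated self-contained; in the 
real interwikidata.py the except clauses would catch pywikibot.WikiBaseError 
and pywikibot.NoPage around pywikibot.ItemPage.fromPage().
  
    # Sketch: a more forgiving get_items(). Pages whose site has no
    # Wikibase repository are skipped (left as plain one-way links)
    # instead of raising out of the whole loop.
    
    class WikiBaseError(Exception):
        """Stand-in for pywikibot.WikiBaseError."""
    
    class NoPage(Exception):
        """Stand-in for pywikibot.NoPage."""
    
    class Page:
        """Stand-in for a pywikibot page, with or without repo access."""
    
        def __init__(self, title, item=None, has_repo=True):
            self._title = title
            self._item = item
            self._has_repo = has_repo
    
        def exists(self):
            return True
    
        def title(self, as_link=False):
            return '[[{}]]'.format(self._title) if as_link else self._title
    
        def item_from_page(self):
            # Mirrors ItemPage.fromPage(): a site with no repository
            # raises WikiBaseError; a site with a repository but no
            # item for this page raises NoPage.
            if not self._has_repo:
                raise WikiBaseError('{} has no data repository'
                                    .format(self._title))
            if self._item is None:
                raise NoPage(self._title)
            return self._item
    
    def get_items(iwlangs):
        """Collect items, skipping sites without a Wikibase repository."""
        wd_data = set()
        for iw_page in iwlangs.values():
            if not iw_page.exists():
                print('Interwiki {} does not exist, skipping...'
                      .format(iw_page.title(as_link=True)))
                continue
            try:
                wd_data.add(iw_page.item_from_page())
            except NoPage:
                print('Interwiki {} does not have an item'
                      .format(iw_page.title(as_link=True)))
            except WikiBaseError:
                # The wiki has no lord: keep the plain outbound
                # interlanguage link and carry on with the rest.
                print('Site {} has no Wikibase repository, leaving a '
                      'one-way link'.format(iw_page.title(as_link=True)))
        return wd_data
  
  With this shape, a mix like {'en': Page('Castle', item='Q1'), 'ko': 
Page('Anarcho-syndicalist commune', has_repo=False)} yields just the items 
from repository-backed sites, and the 'ko' link survives untouched as a 
one-way interwiki.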

TASK DETAIL
  https://phabricator.wikimedia.org/T221556


_______________________________________________
pywikibot-bugs mailing list
pywikibot-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikibot-bugs
