On Wed, Jul 9, 2014 at 6:13 PM, Daniel Kinzler <daniel.kinz...@wikimedia.de>
wrote:

> Am 09.07.2014 08:14, schrieb Dimitris Kontokostas:
> > Hi,
> >
> > Could you briefly summarize the added value (or the supported use cases) of
> > switching to PubSubHubbub?
>
> * It's easier to handle than OAI, because it uses the standard dump format.
> * It's also push-based, avoiding constant polling on small wikis.
> * The OAI extension has been deprecated for a long time now.
>
> > The edit stream in Wikidata is so huge that I can hardly think of anyone wanting
> > to be in *real-time* sync with Wikidata.
> > With 20 p/s, their infrastructure would have to be pretty scalable not to break.
>
> The "push" aspect is probably most useful for small wikis. It's true, for large
> wikis, you could just poll, since you would hardly ever poll in vain.
>
> It would be very nice if the sync could be filtered by namespace, category, etc.
> But PubSubHubbub (I'll use "PuSH" from now on) doesn't really support this, sadly.
>
> > Maybe I am biased by DBpedia, but from some experiments on English
> > Wikipedia we found that the ideal update interval with OAI-PMH was about every 5 minutes.
> > OAI aggregates multiple revisions of a page into a single edit,
> > so when we ask "get me the items that changed in the last 5 minutes" we skip the
> > processing of many minor edits.
> > It looks like we lose this option with PubSubHubbub, right?
>
> I'm not quite positive on this point, but I think with PuSH, this is done by the
> hub. If the hub gets 20 notifications for the same resource in one minute, it
> will only grab and distribute the latest version, not all 20.
>
> But perhaps someone from the PuSH development team could confirm this.
>

It'd be great if the dev team could confirm this.
Besides push notifications, is polling an option in PuSH? I skimmed
the spec but couldn't find this.
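
To make the aggregation question concrete: if the hub turns out not to collapse
repeated notifications, a subscriber could do so itself. Below is a minimal
sketch in Python, assuming each notification carries a page id, a revision id
and a timestamp. The Notification class and the coalesce function are
hypothetical names for illustration only; they are not part of PuSH, the OAI
extension, or any existing client.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Notification:
    page: str         # hypothetical page identifier, e.g. "Q42"
    revision: int     # hypothetical revision id; a larger value means newer
    timestamp: float  # seconds since some reference point


def coalesce(notifications: Iterable[Notification],
             window: float = 300.0) -> list[Notification]:
    """Keep only the newest revision of each page per time window.

    The default window of 300 seconds mirrors the ~5 minute interval
    mentioned above; it is an assumption, not a recommended value.
    """
    latest = {}
    for n in notifications:
        # Notifications for the same page in the same window share a key.
        key = (int(n.timestamp // window), n.page)
        kept = latest.get(key)
        if kept is None or n.revision > kept.revision:
            latest[key] = n
    # Return the survivors in chronological order.
    return sorted(latest.values(), key=lambda n: (n.timestamp, n.page))


if __name__ == "__main__":
    events = [
        Notification("Q1", 10, 0.0),
        Notification("Q1", 11, 60.0),
        Notification("Q2", 5, 120.0),
        Notification("Q1", 12, 310.0),
    ]
    for n in coalesce(events):
        print(n.page, n.revision)
```

With the sample events, the two edits to Q1 in the first five minutes collapse
into one, while the later edit to Q1 falls into the next window and is kept
separately.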


>
> > As we already asked before, does PubSubHubbub support mirroring a Wikidata
> > clone? The OAI-PMH extension has this option.
>
> Yes, there is a client extension for PuSH, allowing for seamless replication of
> one wiki into another, including creation and deletion (I don't know about
> moves/renames).
>
> --
> Daniel Kinzler
> Senior Software Developer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>



-- 
Kontokostas Dimitris
_______________________________________________
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
