Hi,

On Fri, Apr 26, 2013 at 5:29 AM, Sebastian Hellmann
<hellm...@informatik.uni-leipzig.de> wrote:
> Well, PubSubHubbub is a nice idea. However it clearly depends on two factors:
> 1. whether Wikidata sets up such an infrastructure (I need to check whether 
> we have capacities, I am not sure atm)

Capacity for what? The infrastructure should not be a problem
(famous last words; I can look more closely tomorrow, but I'm really
not worried about it). And you don't need any infrastructure at all
for development; just use one of Google's public instances.
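
For illustration, subscribing at a public hub is just one HTTP POST;
here's a rough Python sketch of the standard subscribe call (the hub
URL is Google's public instance; the topic and callback URLs are
made-up placeholders):

    # Sketch of a PubSubHubbub subscription request against a public hub.
    # The topic and callback URLs below are placeholders, not real endpoints.
    import requests

    HUB = "https://pubsubhubbub.appspot.com/"             # Google's public hub
    TOPIC = "https://www.wikidata.org/feed/changes.atom"  # hypothetical topic feed
    CALLBACK = "https://example.org/push-callback"        # your publicly reachable endpoint

    resp = requests.post(HUB, data={
        "hub.mode": "subscribe",
        "hub.topic": TOPIC,
        "hub.callback": CALLBACK,
        "hub.verify": "async",   # let the hub verify the callback asynchronously
    })
    # A 2xx response means the hub accepted the request; it will verify
    # the callback before it starts pushing updates.
    print(resp.status_code)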

> 2. whether performance is good enough to handle high-volume publishers

Again, how do you mean?

> Basically, polling recent changes [1] and then doing an HTTP request to the
> individual pages should be fine for a start. So I guess this is what we will
> implement, if there aren't any better suggestions.
> The whole issue is problematic and the DBpedia project would be happy, if 
> this were discussed and decided right now, so we can plan development.
>
> What is the best practice to get updates from Wikipedia at the moment?

I believe just about everyone uses the IRC feed from
irc.wikimedia.org.
https://meta.wikimedia.org/wiki/IRC/Channels#Raw_feeds
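
For anyone who hasn't used it: consuming that feed is basically just
joining an IRC channel and reading lines. A rough Python sketch
(channel name and nick are only examples):

    # Minimal sketch of reading the raw recent-changes feed from irc.wikimedia.org.
    # Channel name and nick are examples; the feed is read-only.
    import socket

    HOST, PORT = "irc.wikimedia.org", 6667
    CHANNEL = "#en.wikipedia"    # one channel per wiki
    NICK = "rc-reader-example"   # arbitrary, must be unique on the server

    sock = socket.create_connection((HOST, PORT))
    sock.sendall(("NICK %s\r\nUSER %s 0 * :%s\r\n" % (NICK, NICK, NICK)).encode())

    buf = b""
    while True:
        buf += sock.recv(4096)
        while b"\r\n" in buf:
            line, buf = buf.split(b"\r\n", 1)
            if line.startswith(b"PING"):   # keep the connection alive
                sock.sendall(b"PONG" + line[4:] + b"\r\n")
            elif b" 001 " in line:         # registration done; join the feed channel
                sock.sendall(("JOIN %s\r\n" % CHANNEL).encode())
            elif b"PRIVMSG" in line:       # each edit is broadcast as one PRIVMSG
                print(line.decode("utf-8", "replace"))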

I imagine Wikidata will (or maybe already does) propagate changes to a
channel on that server, but IRC would not be a good method for many
instant data repo users. Some will not be able to sustain a single TCP
connection for extended periods, some will not be able to use IRC
ports at all, and some may go offline periodically (e.g. a server on a
laptop). AIUI, PubSubHubbub has none of those problems and is better
than the current IRC solution in just about every way.
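
To make the contrast concrete: a PubSubHubbub subscriber only needs a
plain HTTP endpoint that echoes the hub's verification challenge and
accepts pushed updates; there is no long-lived connection to keep
open. A rough sketch with Python's standard library (port and handling
are just an example):

    # Sketch of a PubSubHubbub subscriber callback: echo the verification
    # challenge on GET, accept pushed feed content on POST.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse, parse_qs

    class PushCallback(BaseHTTPRequestHandler):
        def do_GET(self):
            # The hub verifies a subscription by asking us to echo hub.challenge.
            params = parse_qs(urlparse(self.path).query)
            challenge = params.get("hub.challenge", [""])[0]
            self.send_response(200)
            self.end_headers()
            self.wfile.write(challenge.encode())

        def do_POST(self):
            # Content delivery: the hub POSTs the updated feed entries to us.
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            print("received %d bytes of updates" % len(body))
            self.send_response(204)   # 2xx tells the hub the delivery was accepted
            self.end_headers()

    HTTPServer(("", 8080), PushCallback).serve_forever()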

We could potentially even replace the current cross-DB job queue
insert craziness with PubSubHubbub for use on the cluster internally.

-Jeremy
