-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Rohde:
> The starting point is providing full-text history availability and once you
> have that there are a number of different projects (like wikiblame) which
> would desire to pull and process every revision in some way.

okay, so full text access has been a 'would be nice' thing for a while.  i
added an item to this year's shopping list for it.

it seems more useful to provide the text in uncompressed form, instead of the
MediaWiki internal form that's almost impossible to work with.  does that seem
reasonable?

> Some of the code I've worked with would probably take weeks to run
> single-threaded against enwiki, but that can be made practical if one is
> willing to throw enough cores at the problem.

well, this probably isn't something we could afford ourselves, but if there's
enough interest in a batch computing infrastructure, it's probably worth
talking to external organisations about this.

> From an exterior point of view it often seems like toolserver is
> significantly lagged or tools are going down, and from that I have generally
> assumed that it operates relatively close to capacity a lot of the time.

that is correct.  the way it works is we run at or over capacity for a while,
until we can afford new hardware, then things are fast for a while, until we
reach capacity again.  this repeats every year or so.  (interestingly, this is
exactly how Wikipedia worked in the first few years.)

        - river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (HP-UX)

iEYEARECAAYFAkm3iigACgkQIXd7fCuc5vKo+ACfS62b7U0dF+EtTcLcrEBHE22I
h1QAoItjhW1XYmzRl3KyJDFmxQ4nMvye
=jvq3
-----END PGP SIGNATURE-----

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
