On Wed, Jan 7, 2009 at 9:53 PM, Brion Vibber <br...@wikimedia.org> wrote:
>> The current version of my compressor averaged a little better than 250
>> revisions per second on ruwiki (about 12 hours total) on an
>> 18-month-old desktop.  However, as the CPU utilization was only 50-70%
>> of a full processing core most of the time, I suspect that my choice
>> to read from and write to an external hard drive may have been the
>> limiting factor.  On a good machine, 400+ rev/s might be a plausible
>> number for the current code.
>
> It'd be good to compare this against the general-purpose bzip2 and 7zip
> LZMA compression...

I started a process to recompress the ruwiki dump using the default
settings in 7-Zip.  After 5 minutes, it reported 16 hours
remaining.  So I would estimate that my revision compressor is on the
same timescale as 7-Zip, and perhaps somewhat faster.  Again, I was
reading from and writing to an external drive, so there could be an I/O
effect in there as well.
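
For a rough sense of scale, the figures quoted in this thread can be put on
a common footing. The sketch below is only back-of-the-envelope arithmetic
in Python: the function names are hypothetical, the inputs are the
approximate numbers from the messages above (250 rev/s, about 12 hours,
5 minutes elapsed, 16 hours remaining), and it assumes 7-Zip's early
time estimate holds for the whole run, which it may not.

```python
def total_revisions(rate_per_s, hours):
    """Revisions processed at a steady rate over the given number of hours."""
    return rate_per_s * hours * 3600


def equivalent_rate(revisions, elapsed_s, remaining_s):
    """Average revisions per second if the whole job takes elapsed + remaining."""
    return revisions / (elapsed_s + remaining_s)


# Approximate figures from the thread; none of these are exact measurements.
revs = total_revisions(250, 12)                      # revision compressor run
rate_7z = equivalent_rate(revs, 5 * 60, 16 * 3600)   # 7-Zip's projected run

print(f"revisions processed (approx.): {revs:,}")
print(f"7-Zip equivalent rate (approx.): {rate_7z:.0f} rev/s")
```

Under these assumptions the two tools land within the same order of
magnitude, which matches the "same timescale" estimate; the external-drive
I/O caveat applies to both numbers.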

> it'd be great if we
> can host the dev code in source control, under extensions or tools for
> now, until we can integrate something directly into the export code.

Could someone walk me through how I would do that?

-Robert Rohde
