Hi!
Thank you for the reply. I created the following tasks:
https://phabricator.wikimedia.org/T298436
https://phabricator.wikimedia.org/T298437
Mitar
On Sat, Jan 1, 2022 at 6:07 PM Ariel Glenn WMF wrote:
>
> Hello Mitar! I'm glad you are finding the Wikimedia Enterprise dumps useful.
>
> For
Hello Mitar! I'm glad you are finding the Wikimedia Enterprise dumps useful.
For your tar.gz question, this is the format that the Wikimedia Enterprise
dataset consumers prefer, from what I understand. But I would suggest that
if you are interested in other formats, you might open a task on
Hi!
Awesome!
Is there any reason they are tar.gz files of one file and not simply
bzip2 of the file contents? Wikidata dumps are bzip2 of one JSON file and
that allows parallel decompression. Having both tar (why tar of one
file at all?) and gz in there really requires one to first decompress
the
Wow, very cool!
On Tue, Oct 19, 2021 at 10:57 AM Ariel Glenn WMF wrote:
> I am pleased to announce that Wikimedia Enterprise's HTML dumps [1] for
> October 17-18th are available for public download; see
> https://dumps.wikimedia.org/other/enterprise_html/ for more information.
> We expect to