Arkanosis added a comment.

I ran the conversion directly from the ttl.gz file

Interesting, I couldn’t get that to work and had to pipe gunzip output into the program.

Interesting, indeed… Could it be that you added the -f ttl flag afterwards? I couldn't get it to accept a gzip file as input without this flag (I assume it does file format detection based on the file extension).
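
For anyone hitting the same problem, here is a minimal Python sketch of the "pipe gunzip output into the program" workaround. It assumes the converter can read its input from stdin, which is not confirmed here. The helper name pipe_gunzipped and the command line in the trailing comment are hypothetical and are not the command that was actually used.

```python
import gzip
import shutil
import subprocess


def pipe_gunzipped(gz_path, command):
    """Decompress gz_path on the fly and feed it to command's stdin.

    Roughly what `gunzip -c gz_path | command` does in a shell.
    Returns the exit status of the command.
    """
    proc = subprocess.Popen(command, stdin=subprocess.PIPE)
    try:
        with gzip.open(gz_path, "rb") as src:
            # Copy in 1 MiB chunks so the whole dump never sits in memory.
            shutil.copyfileobj(src, proc.stdin, length=1024 * 1024)
    except BrokenPipeError:
        # The consumer exited before reading everything; report its status.
        pass
    finally:
        try:
            proc.stdin.close()
        except BrokenPipeError:
            pass
    return proc.wait()


# Hypothetical usage; whether the converter accepts /dev/stdin as its
# input path is an assumption, as are the file names:
# pipe_gunzipped("dump.ttl.gz", ["rdf2hdt", "-f", "ttl", "/dev/stdin", "dump.hdt"])
```

Since the input no longer has a file extension to inspect when it arrives on a pipe, the format would presumably have to be given explicitly, as with the -f ttl flag mentioned above.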

Also, I had to install zlib-devel to get rdfhdt to compile on a CentOS 6 container; there might be some non-zlib-enabled build on Debian that isn't available on Red Hat.

I also tried converting the latest dump, and since I don’t have access to any system with that much RAM, I thought I could perhaps trade some execution time for swap space. Bad idea :) the process got through 20% of the input file and then slowed to a crawl, at data rates of single-digit kilobytes per second. It would’ve taken half a year to finish at that rate.

Thanks for testing! That would have required a hell of a lot of swap space anyway. Easy to set up for whoever does this on a regular basis, but for casual needs, I've never seen a machine with 200+ GiB of swap space.

But FWIW, here’s the command I used, with a healthy dose of systemd sandboxing since it’s a completely unknown program I’m running:
<snip>

Thanks for sharing the sandboxing bits! :-)


TASK DETAIL
https://phabricator.wikimedia.org/T179681

To: Arkanosis
Cc: Addshore, Smalyshev, Ladsgroup, Arkanosis, Tarrow, Lucas_Werkmeister_WMDE, Aklapper, Lahi, GoranSMilovanovic, QZanden, Wikidata-bugs, aude, Mbch331