dr0ptp4kt added a comment.
After an update to the script (PS6) and a fresh run of the same commands new files have been `hdfs-rsync`'d to `stat1006:~dr0ptp4kt/gzips` in anticipation of doing a file transfer over to the WDQS graph split test servers. Here's a very small sample of what the files look like:

$ zcat part-01022-c261bb68-4091-4613-ae52-88ce97d22c14-c000.txt.gz | tail -10
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "\u0935\u093F\u0915\u093F\u092E\u093F\u0921\u093F\u092F\u093E \u0936\u094D\u0930\u0947\u0923\u0940"@ne .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "\u043A\u0430\u0442\u0435\u0433\u043E\u0440\u0438\u0458\u0430 \u043D\u0430 \u0412\u0438\u043A\u0438\u043C\u0435\u0434\u0438\u0458\u0438"@sr .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "\u7DAD\u57FA\u5A92\u9AD4\u5206\u985E"@yue .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "Wikimedia-Kategorie"@de-ch .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "catigur\u00ECa di nu pruggettu Wikimedia"@scn .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "categoria di un progetto Wikimedia"@it .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/version> "1979010859"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "kategori Wikimedia"@map-bms .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "Wikimedia-kategoriija"@se .
<http://www.wikidata.org/entity/Q99896811> <http://schema.org/description> "\u7DAD\u57FA\u5A92\u9AD4\u5206\u985E"@zh-mo .

$ zcat part-01023-c261bb68-4091-4613-ae52-88ce97d22c14-c000.txt.gz | head -10
<http://www.wikidata.org/entity/statement/Q99896811-7623BB4C-2D20-4D2E-8784-E2ED8AD3E8E5> <http://wikiba.se/ontology#rank> <http://wikiba.se/ontology#NormalRank> .
<http://www.wikidata.org/entity/statement/Q99896811-7623BB4C-2D20-4D2E-8784-E2ED8AD3E8E5> <http://www.wikidata.org/prop/statement/P31> <http://www.wikidata.org/entity/Q4167836> .
<http://www.wikidata.org/entity/statement/Q99896811-7623BB4C-2D20-4D2E-8784-E2ED8AD3E8E5> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://wikiba.se/ontology#BestRank> .
<https://ar.wikipedia.org/wiki/%D8%AA%D8%B5%D9%86%D9%8A%D9%81:%D8%B4%D8%B1%D9%83%D8%A7%D8%AA_%D8%B3%D9%88%D9%8A%D8%B3%D8%B1%D9%8A%D8%A9_%D8%A3%D8%B3%D8%B3%D8%AA_%D9%81%D9%8A_1973> <http://schema.org/about> <http://www.wikidata.org/entity/Q99896811> .
<https://ar.wikipedia.org/wiki/%D8%AA%D8%B5%D9%86%D9%8A%D9%81:%D8%B4%D8%B1%D9%83%D8%A7%D8%AA_%D8%B3%D9%88%D9%8A%D8%B3%D8%B1%D9%8A%D8%A9_%D8%A3%D8%B3%D8%B3%D8%AA_%D9%81%D9%8A_1973> <http://schema.org/isPartOf> <https://ar.wikipedia.org/> .
<https://ar.wikipedia.org/wiki/%D8%AA%D8%B5%D9%86%D9%8A%D9%81:%D8%B4%D8%B1%D9%83%D8%A7%D8%AA_%D8%B3%D9%88%D9%8A%D8%B3%D8%B1%D9%8A%D8%A9_%D8%A3%D8%B3%D8%B3%D8%AA_%D9%81%D9%8A_1973> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Article> .
<https://ar.wikipedia.org/wiki/%D8%AA%D8%B5%D9%86%D9%8A%D9%81:%D8%B4%D8%B1%D9%83%D8%A7%D8%AA_%D8%B3%D9%88%D9%8A%D8%B3%D8%B1%D9%8A%D8%A9_%D8%A3%D8%B3%D8%B3%D8%AA_%D9%81%D9%8A_1973> <http://schema.org/inLanguage> "ar" .
<https://ar.wikipedia.org/wiki/%D8%AA%D8%B5%D9%86%D9%8A%D9%81:%D8%B4%D8%B1%D9%83%D8%A7%D8%AA_%D8%B3%D9%88%D9%8A%D8%B3%D8%B1%D9%8A%D8%A9_%D8%A3%D8%B3%D8%B3%D8%AA_%D9%81%D9%8A_1973> <http://schema.org/name> "\u062A\u0635\u0646\u064A\u0641:\u0634\u0631\u0643\u0627\u062A \u0633\u0648\u064A\u0633\u0631\u064A\u0629 \u0623\u0633\u0633\u062A \u0641\u064A 1973"@ar .
<https://en.wikipedia.org/wiki/Category:Swiss_companies_established_in_1973> <http://schema.org/inLanguage> "en" .
<https://en.wikipedia.org/wiki/Category:Swiss_companies_established_in_1973> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Article> .
You'll notice that the files are partitioned by `context` and `subject`, and within each partition they're also sorted by `context` and `subject` (the `context` field isn't part of the output, though; one would get that from the source tables). So you may see, as in this example with the triples about Q99896811, data that is logically clustered together spanning the end of one file and the beginning of the next file in sequence.

TASK DETAIL
  https://phabricator.wikimedia.org/T350106
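To illustrate why one entity's triples can straddle two consecutive part files, here is a minimal Python sketch. It assumes range-style partitioning on the (`context`, `subject`) key followed by a sort within each partition. That is an assumption about the script's behaviour, not a copy of it. The rows, the boundary key, and the function names are all hypothetical toy values.

```python
from bisect import bisect_right

# Hypothetical toy rows: (context, subject, triple). The context is the
# entity a triple belongs to; it is used for placement but not written out.
ROWS = [
    ("Q3", "entity/Q3", "<Q3> <description> \"c\" ."),
    ("Q2", "statement/Q2-A", "<Q2-A> <rank> <NormalRank> ."),
    ("Q1", "entity/Q1", "<Q1> <description> \"a\" ."),
    ("Q2", "entity/Q2", "<Q2> <description> \"b\" ."),
    ("Q1", "statement/Q1-A", "<Q1-A> <rank> <NormalRank> ."),
    ("Q2", "entity/Q2", "<Q2> <version> \"1\" ."),
]

# Hypothetical split point between the two output files. A real job would
# derive such boundaries from the data; here it is fixed by hand.
BOUNDARIES = [("Q2", "statement/Q2-A")]


def range_partition(rows, boundaries):
    """Place each row by its (context, subject) key, then sort each partition."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        key = (row[0], row[1])
        # Keys below a boundary go left of it; equal or greater keys go right.
        parts[bisect_right(boundaries, key)].append(row)
    for part in parts:
        part.sort(key=lambda r: (r[0], r[1]))
    return parts


def contexts_spanning_parts(parts):
    """Return the contexts whose rows ended up in more than one partition."""
    seen = {}
    for index, part in enumerate(parts):
        for context, _subject, _triple in part:
            seen.setdefault(context, set()).add(index)
    return sorted(c for c, indexes in seen.items() if len(indexes) > 1)


parts = range_partition(ROWS, BOUNDARIES)
for index, part in enumerate(parts):
    print(f"part-{index:05d}")
    for _context, _subject, triple in part:
        # Only the triple is emitted, mirroring output without a context column.
        print("  " + triple)
print("contexts spanning files:", contexts_spanning_parts(parts))
```

In this toy data the boundary falls inside context `Q2`, so its entity triples close the first partition and its statement triples open the second, which is the same shape as the Q99896811 sample.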