dcausse added a comment.
1. Run hdfs-rsync directly from the Blazegraph hosts - cons: requires installing its dependencies and opens a hole between Blazegraph and the Hadoop cluster
2. Schedule hdfs-rsync on a stat machine, copying the ttl dumps from HDFS to `/srv/analytics-search/wikibase_processed_dumps/wikidata/$SNAPSHOT` - cons: consumes some space on a stat machine
3. Run hdfs-rsync on demand to copy the ttl dump from HDFS to `/srv/analytics-search/wikibase_processed_dumps/temp`, and clean up this folder once done - cons: slows the process down a bit

I was planning on going with option 3; any objections to this approach?

TASK DETAIL
https://phabricator.wikimedia.org/T349069
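For concreteness, option 3 could look roughly like the sketch below. This is a minimal, hypothetical illustration: the `hdfs-rsync` flags, the HDFS source layout, and the scratch-directory handling are assumptions, not the actual production values.

```shell
#!/bin/bash
# Rough sketch of option 3: copy one snapshot's ttl dump out of HDFS on
# demand, consume it, then clean up. The hdfs-rsync invocation and the
# HDFS source path are assumptions, not the real production configuration.
set -euo pipefail

SNAPSHOT="${1:-20231030}"                              # e.g. 20231030
HDFS_SRC="hdfs:///wmf/data/wikidata/ttl/${SNAPSHOT}/"  # assumed HDFS layout
# The production target would be /srv/analytics-search/wikibase_processed_dumps/temp;
# this sketch uses a scratch directory so it can run anywhere.
LOCAL_TMP="$(mktemp -d)"

# Copy the dump out of HDFS only if hdfs-rsync is actually installed.
if command -v hdfs-rsync >/dev/null 2>&1; then
    hdfs-rsync --recursive "${HDFS_SRC}" "file://${LOCAL_TMP}/"
else
    echo "hdfs-rsync not available; skipping copy" >&2
fi

# ... consume the ttl dump here (e.g. feed it to the loading job) ...

# Clean up once done, so no disk space stays consumed (the stated
# downside of option 2).
rm -rf "${LOCAL_TMP:?}"
echo "cleaned up snapshot ${SNAPSHOT}"
```

The cleanup step is what distinguishes this from option 2: the local copy exists only for the duration of the job, trading a bit of extra runtime for not permanently occupying space on a stat machine.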