dcausse added a comment.

  1. Run hdfs-rsync directly from the blazegraph hosts
    - cons: requires installing its dependencies
    - cons: opens a hole between blazegraph and the hadoop cluster
  2. Schedule hdfs-rsync on a stat machine copying the ttl dumps from hdfs to 
`/srv/analytics-search/wikibase_processed_dumps/wikidata/$SNAPSHOT`
    - cons: consumes some space on a stat machine
  3. Run hdfs-rsync on-demand to copy the ttl dump from hdfs to 
`/srv/analytics-search/wikibase_processed_dumps/temp` and cleanup this folder 
once done
    - cons: slows down the process a bit
  
  I was planning on doing option 3, any objections to this approach?
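
  A minimal sketch of what option 3 could look like, with plenty of assumptions: the default `TEMP_DIR` path is taken from the comment above, while `COPY_CMD` is a hypothetical stand-in for the actual hdfs-rsync invocation (shown here as plain `hdfs dfs -get`, whose exact flags for the real job are not reproduced):

```shell
#!/usr/bin/env bash
# Sketch of option 3: on-demand copy of the ttl dump out of hdfs into a
# temp folder, run the consumer against it, then clean the folder up.
# Paths and the copy command are assumptions, not the actual job config.
set -euo pipefail

TEMP_DIR="${TEMP_DIR:-/srv/analytics-search/wikibase_processed_dumps/temp}"
# Hypothetical stand-in for the real hdfs-rsync call.
COPY_CMD="${COPY_CMD:-hdfs dfs -get}"

fetch_and_cleanup() {
    local src="$1"; shift          # source path of the ttl dump in hdfs
    mkdir -p "$TEMP_DIR"
    $COPY_CMD "$src" "$TEMP_DIR/"  # on-demand copy out of hdfs
    "$@" "$TEMP_DIR"               # run the consuming process on the copy
    rm -rf "$TEMP_DIR"             # cleanup once done
}
```

  The temp folder only exists for the duration of one run, which is where the "slows down the process a bit" trade-off comes from: each invocation pays for the copy, but no space stays allocated between runs.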

TASK DETAIL
  https://phabricator.wikimedia.org/T349069
