dr0ptp4kt added a subscriber: EBernhardson.
dr0ptp4kt added a comment.

  Adding a note so I don't forget: advice from @BTullis is to avoid NFS if 
possible, and advice from @JAllemandou is to consider use of `hdfs-rsync` 
(after our call I sought this out and found these: 
https://gerrit.wikimedia.org/r/plugins/gitiles/analytics/refinery/+/refs/heads/master/python/refinery/hdfs.py
 and 
https://gerrit.wikimedia.org/g/analytics/hdfs-tools/deploy/+/2445aec92f6b3d409531fb74ab3f9a22d9716823/bin/hdfs-rsync
 and 
https://gerrit.wikimedia.org/r/plugins/gitiles/analytics/refinery/+/refs/heads/master/bin/hdfs-rsync
 ). Chances are we'd need to add a ferm rule and possibly wire up some Kerberos 
stuff on the WDQS servers if going the hdfs-rsync route.
  
  During a Meet today, @EBernhardson and I were discussing with the group the 
possible use of a mechanism similar to 
https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/search/shared/transfer_to_es.py?ref_type=heads#L74-83
 and 
https://gitlab.wikimedia.org/repos/search-platform/mjolnir/-/blob/main/mjolnir/kafka/bulk_daemon.py?ref_type=heads
 where a file is moved to Swift via Airflow and Mjolnir client code listens for 
Kafka events carrying the URLs from which to fetch the produced files (I haven't 
read this code closely yet, just parroting what I think I heard).
  
  We'll likely need to do these data transfers more than once, so it'll be good 
to have some level of automation support.

TASK DETAIL
  https://phabricator.wikimedia.org/T350106

To: dr0ptp4kt
Cc: EBernhardson, Aklapper, BTullis, bking, dr0ptp4kt, JAllemandou, dcausse, 
Danny_Benjafield_WMDE, Astuthiodit_1, AWesterinen, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331