JAllemandou added a comment.

  Chiming in: I suggest using Spark for the investigation. Given the size of the
dataset, parallel computation should help. This means another hop for the data:
--> stat1004 --> HDFS. Please ping me if you want or need help :)

TASK DETAIL
  https://phabricator.wikimedia.org/T239898
