Dear all,

The new DataFrame API in Spark is extremely fast, but our cluster has limited RAM (~500GB). What is the best way to join such big tables? Any sample code is greatly welcome!

Best,
Rex