You should aggregate your files into larger chunks before doing anything
else. HDFS is not suited to small files: they bloat the NameNode's
metadata and cause a lot of performance issues. Target a partition size
of a few hundred MB, save those files back to HDFS, and then delete the
original ones. You can
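
To make the compaction advice concrete, here is a minimal sketch in PySpark. It assumes plain text input, an existing SparkSession passed in as `spark`, and a total input size you have measured yourself (for example with `hdfs dfs -du -s`). The function names, the paths and the 256 MB default are only illustrative, not something from this thread.

```python
import math


def target_partitions(total_bytes, target_mb=256):
    """Number of output files needed so each one is roughly target_mb in size."""
    target_bytes = target_mb * 1024 * 1024
    return max(1, math.ceil(total_bytes / target_bytes))


def compact(spark, src, dst, total_bytes, target_mb=256):
    """Rewrite many small text files under src as fewer, larger files under dst.

    spark is an existing SparkSession; src and dst are hypothetical HDFS paths.
    """
    n = target_partitions(total_bytes, target_mb)
    # coalesce() merges existing partitions without a full shuffle,
    # so each output file ends up near total_bytes / n in size.
    spark.read.text(src).coalesce(n).write.text(dst)
```

For about 1 GB of input, `target_partitions(1024**3)` gives 4 output files. Delete the originals only after checking that the compacted copy is complete.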
Hi, all Spark applications are recorded in the Spark History Server.
Look at your host on port 18080 instead of 4040.
François
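
If the History Server is running, it also exposes a REST API on the same port, which is a quick way to check that finished applications are visible. Below is a small sketch using only the Python standard library. The host name is an assumption, and 18080 is just the default port, so your cluster may use another one. Note also that applications only show up there if event logging was enabled when they ran (`spark.eventLog.enabled` set to `true`).

```python
import json
from urllib.request import urlopen


def applications_url(host, port=18080):
    """Build the History Server REST endpoint that lists applications."""
    return f"http://{host}:{port}/api/v1/applications"


def list_applications(host, port=18080, timeout=5):
    """Return (id, name) pairs for the applications known to the History Server."""
    with urlopen(applications_url(host, port), timeout=timeout) as resp:
        return [(app["id"], app["name"]) for app in json.load(resp)]
```

Calling `list_applications("your-history-host")` raises a `URLError` if the port is unreachable, which usually means the History Server is not started or listens elsewhere.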
On 2015-08-07 15:26, saif.a.ell...@wellsfargo.com wrote:
Hi,
A silly question here. The Driver Web UI dies when the spark-submit
program finishes. I would like some time
François
On 2015-08-07 15:58, saif.a.ell...@wellsfargo.com wrote:
Hello, thank you, but that port is unreachable for me. Could you please
share where I can find the equivalent port in my environment?
Thank you
Saif
*From:*François Pelletier [mailto:newslett...@francoispelletier.org