Re: Ahhhh... Spark creates >30000 partitions... What can I do?

2015-10-20 Thread François Pelletier
You should aggregate your files into larger chunks before doing anything else. HDFS is not suited to small files: they bloat it and cause a lot of performance issues. Target a partition size of a few hundred MB, save those files back to HDFS, and then delete the original ones. You can
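
The compaction advice above could be sketched roughly as follows. This is a minimal sketch, not the poster's own code: it assumes plain-text input, an already running SparkContext passed in as `sc`, and a total input size obtained separately (for example from `hdfs dfs -du -s`). The helper names `target_partitions` and `compact`, the 256 MB default, and the `src`/`dst` paths are all hypothetical.

```python
import math


def target_partitions(total_bytes: int, target_mb: int = 256) -> int:
    """Return how many partitions give roughly target_mb megabytes each."""
    target_bytes = target_mb * 1024 * 1024
    # Always keep at least one partition, even for empty or tiny inputs.
    return max(1, math.ceil(total_bytes / target_bytes))


def compact(sc, src: str, dst: str, total_bytes: int, target_mb: int = 256) -> int:
    """Rewrite the small text files under src as fewer, larger files under dst.

    sc is an existing SparkContext; src and dst are hypothetical HDFS paths.
    """
    n = target_partitions(total_bytes, target_mb)
    # coalesce() merges existing partitions without a full shuffle,
    # so >30000 tiny partitions become n larger ones.
    sc.textFile(src).coalesce(n).saveAsTextFile(dst)
    return n
```

With about 1 GiB of input and the 256 MB default, `target_partitions` yields 4 output files. The original small files would only be deleted after the compacted copy has been checked.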

Re: Spark master driver UI: How to keep it after process finished?

2015-08-07 Thread François Pelletier
Hi, all Spark processes are saved in the Spark History Server. Look at your host on port 18080 instead of 4040. François On 2015-08-07 15:26, saif.a.ell...@wellsfargo.com wrote: Hi, A silly question here. The Driver Web UI dies when the spark-submit program finishes. I would like some time
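
A quick way to see which of the two UIs mentioned above is actually listening is a plain TCP probe. The sketch below is only an illustration: the function names are hypothetical, and the ports are the Spark defaults (4040 for the driver UI, 18080 for the History Server), which a given cluster may have changed. Note also that the History Server only lists an application if event logging (`spark.eventLog.enabled`) was turned on when it ran and the History Server process has been started.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def reachable_spark_uis(host: str, ports=(4040, 18080)) -> dict:
    """Map each candidate UI port to whether it accepts connections."""
    return {port: port_open(host, port) for port in ports}
```

Calling `reachable_spark_uis("my-driver-host")` (a placeholder hostname) returns a dict such as `{4040: ..., 18080: ...}` with a boolean per port. A `False` for 18080 may mean the History Server is not running, is bound to another port, or is blocked by a firewall.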

Re: Spark master driver UI: How to keep it after process finished?

2015-08-07 Thread François Pelletier
François On 2015-08-07 15:58, saif.a.ell...@wellsfargo.com wrote: Hello, thank you, but that port is unreachable for me. Can you please share where I can find the equivalent of that port in my environment? Thank you, Saif *From:* François Pelletier [mailto:newslett...@francoispelletier.org