Hi,
Thank you both for your suggestions! These have been eye-openers for me.
Just to clarify, I need the counts for logging and auditing purposes;
otherwise I would exclude the step. I should have also mentioned that
while I am processing around 30 GB of raw data, the individual outputs are
relat
Ashley,
I want to suggest a few optimizations. The problem might go away, and at
the very least performance should improve.
The freezes could have many causes, so the Spark UI SQL pages and stage
detail pages would be useful for diagnosis. You can send them privately
if you wish.
1. the repartition(1) shoul
Any feedback please?
Thanks,
Debu
> On 13-Feb-2020, at 6:36 PM, Debabrata Ghosh wrote:
>
>
> Greetings all!
>
> I have got plenty of application directories lying around in sparkStaging, such
> as .sparkStaging/application_1580703507814_0074
>
> Would you please be able
Greetings all!
I have got plenty of application directories lying around in sparkStaging,
such as .sparkStaging/application_1580703507814_0074
Would you please advise me which variable I need to set in
spark-env.sh so that the sparkStaging application directories aren't preserved after
the ru