Github user jianjianjiao commented on the issue:
https://github.com/apache/spark/pull/22444
@squito Yes, you are correct. I was trying to make applications that are still
running during the scan get picked up more quickly. It turns out SPARK-6951 has
already done a great job of achieving this.
Github user jianjianjiao commented on the issue:
https://github.com/apache/spark/pull/22444
@vanzin Thanks a lot for your suggestions. Loading event logs has become much
faster: from more than 2.5 hours down to 19 minutes for 17K event logs, some of
which are larger than 10 GB.
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/22444
> so any server restart results in hours of downtime, just from scanning.
Well, that's why 2.3 supports caching things on disk. Also, 2.4 has
SPARK-6951 which should make this a lot faster ev
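[Editor's note: for readers following the thread, the on-disk caching in Spark 2.3 that vanzin refers to is enabled through the History Server's `spark.history.store.path` setting, which persists the parsed application listing to a local store so a restart does not re-parse every event log. A minimal sketch of the relevant `spark-defaults.conf` entries; the paths below are placeholder assumptions, not values from this thread:]

```properties
# Where the History Server reads event logs from (hypothetical path).
spark.history.fs.logDirectory   hdfs:///spark-logs

# Spark 2.3+: cache parsed listing/UI data on local disk (hypothetical path),
# so restarting the History Server avoids a full rescan of all event logs.
spark.history.store.path        /var/lib/spark/history-store
```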
Github user squito commented on the issue:
https://github.com/apache/spark/pull/22444
> history server startup needs to go through all these logs before being
usable, so any server restart results in hours of downtime, just from scanning.
I don't think this is true. The first
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/22444
I see the reasoning here
* @jianjianjiao has a very large cluster with many thousands of history
files of past (successful) jobs.
* history server startup needs to go through all t
Github user jianjianjiao commented on the issue:
https://github.com/apache/spark/pull/22444
Add @vanzin @steveloughran @squito who made changes to related code.