[GitHub] spark issue #22926: [SPARK-25917][Spark UI] memoryMetrics should be Json ign...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22926

Thanks @vanzin. I was using 2.3, and after your comment I found a check-in from about one month ago that already handles this case. I will close this PR. Sorry for the misreport; I will remember to test against trunk before reporting next time.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22926: [SPARK-25917][Spark UI] memoryMetrics should be J...
Github user jianjianjiao closed the pull request at: https://github.com/apache/spark/pull/22926
[GitHub] spark issue #22926: [SPARK-25917][Spark UI] memoryMetrics should be Json ign...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22926

@AmplabJenkins Could you please find someone to review this? I believe this is a bug in the Spark UI. Thanks.
[GitHub] spark issue #22926: [SPARK-25917][Spark UI] memoryMetrics should be Json ign...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22926

@mccheah @smurakozi @vanzin Could you please help take a look at this PR? Thanks.
[GitHub] spark pull request #22926: [SPARK-25917][Spark UI] memoryMetrics should be J...
GitHub user jianjianjiao opened a pull request: https://github.com/apache/spark/pull/22926

[SPARK-25917][Spark UI] memoryMetrics should be Json ignored when being none

## What changes were proposed in this pull request?

Spark UI's executors page loads forever when memoryMetrics is None. The fix is to JSON-ignore memoryMetrics when it is None.

The page hangs because of this code at line 268 of executorspage.js:

    exec.memoryMetrics = exec.hasOwnProperty('memoryMetrics') ? exec.memoryMetrics : memoryMetrics;

## How was this patch tested?

Before the fix (loads forever): ![image](https://user-images.githubusercontent.com/1785565/47875681-64dfe480-ddd4-11e8-8d15-5ed1457bc24f.png)

After the fix: ![image](https://user-images.githubusercontent.com/1785565/47875691-6b6e5c00-ddd4-11e8-9895-db8dd9730ee1.png)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jianjianjiao/spark users/rotang/FixExecutorsPageLoadingForever

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22926.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22926

commit 5826424c931fbba81cc246c3b1afe3f64626e051
Author: Rong Tang
Date: 2018-11-01T19:37:45Z

mmemoryMetrics should not json ignored when being none
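The general pattern behind such a fix is to omit an optional field from the serialized JSON entirely, rather than emitting it as null, so that the client-side hasOwnProperty check can fall back to a default. A minimal sketch in plain Java (an illustration only, not Spark's actual serializer; the field names mirror the PR but the class is hypothetical):

```java
import java.util.Optional;

public class ExecutorSummarySketch {
    // Hypothetical stand-in for Spark's memoryMetrics structure.
    static final class MemoryMetrics {
        final long usedOnHeap;
        MemoryMetrics(long usedOnHeap) { this.usedOnHeap = usedOnHeap; }
    }

    final String id;
    final Optional<MemoryMetrics> memoryMetrics;

    ExecutorSummarySketch(String id, Optional<MemoryMetrics> memoryMetrics) {
        this.id = id;
        this.memoryMetrics = memoryMetrics;
    }

    // Emit the field only when present; an absent value leaves the key out
    // entirely instead of producing "memoryMetrics": null, so that a client
    // doing hasOwnProperty('memoryMetrics') correctly falls back to defaults.
    String toJson() {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"id\":\"").append(id).append("\"");
        memoryMetrics.ifPresent(m ->
            sb.append(",\"memoryMetrics\":{\"usedOnHeap\":")
              .append(m.usedOnHeap).append("}"));
        sb.append("}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Absent metrics: the key is omitted entirely.
        System.out.println(new ExecutorSummarySketch("1", Optional.empty()).toJson());
        // Present metrics: the key is serialized as usual.
        System.out.println(new ExecutorSummarySketch("2",
            Optional.of(new MemoryMetrics(1024))).toJson());
    }
}
```

In the real Spark codebase this kind of behavior is typically expressed with Jackson annotations on the model class rather than hand-built strings; the sketch above just isolates the present-vs-absent distinction that the PR relies on.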
[GitHub] spark issue #22520: [SPARK-25509][Core]Windows doesn't support POSIX permiss...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22520

@srowen that makes sense, I will be more patient next time. ^_^
[GitHub] spark issue #22520: [SPARK-25509][Core]Windows doesn't support POSIX permiss...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22520

@srowen @vanzin tests passed. What should I do now to get this approved and merged?
[GitHub] spark issue #22520: [SPARK-25509][Core]Windows doesn't support POSIX permiss...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22520

@srowen Thanks for the confirmation. I have pushed a new iteration. Could you please authorize testing on it?
[GitHub] spark issue #22520: [SPARK-25509][Core]Windows doesn't support POSIX permiss...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22520

@srowen thanks for reviewing this PR and for your comments.

1. I have fixed the coding style, thanks.
2. These are the only two places that use PosixFilePermissions for file operations. In fact, the approach used for Windows (first create the directories, then chmod 700) may be fine for both Windows and other OSes.
[GitHub] spark issue #22444: [SPARK-25409][Core]Speed up Spark History loading via in...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22444

@squito Yes, you are correct. I was trying to make applications that run during a scan get picked up more quickly. It turns out SPARK-6951 already does a great job of achieving this.
[GitHub] spark issue #22444: [SPARK-25409][Core]Speed up Spark History loading via in...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22444

@vanzin Thanks a lot for your suggestions. Loading event logs is now much faster: from more than 2.5 hours down to 19 minutes for 17K event logs, some of them larger than 10 GB.

1. Enabled SHS V2 caching on disk. We are running on Windows, and there is a small "posix.permissions not supported in windows" issue; I created a new PR here https://github.com/apache/spark/pull/22520, could you please take a look? This change does not speed up loading very much, but it improves other parts.
2. Tried 2.4, and also tried applying SPARK-6951 to 2.3; this is the critical part for improving the speed.

I will close this PR, as it is no longer needed. Thanks again.
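For context, enabling the SHS V2 on-disk store mentioned in point 1 is a configuration change on the history server. A minimal sketch of the relevant spark-defaults.conf entries (the paths here are arbitrary examples, not values from this thread):

```
# Local directory where the history server caches application data (enables SHS V2 disk store).
spark.history.store.path        /var/spark/shs-cache

# Directory containing the event logs to load.
spark.history.fs.logDirectory   hdfs:///spark-history
```

With spark.history.store.path unset, the history server keeps application data in memory only and must re-parse event logs after a restart.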
[GitHub] spark pull request #22444: [SPARK-25409][Core]Speed up Spark History loading...
Github user jianjianjiao closed the pull request at: https://github.com/apache/spark/pull/22444
[GitHub] spark pull request #22520: [SPARK-25509][Core]Windows doesn't support POSIX ...
GitHub user jianjianjiao opened a pull request: https://github.com/apache/spark/pull/22520

[SPARK-25509][Core]Windows doesn't support POSIX permissions

## What changes were proposed in this pull request?

SHS V2 cannot be enabled on Windows, because Windows does not support POSIX permissions.

## How was this patch tested?

Without this fix, the test case org.apache.spark.deploy.history.HistoryServerDiskManagerSuite test("leasing space") fails on Windows.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jianjianjiao/spark FixWindowsPermssionsIssue

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22520.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22520

commit fe74feeef42fc6fb6fb5f5e869e23b349f3a1697
Author: Rong Tang
Date: 2018-09-21T17:07:44Z

Windows doesn't support Posix permissions
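The portable pattern discussed in this thread is to create the directory first and then restrict permissions through the java.io.File setters, instead of passing PosixFilePermissions attributes at creation time, which throws UnsupportedOperationException on Windows. A simplified sketch (an illustration under that assumption, not Spark's exact code):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PortablePermissions {
    // Create a directory restricted to the owner (roughly chmod 700),
    // using only APIs that work on both POSIX systems and Windows.
    public static Path createPrivateDir(Path path) throws IOException {
        Files.createDirectories(path);
        File f = path.toFile();
        // First remove access for everyone, then re-grant it to the owner only.
        f.setReadable(false, false);
        f.setWritable(false, false);
        f.setExecutable(false, false);
        f.setReadable(true, true);
        f.setWritable(true, true);
        f.setExecutable(true, true);
        return path;
    }

    public static void main(String[] args) throws IOException {
        Path dir = createPrivateDir(
            Files.createTempDirectory("shs-demo").resolve("store"));
        System.out.println(dir + " exists: " + Files.isDirectory(dir));
    }
}
```

By contrast, Files.createDirectory(path, PosixFilePermissions.asFileAttribute(perms)) fails outright on a non-POSIX filesystem, which is why the create-then-chmod order matters here.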
[GitHub] spark pull request #22444: [SPARK-25409][Core]Speed up Spark History loading...
Github user jianjianjiao commented on a diff in the pull request: https://github.com/apache/spark/pull/22444#discussion_r218292773

--- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -465,20 +475,31 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
        }
      } catch {
        case _: NoSuchElementException =>
-         // If the file is currently not being tracked by the SHS, add an entry for it and try
-         // to parse it. This will allow the cleaner code to detect the file as stale later on
-         // if it was not possible to parse it.
-         listing.write(LogInfo(entry.getPath().toString(), newLastScanTime, None, None,
-           entry.getLen()))
--- End diff --

Hi @squito, thanks for looking into this PR. When the Spark history server starts, it scans the event-log folder and handles the entries with multiple threads, and it will not start the next scan before the first one finishes. That is the problem: in our cluster there are about 20K event-log files (often bigger than 1 GB), including about 1K .inprogress files, and the first scan takes about two and a half hours. During those 2.5 hours, if a user submits a Spark application and it finishes, the user cannot find it in the Spark history UI and has to wait for the next scan.

That is why I added a limit on how much to scan each time, e.g. 3K. No matter how many log files are in the event-logs folder, the server first scans and handles the first 3K, then starts the second scan. Suppose that during the first scan 5 applications were scanned and another 10 applications were updated; the second scan will then handle these 15 applications plus another 2885 files (from 3001 to 5885) in the event folder. checkForLogs scans the event-log folder and only handles files that are new or have been updated.
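The batching idea described above, where each scan pass handles at most a fixed number of new or updated files and defers the rest to the next pass, can be sketched as follows (a standalone illustration with hypothetical names, not the code from this PR):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of incremental scanning: each pass handles at most
// `batchLimit` entries, picking only files that are new or whose
// modification time has changed since they were last processed.
public class IncrementalScanner {
    private final Map<String, Long> processed = new HashMap<>(); // path -> last seen mtime
    private final int batchLimit;

    public IncrementalScanner(int batchLimit) {
        this.batchLimit = batchLimit;
    }

    /** One scan pass over a (path -> mtime) listing; returns the paths handled. */
    public List<String> scan(Map<String, Long> listing) {
        List<String> handled = new ArrayList<>();
        for (Map.Entry<String, Long> e : listing.entrySet()) {
            if (handled.size() >= batchLimit) {
                break; // defer the rest of the listing to the next pass
            }
            Long seen = processed.get(e.getKey());
            if (seen == null || !seen.equals(e.getValue())) { // new or updated file
                processed.put(e.getKey(), e.getValue());
                handled.add(e.getKey());
            }
        }
        return handled;
    }
}
```

The key property is that a pass is bounded by batchLimit rather than by the size of the folder, so recently finished applications become visible after at most one short pass instead of one full multi-hour scan.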
[GitHub] spark issue #22444: [SPARK-25409][Core]Speed up Spark History loading via in...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22444

Adding @vanzin @steveloughran @squito, who made changes to the related code.
[GitHub] spark pull request #22444: implement incremental loading and add a flag to l...
GitHub user jianjianjiao opened a pull request: https://github.com/apache/spark/pull/22444

implement incremental loading and add a flag to load incomplete or not

## What changes were proposed in this pull request?

1. Instead of loading all event logs on every pass, load only a certain number of them. If there are tens of thousands of event logs, loading all of them takes a long time.
2. When Spark runs on YARN, Spark job information can be obtained from the YARN application master, so there is no need to load incomplete applications; this adds a flag to skip them.

## How was this patch tested?

This was tested manually in our production cluster.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jianjianjiao/spark speedUpSparkHistoryLoading

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22444.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22444

commit 1190ffcb109025bd62c909059b0cf16e6a748de9
Author: Rong Tang
Date: 2018-09-17T22:00:23Z

implement incremental loading and add a flag to load incomplete or not