[ https://issues.apache.org/jira/browse/SPARK-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-4539:
-----------------------------
    Component/s: Spark Core

> History Server counts "incomplete" applications against the "retainedApplications" total, fails to show eligible "completed" applications
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4539
>                 URL: https://issues.apache.org/jira/browse/SPARK-4539
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Ryan Williams
>
> I have observed the history server return 0 or 1 applications from a
> directory that contains many complete and incomplete applications (the latter
> being application directories that are missing the {{APPLICATION_COMPLETE}}
> file).
>
> Without having dug too much, my theory is that HistoryServer is seeing the
> "incomplete" directories and counting them against the
> {{retainedApplications}} maximum but not displaying them.
>
> One supporting anecdote for this is that I loaded HS against a directory that
> had one complete application and nothing else, and HS worked as expected (I
> saw the one application in the web UI).
>
> I then copied ~100 other application directories in, the majority of which
> were "incomplete" (in particular, most of the ones that had the earliest
> timestamps), and still only saw the one original completed application via
> the web UI.
>
> Finally, I restarted the same server with {{retainedApplications}} set to
> 1000 (instead of 50; the directory at this point had ~10 completed
> applications and 90 incomplete ones), and saw exactly the completed
> applications, leading me to believe that they were being "boxed out" of the
> maximum-50-retained-applications iteration of the history server.
>
> Silently failing on "incomplete" directories while still docking the count,
> if that is indeed what is happening, is a pretty confusing failure mode.
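
The theory in the description can be illustrated with a small, self-contained sketch. This is not the HistoryServer code; it is a toy model in Python that assumes the server caps the listing at {{retainedApplications}} before dropping "incomplete" entries. The names AppDir, visible_cap_then_filter and visible_filter_then_cap are hypothetical, and the 90/10 split simply mirrors the approximate numbers in the report.

```python
from dataclasses import dataclass


@dataclass
class AppDir:
    """Toy stand-in for one application directory in the log directory."""
    name: str
    complete: bool  # True if the APPLICATION_COMPLETE marker file is present


def visible_cap_then_filter(apps, retained):
    # Hypothesized behavior: apply the retention cap first, then hide
    # incomplete applications. Incomplete entries still use up cap slots.
    return [a.name for a in apps[:retained] if a.complete]


def visible_filter_then_cap(apps, retained):
    # Alternative ordering: hide incomplete applications first, then cap.
    return [a.name for a in apps if a.complete][:retained]


# Assumed layout: 90 incomplete apps with the earliest timestamps,
# followed by 10 completed ones.
apps = [AppDir(f"app-{i:03d}", complete=False) for i in range(90)]
apps += [AppDir(f"app-{i:03d}", complete=True) for i in range(90, 100)]

print(len(visible_cap_then_filter(apps, 50)))    # cap of 50
print(len(visible_cap_then_filter(apps, 1000)))  # cap of 1000
print(len(visible_filter_then_cap(apps, 50)))    # cap of 50, filter first
```

Under these assumptions the cap-then-filter ordering shows no applications at a cap of 50 and all ten completed ones at a cap of 1000, which is consistent with what the reporter saw. Filtering first shows all ten at either setting.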