[ 
https://issues.apache.org/jira/browse/SPARK-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-4539:
-----------------------------
    Component/s: Spark Core

> History Server counts "incomplete" applications against the 
> "retainedApplications" total, fails to show eligible "completed" applications
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4539
>                 URL: https://issues.apache.org/jira/browse/SPARK-4539
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Ryan Williams
>
> I have observed the history server to return 0 or 1 applications from a 
> directory that contains many complete and incomplete applications (the latter 
> being application directories that are missing the {{APPLICATION_COMPLETE}} 
> file).
> Without having dug too much, my theory is that HistoryServer is seeing the 
> "incomplete" directories and counting them against the 
> {{retainedApplications}} maximum but not displaying them.
> One supporting anecdote for this is that I loaded HS against a directory that 
> had one complete application and nothing else, and HS worked as expected (I 
> saw the one application in the web UI).
> I then copied ~100 other application directories in, the majority of which 
> were "incomplete" (in particular, most of the ones that had the earliest 
> timestamps), and still only saw the one original completed application via 
> the web UI.
> Finally, I restarted the same server with {{retainedApplications}} set to 
> 1000 (instead of 50; the directory at this point had ~10 completed 
> applications and 90 incomplete ones), and saw exactly the completed 
> applications, leading me to believe that they were being "boxed out" of the 
> maximum-50-retained-applications iteration of the history server.
> Silently failing on "incomplete" directories while still docking the count, 
> if that is indeed what is happening, is a pretty confusing failure mode.
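
The theory in the description can be illustrated with a small, self-contained
model. This is only a sketch of the suspected behavior, not the HistoryServer's
actual code: the AppDir type, both function names, and the oldest-first
ordering are assumptions made up for the illustration. It contrasts applying
the retainedApplications cap before dropping incomplete directories (the
suspected behavior) with dropping them first (the expected behavior), using
roughly the mix of directories described in the report.

```python
from dataclasses import dataclass


@dataclass
class AppDir:
    """Hypothetical stand-in for one application directory in the log dir."""
    name: str
    timestamp: int
    complete: bool  # True if the APPLICATION_COMPLETE marker file is present


def visible_apps_suspected(dirs, retained):
    """Suspected order: apply the cap first, then hide incomplete entries.

    Incomplete directories use up slots in the cap even though they are
    never displayed. Oldest-first ordering is an assumption of this sketch.
    """
    oldest_first = sorted(dirs, key=lambda d: d.timestamp)
    capped = oldest_first[:retained]
    return [d for d in capped if d.complete]


def visible_apps_expected(dirs, retained):
    """Expected order: hide incomplete entries first, then apply the cap."""
    oldest_first = sorted(dirs, key=lambda d: d.timestamp)
    complete_only = [d for d in oldest_first if d.complete]
    return complete_only[:retained]


# One original completed application, then 90 incomplete ones holding the
# next-earliest timestamps, then 9 more completed ones (100 directories).
dirs = (
    [AppDir("app-0", 0, True)]
    + [AppDir(f"app-{t}", t, False) for t in range(1, 91)]
    + [AppDir(f"app-{t}", t, True) for t in range(91, 100)]
)

print(len(visible_apps_suspected(dirs, 50)))    # 1
print(len(visible_apps_suspected(dirs, 1000)))  # 10
print(len(visible_apps_expected(dirs, 50)))     # 10
```

Under these assumptions the model reproduces both observations: with a cap of
50 only the original completed application is visible, and with a cap of 1000
all completed applications appear. Whether the real server orders or filters
directories this way would need to be confirmed against its source.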



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
