Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15248#discussion_r81126007
  
    --- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
    @@ -179,7 +180,11 @@ class HistoryServer(
       }
    --- End diff --
    
    OK, let me ask a different question -- when would you ever want to call 
getApplicationInfoList, given this problem? the other usages seem to just pick 
out a single element, so it's wasteful for the same reason. It just seems weird 
to have two methods for this.
    
    The existing method already only promises an Iterator which seems like the 
right thing to do. The problem is that it gets there by materializing a whole 
collection. Why is that unavoidable -- you can map() an iterator's elements 
without any evaluation of its elements or needing to first materialize the 
collection. 
    
    That's why I think getApplicationInfo is really what needs to change to 
provide something else like an Iterator, and that ends up being pretty easy -- 
FsHistoryProvider just has to give an iterator over applications.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to