Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/4525#issuecomment-73993809
  
    So, I'm all for the feature, but I'm not sold on the approach here. It 
makes the code a little confusing, since you're trying to keep the code working 
in two different modes: "lazy" for the startup check, and "synchronous" for the 
subsequent checks.
    
    Instead, why not always do it lazily? Have a thread pool with a few worker 
threads, and have `checkForLogs` feed requests for parsing logs to that pool. 
`checkForLogs` will only list the files, regardless of when it's executed 
(startup vs. not).
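
A minimal sketch of that always-lazy shape, under stated assumptions: `LazyLogChecker`, `AppInfo`, `listLogs`, and `parse` are hypothetical names for illustration only, not the actual history provider code. Only `checkForLogs` comes from the discussion above. Here it does nothing but list files and enqueue parse requests on a fixed-size pool:

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}

// Hypothetical stand-in for the parsed application info.
final case class AppInfo(logPath: String, sizeBytes: Long)

// Hypothetical wrapper: listLogs and parse are injected so the sketch stays self-contained.
class LazyLogChecker(
    listLogs: () => Seq[String],
    parse: String => AppInfo,
    numThreads: Int = 4) {

  private val pool = Executors.newFixedThreadPool(numThreads)

  // Parsed results, filled in by worker threads as each log finishes.
  private val apps = new ConcurrentHashMap[String, AppInfo]()

  // Paths already handed to the pool, so repeated checks do not re-submit them.
  private val submitted = ConcurrentHashMap.newKeySet[String]()

  /** Lists files and enqueues parse requests; never parses inline. */
  def checkForLogs(): Unit = {
    listLogs().foreach { path =>
      if (submitted.add(path)) {
        pool.execute(new Runnable {
          override def run(): Unit = apps.put(path, parse(path))
        })
      }
    }
  }

  /** Returns the parsed info for a path, if a worker has finished it. */
  def parsed(path: String): Option[AppInfo] = Option(apps.get(path))

  def parsedCount: Int = apps.size()

  /** Stops accepting new work and waits for queued parses to finish. */
  def awaitIdle(timeoutSeconds: Long): Boolean = {
    pool.shutdown()
    pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)
  }
}
```

With this shape the startup check and every later check run the same code path, and callers see entries appear as workers finish. A real version would also need to handle parse failures and logs that change after they were first submitted, which this sketch ignores.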
    
    That seems easier to understand, at least to me. It would also speed up 
all subsequent checks (not just the initial one) and simplify the code a lot 
(for one, there would be no need for separate types for lazy vs. non-lazy app 
info).
    
    What do you think?

