Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/1391#issuecomment-48835727

Yes, of course; lots of settings' best or even usable values are ultimately app-specific. Ideally, the defaults work for most cases. A flat value is the simplest of models, and anecdotally, the current default does not work for medium- to large-memory YARN jobs. You can increase the default, but then the overhead gets silly for small jobs -- 1GB? And all of these are not-uncommon use cases.

None of that implies the overhead logically scales with container memory. Empirically, it may do, and that's useful. Until the magic explanatory variable is found, which is less problematic for end users: a flat constant that frequently has to be tuned, or an imperfect model that could get it right in more cases?

That said, it is kind of a developer API change, and feels like something not to keep reimagining. Nishkam, can you share any anecdotal evidence about how the overhead changes? If executor memory is the only variable changing, that seems like evidence against the overhead being driven by other factors -- but I don't know if that's what we know.
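For concreteness, here is a minimal Scala sketch of the two models being weighed -- a flat constant versus an overhead that scales with executor memory, floored at the old flat value. The names, the 7% factor, and the 384 MB floor are illustrative assumptions for this sketch, not necessarily this PR's actual constants:

```scala
// Sketch of the two overhead models under discussion. The constant names
// and values here are assumptions for illustration, not the PR's code.
object MemoryOverheadSketch {
  // Flat model: a single constant overhead (in MB) regardless of job size.
  val FlatOverheadMb = 384

  // Scaled model: overhead grows with executor memory, with a floor so
  // small jobs keep today's behavior. 0.07 is an assumed fraction.
  val MemoryOverheadFactor = 0.07
  val MemoryOverheadMinMb = 384

  def flatOverhead(executorMemoryMb: Int): Int = FlatOverheadMb

  def scaledOverhead(executorMemoryMb: Int): Int =
    math.max((executorMemoryMb * MemoryOverheadFactor).toInt, MemoryOverheadMinMb)
}

// Example: a 20 GB executor gets ~1.4 GB of overhead under the scaled model
// (20480 * 0.07 = 1433 MB), versus a flat 384 MB that anecdotally falls
// short on large containers; a 1 GB executor stays at the 384 MB floor.
```

The trade-off the comment describes is visible in the floor: small jobs are unaffected, while large containers get proportionally more headroom without the user having to tune anything.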