jin xing created SPARK-21270:
--------------------------------

             Summary: Improvement for memory config.
                 Key: SPARK-21270
                 URL: https://issues.apache.org/jira/browse/SPARK-21270
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.1.1
            Reporter: jin xing


1. For executor memory, we have {{spark.executor.memory}} for heap size and 
{{spark.memory.offHeap.size}} for off-heap size; together these two make up 
the total memory consumption of each executor process.
From the user's side, what they care about is the total memory consumption, 
regardless of whether it is on-heap or off-heap (see the sketch below). It 
seems friendlier to expose only one memory config to the user.
Can we merge the two configs into one and hide the complexity inside the 
internal system?
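
A minimal sketch of what users face today (the values are illustrative only, 
not recommendations):

{code:scala}
import org.apache.spark.SparkConf

// Today: two separate knobs for executor memory.
val conf = new SparkConf()
  .set("spark.executor.memory", "4g")           // on-heap size
  .set("spark.memory.offHeap.enabled", "true")
  .set("spark.memory.offHeap.size", "2g")       // off-heap size

// What the user actually has to reason about is the per-executor total,
// 4g + 2g = 6g, which is what the cluster manager must provision
// (plus overhead, e.g. spark.yarn.executor.memoryOverhead on YARN).
{code}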
2. {{spark.memory.offHeap.size}} was originally designed for the 
{{MemoryManager}}, which manages off-heap memory explicitly allocated by Spark 
itself when creating its own buffers / pages or caching blocks; it does not 
account for off-heap memory used by lower-level code or third-party libraries, 
for example Netty. But {{spark.memory.offHeap.size}} and 
{{spark.memory.offHeap.enabled}} are more or less confusing. Sometimes users 
ask -- "I've already set {{spark.memory.offHeap.enabled}} to false, so why is 
Netty reading remote blocks into off-heap memory?" (see the sketch below). 
Also, I think we should document {{spark.memory.offHeap.size}} and 
{{spark.memory.offHeap.enabled}} more thoroughly on 
http://spark.apache.org/docs/latest/configuration.html
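
To illustrate the confusion, a sketch of the settings involved; note that 
{{spark.shuffle.io.preferDirectBufs}} is an existing, separate knob on the 
network layer, not part of {{spark.memory.offHeap.*}}:

{code:scala}
import org.apache.spark.SparkConf

// spark.memory.offHeap.* only governs memory handed out by Spark's
// MemoryManager (Tungsten pages, off-heap block cache).
val conf = new SparkConf()
  .set("spark.memory.offHeap.enabled", "false") // MemoryManager stays on-heap
  // ...yet Netty may still allocate direct (off-heap) buffers when
  // reading remote blocks, unless told to prefer heap buffers:
  .set("spark.shuffle.io.preferDirectBufs", "false")
{code}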


