Hi Raúl,

First, most of the OS memory cache is used by the page cache
<https://en.wikipedia.org/wiki/Page_cache>, which the OS uses to cache
recent read/write I/O.

I think the OS memory cache should be understood from two different
perspectives. From the perspective of user space (e.g., a Spark
application), it is not used, since Spark does not allocate memory from
this part of memory. From the perspective of the OS, however, it is
actually used, because the memory pages are already allocated for caching
I/O pages. For each buffered I/O request, the OS normally allocates memory
pages to cache the data, expecting that the cached pages can be reused in
the near future. Think of opening a large file with vim/emacs: it is pretty
slow the first time you open it, but much faster if you close it and
immediately open it again, because the file was cached in the page cache
the first time you opened it.
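
To see the effect from the vim/emacs example yourself, here is a minimal
Python sketch that reads the same file twice and times both reads. The
helper names (timed_read, compare_reads) are just made up for illustration.
The second read is only faster if the file was not already cached before
the first read, so treat the numbers as a rough indication and not a
benchmark.

```python
import time


def timed_read(path, chunk_size=1 << 20):
    """Read the whole file in chunks; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def compare_reads(path):
    """Read the same file twice; return (size, first_seconds, second_seconds).

    If the file was cold, the first read goes to disk and the second one
    is usually served from the OS page cache.
    """
    size, first = timed_read(path)
    _, second = timed_read(path)
    return size, first, second
```

For example, calling compare_reads() on a large file you have not touched
recently and printing the two timings should show the difference.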

It is hard for Spark to use this part of memory, because it is managed by
the OS and is transparent to applications. The only thing you can do is
keep allocating (and using) memory from the OS (e.g., with malloc()); once
the OS senses memory pressure, it will reclaim page cache pages to satisfy
your memory allocation. Another point is that the memory available to Spark
is limited by the maximum JVM heap size, so memory requests from your Spark
application are actually handled by the JVM, not directly by the OS.
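
If you want to check how much memory is currently held by the page cache
(and so could be reclaimed by the OS under pressure), a rough sketch is to
parse /proc/meminfo. This assumes Linux; the function names are
hypothetical, and the "other_kb" figure is only a crude estimate of memory
that is neither free nor cache.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value.

    Most fields are reported in kB; lines without a numeric value are skipped.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def summarize(info):
    """Split total memory into free, page cache (Cached + Buffers) and the rest."""
    total = info.get("MemTotal", 0)
    free = info.get("MemFree", 0)
    page_cache = info.get("Cached", 0) + info.get("Buffers", 0)
    return {
        "total_kb": total,
        "free_kb": free,
        "page_cache_kb": page_cache,
        # Rough estimate only: everything that is neither free nor cache.
        "other_kb": total - free - page_cache,
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and summarize the live memory counters (Linux only)."""
    with open(path) as f:
        return summarize(parse_meminfo(f.read()))
```

On a Linux box, print(read_meminfo()) on a node that has been doing a lot
of file I/O will typically show a large page_cache_kb and a small free_kb,
which is the "used from the OS view, unused from the application view"
situation described above.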


Hope this helps!


Wei



