To clear one thing up: the space taken up by data that Spark caches on disk is not related to YARN's "local resource" / "application cache" concept. The latter is a mechanism YARN provides for distributing files to worker nodes. The former is simply Spark using disk, which happens to be in a local directory that YARN gives it. Judging by its title, even if YARN-882 were resolved, it would do nothing to limit the amount of on-disk cache space Spark can use.
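To make that concrete, here is a minimal sketch of the kind of persist call that produces this disk usage (the app name and data are invented for illustration; the path in the comment matches the appcache directory Peter reported):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("persist-example"))
    // Partitions that do not fit in memory are written by Spark's block
    // manager under the YARN-provided local directories, e.g.
    //   /yarn/nm/usercache/<user>/appcache/<application_id>/blockmgr-*/
    // This is plain disk usage by Spark, not a YARN "local resource".
    val cached = sc.parallelize(1 to 1000000)
      .map(i => (i, i.toString * 100))           // inflate each record
      .persist(StorageLevel.MEMORY_AND_DISK_SER) // overflow spills to disk
    println(cached.count())                      // materializes the cache
    sc.stop()
  }
}
```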
-Sandy

On Mon, Jul 13, 2015 at 6:57 AM, Peter Rudenko <petro.rude...@gmail.com> wrote:

> Hi Andrew, here's what I found. Maybe it will be relevant for people with
> the same issue:
>
> 1) There are 3 types of local resources in YARN (public, private,
> application). More about it here:
> http://hortonworks.com/blog/management-of-application-dependencies-in-yarn/
>
> 2) The Spark cache is an application-type resource.
>
> 3) Currently it's not possible to specify a quota for application resources
> (https://issues.apache.org/jira/browse/YARN-882).
>
> 4) The only option is to specify these 2 settings:
>
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage -
> The maximum percentage of disk space utilization allowed, after which a
> disk is marked as bad. Values can range from 0.0 to 100.0. If the value is
> greater than or equal to 100, the nodemanager will check for a full disk.
> This applies to yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.
>
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb - The
> minimum space that must be available on a disk for it to be used. This
> applies to yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.
>
> 5) YARN's cache cleanup doesn't clean application resources:
> https://github.com/apache/hadoop/blob/8d58512d6e6d9fe93784a9de2af0056bcc316d96/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java#L511
>
> As I understand it, application resources are cleaned up when a Spark
> application terminates correctly (using sc.stop()). But in my case, once it
> filled all the disk space, it got stuck and couldn't stop correctly. After
> restarting YARN, I don't know how to easily trigger the cache cleanup other
> than manually on all the nodes.
>
> Thanks,
> Peter Rudenko
>
> On 2015-07-10 20:07, Andrew Or wrote:
>
> Hi Peter,
>
> AFAIK Spark assumes infinite disk space, so there isn't really a way to
> limit how much space it uses. Unfortunately I'm not aware of a simpler
> workaround than to simply provision your cluster with more disk space. By
> the way, are you sure that it's disk space that exceeded the limit, and not
> the number of inodes? If it's the latter, maybe you could control the
> ulimit of the container.
>
> To answer your other question: if it can't persist to disk then yes it
> will fail. It will only recompute from the data source if for some reason
> someone evicted our blocks from memory, but that shouldn't happen in your
> case since you're using MEMORY_AND_DISK_SER.
>
> -Andrew
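Peter's observation above, that the appcache is only removed when the application terminates cleanly via sc.stop(), suggests one defensive pattern. A sketch, where runWorkflow is a placeholder for the actual job:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CleanShutdown {
  def runWorkflow(sc: SparkContext): Unit = {
    // Placeholder for the actual ML workflow with its persist() calls.
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ml-workflow"))
    try {
      runWorkflow(sc)
    } finally {
      // Stop the context even when the job fails: per Peter's report, YARN
      // removes .../usercache/<user>/appcache/<application_id> only after a
      // clean shutdown, which is exactly what did not happen once the disks
      // filled up.
      sc.stop()
    }
  }
}
```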
> 2015-07-10 3:51 GMT-07:00 Peter Rudenko <petro.rude...@gmail.com>:
>
>> Hi, I have a Spark ML workflow. It uses some persist calls. When I
>> launch it with a 1 TB dataset, it brings down the whole cluster because
>> it fills all the disk space at /yarn/nm/usercache/root/appcache:
>> http://i.imgur.com/qvRUrOp.png
>>
>> I found a YARN setting:
>> yarn.nodemanager.localizer.cache.target-size-mb - Target size of the
>> localizer cache in MB, per nodemanager. It is a target retention size
>> that only includes resources with PUBLIC and PRIVATE visibility and
>> excludes resources with APPLICATION visibility.
>>
>> But it excludes resources with APPLICATION visibility, and the Spark
>> cache, as I understand it, is of the APPLICATION type.
>>
>> Is it possible to restrict disk space for a Spark application? Will
>> Spark fail if it isn't able to persist to disk
>> (StorageLevel.MEMORY_AND_DISK_SER), or will it recompute from the data
>> source?
>>
>> Thanks,
>> Peter Rudenko
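For reference, a small sketch of how to inspect the NodeManager settings discussed in this thread through the Hadoop client API (this assumes the cluster's yarn-site.xml is on the classpath; note these are cluster-wide NodeManager settings configured in yarn-site.xml, not per-application knobs):

```scala
import org.apache.hadoop.yarn.conf.YarnConfiguration

object NodeManagerDiskSettings {
  def main(args: Array[String]): Unit = {
    // Loads yarn-default.xml / yarn-site.xml from the classpath.
    val conf = new YarnConfiguration()

    // Disks above this utilization are marked bad and taken out of service.
    println(conf.get(
      "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage"))

    // Minimum free space (MB) a disk must have to remain usable.
    println(conf.get(
      "yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb"))

    // Localizer cache retention target; covers only PUBLIC and PRIVATE
    // resources, so it puts no bound on Spark's on-disk data.
    println(conf.get("yarn.nodemanager.localizer.cache.target-size-mb"))
  }
}
```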