[ https://issues.apache.org/jira/browse/SPARK-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424242#comment-15424242 ]
Laurent Hoss commented on SPARK-12430:
--------------------------------------

In Spark 2.0 this should be less of an issue, at least when using 'coarse-grained' mode, after SPARK-12330 was fixed. [~fede-bis] any feedback on whether this has been resolved for you?

[~dragos] said
> Are you using 1.6? In that case, the blockmgr directory should really be inside your Mesos sandbox, not under /tmp. At least, that's what I see when I try it out.

Let me note that one big drawback of having those dirs in the sandbox (or /tmp) is that shuffling cannot be parallelized over multiple disks. That's why we prefer to set `spark.local.dirs` to a number of partitions (on different disks) and ensure those get cleaned regularly using some 'find -mtime' magic on the blockmgr-* dirs (not ideal, however, so we hope the migration to Spark 2 improves the situation); a sketch of that setup follows the quoted description below.

> Temporary folders do not get deleted after Task completes causing problems with disk space.
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-12430
>                 URL: https://issues.apache.org/jira/browse/SPARK-12430
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.1, 1.5.2, 1.6.0
>         Environment: Ubuntu server
>            Reporter: Fede Bar
>
> We are experiencing an issue with automatic /tmp folder deletion after the framework completes. Completing a M/R job using Spark 1.5.2 (same behavior as Spark 1.5.1) over Mesos will not delete some temporary folders, causing free disk space on the server to be exhausted.
> Behavior of an M/R job using Spark 1.4.1 over a Mesos cluster:
> - Launched using spark-submit on one cluster node.
> - The following folders are created: */tmp/mesos/slaves/id#*, */tmp/spark-#/*, */tmp/spark-#/blockmgr-#*
> - When the task is completed, */tmp/spark-#/* gets deleted along with its */tmp/spark-#/blockmgr-#* sub-folder.
> Behavior of the same M/R job using Spark 1.5.2 over a Mesos cluster:
> - Launched using spark-submit on one cluster node.
> - The following folders are created: */tmp/mesos/mesos/slaves/id** *, */tmp/spark-***/ *, {color:red}/tmp/blockmgr-***{color}
> - When the task is completed, */tmp/spark-***/ * gets deleted but NOT the shuffle container folder {color:red}/tmp/blockmgr-***{color}
> Unfortunately, {color:red}/tmp/blockmgr-***{color} can account for several GB depending on the job that ran. Over time this fills the disk, with consequences that we all know.
> Running a shell script to clean up would probably work, but it is difficult to distinguish folders in use by a running M/R job from stale ones. I did notice similar issues opened by other users marked as "resolved", but none seems to exactly match the behavior above.
> I really hope someone has insights on how to fix it.
> Thank you very much!
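For reference, here is a minimal sketch of the workaround described in the comment above. It assumes two data disks mounted at /data1 and /data2 (hypothetical paths) and a 7-day staleness cutoff (also an assumption); adjust both to your disk layout and job durations.

{code}
# spark-defaults.conf: spread block-manager/shuffle files over several disks
# (/data1 and /data2 are hypothetical mount points)
spark.local.dirs /data1/spark-local,/data2/spark-local
{code}

{code}
#!/bin/sh
# Cron job: remove blockmgr-* dirs not modified for more than 7 days.
# Caveat (as noted in the issue): an age cutoff cannot reliably tell a
# stale dir from one belonging to a long-running job, so pick the cutoff
# well above your longest expected job runtime.
find /data1/spark-local /data2/spark-local -maxdepth 1 -type d \
     -name 'blockmgr-*' -mtime +7 -exec rm -rf {} +
{code}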