[ https://issues.apache.org/jira/browse/SPARK-12427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen reassigned SPARK-12427:
----------------------------------

    Assignee: Josh Rosen

> spark builds filling up jenkins' disk
> -------------------------------------
>
>                 Key: SPARK-12427
>                 URL: https://issues.apache.org/jira/browse/SPARK-12427
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>            Reporter: shane knapp
>            Assignee: Josh Rosen
>            Priority: Critical
>              Labels: build, jenkins
>         Attachments: graph.png, jenkins_disk_usage.txt
>
>
> problem summary:
> a few spark builds are filling up the jenkins master's disk with millions of
> little log files as build artifacts.
> currently, we have a raid10 array set up with 5.4T of storage. we're
> currently using 4.0T, 99.9% of which is spark unit test and junit logs.
> the worst offenders, with more than 100G of disk usage per job, are:
> 193G ./Spark-1.6-Maven-with-YARN
> 194G ./Spark-1.5-Maven-with-YARN
> 205G ./Spark-1.6-Maven-pre-YARN
> 216G ./Spark-1.5-Maven-pre-YARN
> 387G ./Spark-Master-Maven-with-YARN
> 420G ./Spark-Master-Maven-pre-YARN
> 520G ./Spark-1.6-SBT
> 733G ./Spark-1.5-SBT
> 812G ./Spark-Master-SBT
> i have attached a full report w/all builds listed as well.
> each of these builds is keeping its build history for 90 days.
> keep in mind that for each new matrix build, we're looking at another
> 200-500G per for the SBT/pre-YARN/with-YARN jobs.
> a straw man, back of napkin estimate for spark 1.7 is 2T of additional disk
> usage.
> on the hardware config side, we can move from raid10 to raid 5 and get ~3T
> additional storage. if we ditch raid altogether and put in bigger disks, we
> can get a total of 16-20T storage on master. another option is to have a NFS
> mount to a deep storage server. all of these options will require
> significant downtime.
> questions:
> * can we lower the number of days that we keep build information?
> * there are other options in jenkins that we can set as well: max number of
> builds to keep, max # days to keep artifacts, max # of builds to keep
> w/artifacts
> * can we make the junit and unit test logs smaller (probably not)
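
as a rough aid for the retention question, the per-job numbers in the report can be summed and scaled. the sketch below is only a back-of-napkin illustration: the figures are copied from the worst-offenders list (they add up to 3680G), the 90-day retention is from the report, and the assumption that disk usage grows linearly with the number of days kept is mine, not something the report states. the names usage_g, total_usage_g and estimated_usage_g are hypothetical.

```python
# Per-job disk usage in G, copied from the "worst offenders" list in the report.
usage_g = {
    "Spark-1.6-Maven-with-YARN": 193,
    "Spark-1.5-Maven-with-YARN": 194,
    "Spark-1.6-Maven-pre-YARN": 205,
    "Spark-1.5-Maven-pre-YARN": 216,
    "Spark-Master-Maven-with-YARN": 387,
    "Spark-Master-Maven-pre-YARN": 420,
    "Spark-1.6-SBT": 520,
    "Spark-1.5-SBT": 733,
    "Spark-Master-SBT": 812,
}

CURRENT_RETENTION_DAYS = 90  # stated in the report


def total_usage_g(usage):
    """Sum the per-job usage figures (in G)."""
    return sum(usage.values())


def estimated_usage_g(usage, retention_days, current_days=CURRENT_RETENTION_DAYS):
    """Estimate usage (in G) if build history were kept for retention_days.

    Assumption (not from the report): usage scales linearly with days kept.
    """
    return total_usage_g(usage) * retention_days / current_days


if __name__ == "__main__":
    print(f"worst offenders total: {total_usage_g(usage_g)}G")
    for days in (60, 45, 30, 14):
        print(f"{days:>2} days kept: ~{estimated_usage_g(usage_g, days):.0f}G")
```

if builds are spread unevenly over the 90 days, or if artifacts and build records are kept for different periods, the linear estimate will be off, so treat the output as an order-of-magnitude guide only.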