yangping wu created SPARK-10774:
-----------------------------------

             Summary: Put different event log to different directory according to some conditions
                 Key: SPARK-10774
                 URL: https://issues.apache.org/jira/browse/SPARK-10774
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 1.4.1
            Reporter: yangping wu
            Priority: Minor

Right now, Spark writes all event logs (in-progress or finished) into the same directory (configured by the **spark.eventLog.dir** parameter), as follows:

{noformat}
[yangping...@l-sparkcluster.data.cn5 /]$ sudo hadoop fs -ls /spark-jobs/eventLog
Found 58 items
-rwxrwxrwx 3 spark aaa 8438 2015-09-17 15:14 /spark-jobs/eventLog/application_1440152921247_0047_1.lz4
-rwxrwxrwx 3 spark aaa 44002 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_1
-rwxrwxrwx 3 spark aaa 44696 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_2
-rwxrwxrwx 3 spark aaa 40813 2015-09-17 15:25 /spark-jobs/eventLog/application_1440152921247_0191_1
-rwxrwxrwx 3 spark aaa 44680 2015-09-17 15:25 /spark-jobs/eventLog/application_1440152921247_0191_2
-rwxrwxrwx 3 spark aaa 42572 2015-09-17 15:36 /spark-jobs/eventLog/application_1440152921247_0192_1
-rwxrwxrwx 3 spark aaa 44680 2015-09-17 15:36 /spark-jobs/eventLog/application_1440152921247_0192_2
-rwxrwxrwx 3 spark aaa 45052 2015-09-17 16:09 /spark-jobs/eventLog/application_1440152921247_0193_1
-rwxrwxrwx 3 spark aaa 44688 2015-09-17 16:09 /spark-jobs/eventLog/application_1440152921247_0193_2
-rwxrwxrwx 3 spark aaa 41686 2015-09-17 16:11 /spark-jobs/eventLog/application_1440152921247_0194_1
-rwxrwxrwx 3 spark aaa 44522 2015-09-17 16:11 /spark-jobs/eventLog/application_1440152921247_0194_2
-rwxrwxrwx 3 spark aaa 32261 2015-09-17 16:13 /spark-jobs/eventLog/application_1440152921247_0195_1
-rwxrwxrwx 3 spark aaa 31178 2015-09-17 16:13 /spark-jobs/eventLog/application_1440152921247_0195_2
-rwxrwxrwx 3 spark aaa 39124467712 2015-09-18 11:58 /spark-jobs/eventLog/application_1440152921247_0205_1.inprogress
-rwxrwxrwx 3 spark aaa 790045092 2015-09-18 20:40 /spark-jobs/eventLog/application_1440152921247_0206
........
{noformat}

Over time, a large number of event logs accumulate in the **spark.eventLog.dir** directory, which makes them hard to manage.

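As a rough illustration of what grouping these files could look like (this is not something Spark does today), the sketch below takes a few lines copied from the listing above, separates the in-progress log from the finished ones by its `.inprogress` suffix, and assigns each finished log to a per-day bucket. The bucket names (`inprogress`, `finished/YYYY/MM/DD`) and the helper functions are hypothetical, and the file's modification date is used as a stand-in for the time the application finished, which is an assumption.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List, Tuple

# A few lines copied from the `hadoop fs -ls` listing above.
LISTING = """\
Found 58 items
-rwxrwxrwx 3 spark aaa 8438 2015-09-17 15:14 /spark-jobs/eventLog/application_1440152921247_0047_1.lz4
-rwxrwxrwx 3 spark aaa 44002 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_1
-rwxrwxrwx 3 spark aaa 39124467712 2015-09-18 11:58 /spark-jobs/eventLog/application_1440152921247_0205_1.inprogress
-rwxrwxrwx 3 spark aaa 790045092 2015-09-18 20:40 /spark-jobs/eventLog/application_1440152921247_0206
"""


def bucket_for(line: str) -> Tuple[str, str]:
    """Return (hypothetical bucket, file name) for one file line of the listing."""
    # Columns: permissions, replication, owner, group, size, date, time, path
    fields = line.split()
    date, name = fields[5], PurePosixPath(fields[7]).name
    if name.endswith(".inprogress"):
        # Still being written: keep it apart from finished logs.
        return "inprogress", name
    # Finished: bucket by modification date (assumed proxy for finish time).
    year, month, day = date.split("-")
    return f"finished/{year}/{month}/{day}", name


def group(listing: str) -> Dict[str, List[str]]:
    """Group the file names in a listing by their hypothetical bucket."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in listing.splitlines():
        if line.startswith("-"):  # regular files only; skips "Found N items"
            bucket, name = bucket_for(line)
            groups[bucket].append(name)
    return dict(groups)


if __name__ == "__main__":
    for bucket, names in sorted(group(LISTING).items()):
        print(bucket, names)
```

With a grouping like this, cleaning up or archiving a given day's logs would only touch one subdirectory instead of scanning every file in the event log directory.
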
In Hadoop, there are two directories for different kinds of event logs: done-dir and intermediate-done-dir, configured by **mapreduce.jobhistory.done-dir** and **mapreduce.jobhistory.intermediate-done-dir** respectively. In the done-dir, event logs are saved into different subdirectories according to the date on which the job ran, as follows:

{noformat}
[yangping...@l-sparkcluster.data.cn5 /]$ sudo hadoop fs -ls /hadoop-jobs/done/2015/09/
Found 23 items
drwxrwxrwx - hadoop supergroup 0 2015-09-04 16:59 /hadoop-jobs/done/2015/09/01
drwxrwxrwx - hadoop supergroup 0 2015-09-05 16:59 /hadoop-jobs/done/2015/09/02
drwxrwxrwx - hadoop supergroup 0 2015-09-06 16:59 /hadoop-jobs/done/2015/09/03
drwxrwxrwx - hadoop supergroup 0 2015-09-07 16:59 /hadoop-jobs/done/2015/09/04
drwxrwxrwx - hadoop supergroup 0 2015-09-08 16:59 /hadoop-jobs/done/2015/09/05
drwxrwxrwx - hadoop supergroup 0 2015-09-09 16:59 /hadoop-jobs/done/2015/09/06
drwxrwxrwx - hadoop supergroup 0 2015-09-10 16:59 /hadoop-jobs/done/2015/09/07
drwxrwxrwx - hadoop supergroup 0 2015-09-11 16:59 /hadoop-jobs/done/2015/09/08
drwxrwxrwx - hadoop supergroup 0 2015-09-12 16:59 /hadoop-jobs/done/2015/09/09
drwxrwxrwx - hadoop supergroup 0 2015-09-13 16:59 /hadoop-jobs/done/2015/09/10
drwxrwx--- - hadoop supergroup 0 2015-09-14 16:59 /hadoop-jobs/done/2015/09/11
drwxrwx--- - hadoop supergroup 0 2015-09-15 16:59 /hadoop-jobs/done/2015/09/12
drwxrwxrwx - hadoop supergroup 0 2015-09-16 16:59 /hadoop-jobs/done/2015/09/13
drwxrwxrwx - hadoop supergroup 0 2015-09-17 16:59 /hadoop-jobs/done/2015/09/14
drwxrwxrwx - hadoop supergroup 0 2015-09-18 16:59 /hadoop-jobs/done/2015/09/15
drwxrwxrwx - hadoop supergroup 0 2015-09-19 16:59 /hadoop-jobs/done/2015/09/16
drwxrwxrwx - hadoop supergroup 0 2015-09-20 16:59 /hadoop-jobs/done/2015/09/17
drwxrwx--- - hadoop supergroup 0 2015-09-21 16:59 /hadoop-jobs/done/2015/09/18
drwxrwx--- - hadoop supergroup 0 2015-09-22 16:59 /hadoop-jobs/done/2015/09/19
drwxrwx--- - hadoop supergroup 0 2015-09-23 16:59 /hadoop-jobs/done/2015/09/20
drwxrwx--- - hadoop supergroup 0 2015-09-23 16:59 /hadoop-jobs/done/2015/09/21
drwxrwx--- - hadoop supergroup 0 2015-09-22 23:43 /hadoop-jobs/done/2015/09/22
drwxrwx--- - hadoop supergroup 0 2015-09-23 18:55 /hadoop-jobs/done/2015/09/23
{noformat}

I think Spark could do the same thing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org