Thanks Aaron.

On Sun, Apr 26, 2009 at 10:37 AM, Aaron Kimball <aa...@cloudera.com> wrote:
> If your logs were being written to the root partition (/dev/sda1), that's
> going to fill up fast. This partition is always <= 10 GB on EC2 and much of
> that space is consumed by the OS install. You should redirect your logs to
> some place under /mnt (/dev/sdb1); that's 160 GB.
>
> - Aaron
>
> On Sun, Apr 26, 2009 at 3:21 AM, Rakhi Khatwani <rakhi.khatw...@gmail.com> wrote:
>
> > Hi,
> > I have faced somewhat a similar issue...
> > i have a couple of map reduce jobs running on EC2... after a week or so,
> > i get a no space on device exception while performing any linux command...
> > so end up shuttin down hadoop and hbase, clear the logs and then restart
> > them.
> >
> > is there a cleaner way to do it???
> >
> > thanks
> > Raakhi
> >
> > On Fri, Apr 24, 2009 at 11:59 PM, Todd Lipcon <t...@cloudera.com> wrote:
> >
> > > On Fri, Apr 24, 2009 at 11:18 AM, Marc Limotte <mlimo...@feeva.com> wrote:
> > >
> > > > Actually, I'm concerned about performance of map/reduce jobs for a
> > > > long-running cluster. I.e. it seems to get slower the longer it's
> > > > running. After a restart of HDFS, the jobs seems to run faster. Not
> > > > concerned about the start-up time of HDFS.
> > >
> > > Hi Marc,
> > >
> > > Does it sound like this JIRA describes your problem?
> > >
> > > https://issues.apache.org/jira/browse/HADOOP-4766
> > >
> > > If so, restarting just the JT should help with the symptoms. (I say
> > > symptoms because this is clearly a problem! Hadoop should be stable and
> > > performant for months without a cluster restart!)
> > >
> > > -Todd
> > >
> > > > Of course, as you suggest, this could be poor configuration of the
> > > > cluster on my part; but I'd still like to hear best practices around
> > > > doing a scheduled restart.
> > > >
> > > > Marc
> > > >
> > > > -----Original Message-----
> > > > From: Allen Wittenauer [mailto:a...@yahoo-inc.com]
> > > > Sent: Friday, April 24, 2009 10:17 AM
> > > > To: core-user@hadoop.apache.org
> > > > Subject: Re: Advice on restarting HDFS in a cron
> > > >
> > > > On 4/24/09 9:31 AM, "Marc Limotte" <mlimo...@feeva.com> wrote:
> > > > > I've heard that HDFS starts to slow down after it's been running for
> > > > > a long time. And I believe I've experienced this.
> > > >
> > > > We did an upgrade (== complete restart) of a 2000 node instance in ~20
> > > > minutes on Wednesday. I wouldn't really consider that 'slow', but YMMV.
> > > >
> > > > I suspect people aren't running the secondary name node and therefore
> > > > have massively large edits file. The name node appears slow on restart
> > > > because it has to apply the edits to the fsimage rather than having the
> > > > secondary keep it up to date.
> > > >
> > > > -----Original Message-----
> > > > From: Marc Limotte
> > > >
> > > > Hi.
> > > >
> > > > I've heard that HDFS starts to slow down after it's been running for a
> > > > long time. And I believe I've experienced this. So, I was thinking to
> > > > set up a cron job to execute every week to shutdown HDFS and start it up
> > > > again.
> > > >
> > > > In concept, it would be something like:
> > > >
> > > > 0 0 0 0 0 $HADOOP_HOME/bin/stop-dfs.sh; $HADOOP_HOME/bin/start-dfs.sh
> > > >
> > > > But I'm wondering if there is a safer way to do this. In particular:
> > > >
> > > > * What if a map/reduce job is running when this cron hits. Is
> > > >   there a way to suspend jobs while the HDFS restart happens?
> > > >
> > > > * Should I also restart the mapred daemons?
> > > >
> > > > * Should I wait some time after "stop-dfs.sh" for things to settle
> > > >   down, before executing "start-dfs.sh"? Or maybe I should run a
> > > >   command to verify that it is stopped before I run the start?
> > > >
> > > > Thanks for any help.
> > > > Marc
> > > >
> > > > PRIVATE AND CONFIDENTIAL - NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT FOR
> > > > ONLY THE INTENDED RECIPIENT OF THE TRANSMISSION, AND MAY BE A
> > > > COMMUNICATION PRIVILEGE BY LAW. IF YOU RECEIVED THIS E-MAIL IN ERROR,
> > > > ANY REVIEW, USE, DISSEMINATION, DISTRIBUTION, OR COPYING OF THIS EMAIL
> > > > IS STRICTLY PROHIBITED. PLEASE NOTIFY US IMMEDIATELY OF THE ERROR BY
> > > > RETURN E-MAIL AND PLEASE DELETE THIS MESSAGE FROM YOUR SYSTEM.
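
For the "verify that it is stopped before I run the start" question at the bottom of the thread, here is a rough sketch of what a guarded restart could look like. Nothing in it was prescribed by anyone in this thread: the default install path, the retry counts, the pgrep-based check and the function names are all assumptions for illustration. It only checks the NameNode process on the local host, so DataNodes on other machines are not verified. Also note that a weekly crontab schedule would be written "0 0 * * 0" (midnight on Sunday); "0 0 0 0 0" is not a valid schedule because day-of-month and month cannot be 0.

```shell
#!/bin/sh
# Sketch only: the install path, retry counts, and the pgrep-based check are
# assumptions for illustration, not something prescribed in this thread.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# Usage: wait_until_stopped <tries> <delay-seconds> <check command...>
# Runs the check command repeatedly until it fails (process no longer found).
# Returns 1 if the check still succeeds after <tries> attempts.
wait_until_stopped() {
    tries="$1"
    delay="$2"
    shift 2
    while [ "$tries" -gt 0 ]; do
        if ! "$@" >/dev/null 2>&1; then
            return 0    # check failed: the daemon is gone
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1            # still running after all attempts
}

restart_dfs() {
    "$HADOOP_HOME/bin/stop-dfs.sh" || return 1
    # Only the local NameNode is checked; remote DataNodes are not verified.
    wait_until_stopped 30 2 \
        pgrep -f org.apache.hadoop.hdfs.server.namenode.NameNode || return 1
    "$HADOOP_HOME/bin/start-dfs.sh" || return 1
    # Block until the NameNode has left safe mode before returning.
    "$HADOOP_HOME/bin/hadoop" dfsadmin -safemode wait
}

# Example crontab entry, assuming this file is saved as a hypothetical
# /usr/local/bin/restart-dfs.sh that ends with a call to restart_dfs:
#   0 0 * * 0 /usr/local/bin/restart-dfs.sh
```

If HADOOP-4766 is the real cause, as Todd suggests, the same pattern could wrap stop-mapred.sh and start-mapred.sh instead, so that only the JobTracker side is restarted.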
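
On Aaron's point about moving logs off the root partition, and Rakhi's question about a cleaner way than clearing logs by hand, a minimal sketch follows. HADOOP_LOG_DIR is the log directory setting read from conf/hadoop-env.sh; the /mnt/hadoop/logs path, the prune_old_logs helper, and the assumption that rolled-over files are named like "<name>.log.<date>" are mine and should be checked against your own log4j setup. HBase logs are configured separately and are not covered here.

```shell
# Fragment for conf/hadoop-env.sh: keep daemon logs off the root partition.
# /mnt/hadoop/logs is an assumed location; any directory under /mnt would do.
export HADOOP_LOG_DIR=/mnt/hadoop/logs

# Hypothetical helper: delete rolled-over log files older than <days> days.
# Assumes rolled files are named like "<name>.log.<date>"; the active
# "<name>.log" file does not match the pattern and is left alone.
prune_old_logs() {
    log_dir="$1"
    max_age_days="$2"
    find "$log_dir" -type f -name '*.log.*' -mtime +"$max_age_days" \
        -exec rm -f {} +
}
```

A daily cron job could then call something like prune_old_logs "$HADOOP_LOG_DIR" 7, which avoids having to shut the daemons down just to free disk space.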