Re: Advice on restarting HDFS in a cron

Rakhi Khatwani Sat, 25 Apr 2009 22:11:16 -0700

Thanks Aaron.

On Sun, Apr 26, 2009 at 10:37 AM, Aaron Kimball <aa...@cloudera.com> wrote:


> If your logs were being written to the root partition (/dev/sda1), that's
> going to fill up fast. This partition is always <= 10 GB on EC2 and much of
> that space is consumed by the OS install. You should redirect your logs to
> some place under /mnt (/dev/sdb1); that's 160 GB.
>
> - Aaron
>
> On Sun, Apr 26, 2009 at 3:21 AM, Rakhi Khatwani <rakhi.khatw...@gmail.com> wrote:
>
> > Hi,
> >   I have faced a somewhat similar issue.
> >   I have a couple of MapReduce jobs running on EC2. After a week or so,
> > I get a "No space left on device" error when running any Linux command,
> > so I end up shutting down Hadoop and HBase, clearing the logs, and then
> > restarting them.
> >
> > Is there a cleaner way to do it?
> >
> > thanks
> > Raakhi
> >
> > On Fri, Apr 24, 2009 at 11:59 PM, Todd Lipcon <t...@cloudera.com> wrote:
> >
> > > On Fri, Apr 24, 2009 at 11:18 AM, Marc Limotte <mlimo...@feeva.com> wrote:
> > >
> > > > Actually, I'm concerned about the performance of map/reduce jobs on a
> > > > long-running cluster, i.e. it seems to get slower the longer it's
> > > > running. After a restart of HDFS, the jobs seem to run faster. I'm not
> > > > concerned about the start-up time of HDFS.
> > > >
> > >
> > > Hi Marc,
> > >
> > > Does it sound like this JIRA describes your problem?
> > >
> > > https://issues.apache.org/jira/browse/HADOOP-4766
> > >
> > > If so, restarting just the JobTracker (JT) should help with the symptoms.
> > > (I say symptoms because this is clearly a problem! Hadoop should be stable
> > > and performant for months without a cluster restart!)
> > >
> > > -Todd
> > >
> > >
> > > >
> > > > Of course, as you suggest, this could be poor configuration of the
> > > > cluster on my part; but I'd still like to hear best practices around
> > > > doing a scheduled restart.
> > > >
> > > > Marc
> > > >
> > > > -----Original Message-----
> > > > From: Allen Wittenauer [mailto:a...@yahoo-inc.com]
> > > > Sent: Friday, April 24, 2009 10:17 AM
> > > > To: core-user@hadoop.apache.org
> > > > Subject: Re: Advice on restarting HDFS in a cron
> > > >
> > > >
> > > >
> > > >
> > > > On 4/24/09 9:31 AM, "Marc Limotte" <mlimo...@feeva.com> wrote:
> > > > > I've heard that HDFS starts to slow down after it's been running for a
> > > > > long time.  And I believe I've experienced this.
> > > >
> > > > We did an upgrade (== complete restart) of a 2000 node instance in ~20
> > > > minutes on Wednesday. I wouldn't really consider that 'slow', but YMMV.
> > > >
> > > > I suspect people aren't running the secondary name node and therefore
> > > > have a massively large edits file.  The name node appears slow on
> > > > restart because it has to apply the edits to the fsimage rather than
> > > > having the secondary keep it up to date.
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Marc Limotte
> > > >
> > > > Hi.
> > > >
> > > > I've heard that HDFS starts to slow down after it's been running for a
> > > > long time.  And I believe I've experienced this.  So, I was thinking of
> > > > setting up a cron job to run every week to shut down HDFS and start it
> > > > up again.
> > > >
> > > > In concept, it would be something like:
> > > >
> > > > 0 0 * * 0 $HADOOP_HOME/bin/stop-dfs.sh; $HADOOP_HOME/bin/start-dfs.sh
> > > >
> > > > But I'm wondering if there is a safer way to do this.  In particular:
> > > >
> > > > *         What if a map/reduce job is running when this cron hits?  Is
> > > > there a way to suspend jobs while the HDFS restart happens?
> > > >
> > > > *         Should I also restart the mapred daemons?
> > > >
> > > > *         Should I wait some time after "stop-dfs.sh" for things to
> > > > settle down, before executing "start-dfs.sh"?  Or maybe I should run a
> > > > command to verify that it is stopped before I run the start?
> > > >
> > > > Thanks for any help.
> > > > Marc
> > > >
> > > >
> > >
> >
>
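
Aaron's suggestion in the thread above, moving the logs off the small root
partition and onto /mnt, might look roughly like the sketch below. The
HADOOP_LOG_DIR and HBASE_LOG_DIR variables are the ones read from
conf/hadoop-env.sh and conf/hbase-env.sh; the /mnt/... paths and the
prune_old_logs helper are assumptions made up for illustration, not
something anyone in the thread posted.

```shell
# Lines for conf/hadoop-env.sh and conf/hbase-env.sh respectively.
# The /mnt/... paths are assumptions; create them first and make them
# writable by the user that runs the daemons.
export HADOOP_LOG_DIR=/mnt/hadoop/logs
export HBASE_LOG_DIR=/mnt/hbase/logs

# Hypothetical helper: delete regular files under DIR whose last
# modification is more than DAYS days old.
# Usage: prune_old_logs DIR DAYS
prune_old_logs() {
    find "$1" -type f -mtime +"$2" -exec rm -f {} +
}
```

Calling something like prune_old_logs "$HADOOP_LOG_DIR" 7 from a daily cron
job would trim rolled-over logs without stopping Hadoop or HBase, since the
log files currently being written have a recent modification time and are
left alone.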

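On Marc's question about verifying that HDFS has stopped before starting it
again, one possible shape for a cron wrapper is sketched below. It is only a
sketch under assumptions: HADOOP_HOME must be set in the cron environment,
the check only looks at the local host (it says nothing about DataNodes on
other machines), the pgrep pattern for the NameNode JVM and the 60 second
limit are guesses, and the function names are made up. It does not deal with
map/reduce jobs that are running when the restart fires.

```shell
#!/bin/sh
# Wait until CHECK (a command that succeeds while the daemon is still
# running) starts failing, or give up after TIMEOUT seconds.
# Usage: wait_until_stopped CHECK TIMEOUT
wait_until_stopped() {
    check=$1
    remaining=$2
    while "$check"; do
        if [ "$remaining" -le 0 ]; then
            return 1            # still running after the timeout
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Assumption: the NameNode JVM on this host can be found by this pattern.
namenode_running() {
    pgrep -f 'namenode\.NameNode' >/dev/null 2>&1
}

restart_dfs() {
    "$HADOOP_HOME/bin/stop-dfs.sh" || return 1
    # Start again only once the local NameNode process is really gone.
    if wait_until_stopped namenode_running 60; then
        "$HADOOP_HOME/bin/start-dfs.sh"
    else
        echo "NameNode still running after 60s; not starting HDFS" >&2
        return 1
    fi
}

# Only touch the cluster when called with the argument "run", so the
# functions can be sourced and tried out safely.
if [ "${1:-}" = "run" ]; then
    restart_dfs
fi
```

Saved as, say, restart-dfs.sh (a hypothetical name), a weekly crontab entry
such as "0 0 * * 0 /path/to/restart-dfs.sh run" would run it at midnight on
Sundays.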