Even that is bad.  The problem is that the cost of incorrectly stopping
regionservers is much higher than the cost of not stopping them.
Stopping a regionserver scrambles data locality until all of its regions
have been compacted.  At the margins, this could degrade performance enough
to kill a cluster that is already on the edge.  Not stopping a regionserver
just means that you have to say what you mean, which isn't a big penalty.

On Fri, Mar 4, 2011 at 5:51 PM, Bill Graham <billgra...@gmail.com> wrote:

> What if we just executed the shutdown only if a running master is
> found on the host running the stop script and it's the only master in
> the cluster?
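>
> Just as a rough sketch of that check (hypothetical; it assumes the
> master was started on this host by hbase-daemon.sh, which drops a pid
> file under $HBASE_PID_DIR, /tmp by default):
>
> # proceed only if a master pid file exists here and the process is alive
> HBASE_PID_DIR=${HBASE_PID_DIR:-/tmp}
> pidfile="$HBASE_PID_DIR/hbase-$USER-master.pid"
> if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
>   echo "Local master found, proceeding with shutdown."
> else
>   echo "No master running on this host; refusing to stop the cluster." >&2
>   exit 1
> fi
>
> (Checking that it's also the only master in the cluster would need a
> lookup in ZooKeeper, which this sketch doesn't attempt.)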
>
>
> On Thu, Mar 3, 2011 at 4:48 PM, Igor Ranitovic <irani...@gmail.com> wrote:
> > What about adding a simple message box using bash's whiptail?
> >
> > For example:
> >
> > rs=$(cat ${HBASE_CONF_DIR}/regionservers | xargs)
> >
> > if ( whiptail --yesno "Do you want to shut down the cluster with the
> > following regionservers: $rs?" 10 40 )
> > then
> >        # proceed with the shutdown
> > else
> >        # exit
> > fi
> >
> >
> > On 03/02/2011 05:23 PM, Bill Graham wrote:
> >>
> >> Hi,
> >>
> >> We had a troubling experience today that I wanted to share. Our dev
> >> cluster got completely shut down by a developer by mistake, without
> >> said developer even realizing it. Here's how...
> >>
> >> We have multiple sets of HBase configs checked into SVN that
> >> developers can check out and point their HBASE_CONF_DIR to, making it
> >> easy to switch between developing in local mode and testing against
> >> our distributed dev cluster.
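> >>
> >> For example (the paths here are just illustrative), switching targets
> >> is a matter of:
> >>
> >> export HBASE_CONF_DIR=~/hbase-confs/local    # or ~/hbase-confs/dev-cluster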
> >>
> >> In local mode someone might do something like this:
> >>
> >> bin/start-hbase.sh
> >> bin/hbase shell
> >>
> >> ... do some work ...
> >>
> >> bin/stop-hbase.sh
> >>
> >> The problem arose when a developer accidentally tried to do this with
> >> their HBASE_CONF_DIR pointing to our dev cluster configs. When this
> >> happens, the first command will add another master to the cluster and
> >> the last command will shut down the entire cluster. I assume this
> >> happens via ZooKeeper somehow, since we don't have ssh keys to
> >> remotely start/stop as the user running the processes.
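> >>
> >> One way to double-check which cluster the scripts are about to touch,
> >> just as a sketch (assumes hbase-site.xml in the active conf dir sets
> >> hbase.zookeeper.quorum):
> >>
> >> echo "Using configs in: $HBASE_CONF_DIR"
> >> grep -A1 'hbase.zookeeper.quorum' "$HBASE_CONF_DIR/hbase-site.xml"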
> >>
> >> So the question is, is this a bug or a feature? If it's a feature it
> >> seems like an incredibly dangerous one. Once our live cluster is
> >> running, those configs will also be needed on the client so really bad
> >> things could happen by mistake.
> >>
> >> thanks,
> >> Bill
> >>
> >
> >
>
