What is the difference between .90 and .90_master_rewrite?

Thanks.

On Fri, Jan 21, 2011 at 2:29 PM, Lars George <lars.geo...@gmail.com> wrote:

> Hi Wayne,
>
> 0.90.0 is out. Get it while it's hot from the HBase home page.
>
> Lars
>
> On Jan 21, 2011, at 20:22, Wayne <wav...@gmail.com> wrote:
>
> > I enthusiastically created a ticket:
> > https://issues.apache.org/jira/browse/HBASE-3463
> >
> > This might be a dumb question I should already know the answer to...but
> > when is .90 coming out and what is its current state? Isn't there an RC
> > out? We are on 0.89.20100924 and thought that was the latest for us to
> > work off of...
> >
> > As always thanks for the detailed responses. FYI: I ended up reformatting,
> > as I could drop all tables and get phantom regions to go away after several
> > restarts, but the .META. table was still stuck reporting as 150MB with no
> > tables, even after issuing major_compact (which never seemed to have any
> > effect)...
> >
> > Thanks.
> >
> >
> > On Fri, Jan 21, 2011 at 1:47 PM, Stack <st...@duboce.net> wrote:
> >
> >> On Fri, Jan 21, 2011 at 4:51 AM, Wayne <wav...@gmail.com> wrote:
> >>> After several hours I have figured out how to get the Disable command to
> >>> work and how to delete manually, but in the process there are 4 problems I
> >>> encountered that I think are areas that could be improved (or my
> >>> understanding improved).
> >>>
> >>> 1) The client timeout is used for the disable command which was my problem.
> >>> Does this totally make sense? Should a DML-minded timeout be used for DDL
> >>> statements that we know can take a very long time normally with a large
> >>> cluster?
> >>>
> >>
> >> Sorry Wayne.  I meant to respond yesterday to your original query.
> >>
> >> Enable/Disable has been redone in 0.90.  Now there are added
> >> enabling/disabling states that are maintained up in zk and in shell
> >> there are commands is_enabled and is_disabled.  We still have the same
> >> (DML) timeout (sortof -- see below for more) but at least now if it
> >> times out, you are not hosed.  The disable or enable process is still
> >> running and you can query its state.  There is also notion of async
> >> enable/disable though this latter facility is not exposed in shell,
> >> only in the HBaseAdmin API.
> >>
> >>
> >>> 2) If the disable command fails the first time it does not "roll back". The
> >>> ONLY way to proceed is to enable and then try to disable again. The first
> >>> disable attempt is all that seems to work. Subsequent disable statements
> >>> usually run without errors but never seem to "work". The entire table
> >>> should be disabled after issuing this command or the entire table should
> >>> still be enabled. I was caught in this half disabled or mostly disabled
> >>> state, which was very frustrating.
> >>>
> >>
> >> Sorry about that.   Should be better in 0.90.0.
> >>
> >> Things should run a bit faster in 0.90.0 too because disable used to
> >> include an update of .META. per region plus a close of all regions
> >> that make up the table.  In 0.90.0 there is no longer the .META.
> >> update and close is more prompt now; in the past close would wait on
> >> any running compactions to complete before proceeding.  In 0.90.0
> >> we'll now interrupt the running compaction so close happens sooner.
> >>
> >> There is room for a bunch more improvement. For example, deleting a
> >> table, there should be short-circuit that punts on flush of in-memory
> >> state and clean-close of open regions.
> >>
> >>> 3) The biggest issue of all is why certain regions do not report back to the
> >>> disable command. What are the various states of a region that could cause
> >>> this? Compaction I know is one, what else could cause the disable command to
> >>> take too long? Shouldn't a disable force itself through and wait long enough
> >>> to be able to disable every region? Again a long wait time or a more
> >>> forceful operation would help.
> >>>
> >>
> >> It wasn't that smart in 0.20/0.89.  It's still pretty dumb but better in
> >> 0.90.0.
> >>
> >> Master process runs the enable/disable process in both old and new
> >> HBase.  In 0.20/0.89, it was a sync process w/ master waiting on
> >> regions to flip to 'offline' after successful close.  The state of
> >> disabledness was when all regions in table had 'offline' state.  Any
> >> hiccup, a problem closing or a failure to update .META. w/ offline per
> >> region would bork the disabling process.  It was super fragile.  We
> >> tried to talk it up as so.
> >>
> >> In 0.90, client queues in master an executor that flips table to
> >> disabling in zk and then in parallel sends out unassigns of all table
> >> regions.  The executor then hangs around with a more DDL-like timeout
> >> of hbase.bulk.assignment.waiton.empty.rit (10 minutes by default).
> >> Meantime clients can check state of the disable.   After all unassigns
> >> complete, the table is flipped to disabled.
> >>
> >>
> >>> 4) Through all of the attempts to disable I saw regions coming and going and
> >>> nothing was consistent. The UI showed the table as disabled and listed 1
> >>> region in the table (there were 1000s). The node view listed several other
> >>> regions but not the same one as the table view. It was a very strange
> >>> situation. The UI to browse the tables and regions is great but it would be
> >>> even better if it gave a 100% view of regions and their current states. A
> >>> summary view of region counts per table based on state or status would be
> >>> fantastic.
> >>
> >> Please file a JIRA.  Sounds like good idea.  We could hoist stuff up
> >> out of hbck tool up into UI.
> >>
> >>
> >>> There is a compaction count, but what about in split, read/write
> >>> lock, disabled, etc.? What is the precise list of region states that could
> >>> occur? It would help to show a summary count per state as well as the
> >>> detailed state for each specific region in the list. Fundamentally this is
> >>> the health monitor of the system and as a dba I really need to know the 100%
> >>> count of regions and where they are all at in terms of availability. Are
> >>> they disabled, blocked for writes, blocked for reads, in compaction, etc.?
> >>> If there are various states that cause disabling to be blocked, they can be
> >>> reported here so that I at least know when a disable command can be executed
> >>> successfully (and this should be documented).
> >>>
> >>
> >>
> >> Please file a JIRA.  This is great stuff.
> >>
> >> Sorry for pain caused messing w/ broke enable/disable.  It should be
> >> better in 0.90 and easier to fix if bugs.
> >>
> >> St.Ack
> >>
> >>
> >>> Thanks
> >>>
> >>> On Thu, Jan 20, 2011 at 9:01 PM, Wayne <wav...@gmail.com> wrote:
> >>>
> >>>> I need to delete some tables and I am not sure the best way to do it. The
> >>>> shell does not work. The disable command says it runs ok but every time I
> >>>> run drop or truncate I get an exception that says the table is not
> >>>> disabled.  The UI shows it as disabled but truncate/drop still do not work.
> >>>> I have even tried to restart the cluster as sometimes that makes the disable
> >>>> "stick".
> >>>>
> >>>> What is the best way to delete a table manually? My assumption is that with
> >>>> 10k regions in 3 tables that I need to delete that the shell is not going to
> >>>> work. How can I do this without a completely fresh install of everything?
> >>>> How can the data/tables be removed manually without too much pain?
> >>>>
> >>>> Thanks.
> >>>>
> >>>
> >>
>
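
For anyone trying to picture the 0.90 disable flow Stack describes in this thread (table flipped to disabling, unassigns sent out in parallel, a wait with a DDL-like timeout, clients polling the state in the meantime), here is a rough toy model in Python. It is only a sketch and not HBase code: ToyMaster, disable_async, is_disabled, the state strings, and the timings are all made-up names and values for illustration, and the real master keeps this state in zk rather than in a local object.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait


class ToyMaster:
    """Toy stand-in for a master tracking one table's state (hypothetical)."""

    def __init__(self, regions):
        self.table_state = "ENABLED"
        self.regions = {name: "OPEN" for name in regions}
        self._lock = threading.Lock()

    def _unassign(self, region, close_seconds):
        time.sleep(close_seconds)  # pretend the region takes a while to close
        with self._lock:
            self.regions[region] = "CLOSED"

    def disable_async(self, close_seconds=0.05, wait_timeout=5.0):
        """Flip the table to DISABLING, then unassign all regions in parallel."""
        with self._lock:
            self.table_state = "DISABLING"
        names = list(self.regions)

        def run():
            with ThreadPoolExecutor(max_workers=8) as pool:
                futures = [pool.submit(self._unassign, n, close_seconds)
                           for n in names]
                # Wait with a long, DDL-style timeout (an assumed value here).
                _, pending = wait(futures, timeout=wait_timeout)
            # Only flip to DISABLED if every unassign finished inside the wait;
            # otherwise the table is left in DISABLING and can still be queried.
            if not pending:
                with self._lock:
                    self.table_state = "DISABLED"

        worker = threading.Thread(target=run)
        worker.start()
        return worker

    def is_disabled(self):
        with self._lock:
            return self.table_state == "DISABLED"


if __name__ == "__main__":
    master = ToyMaster(["region-%d" % i for i in range(20)])
    worker = master.disable_async()
    while not master.is_disabled():  # client polls instead of blocking on one call
        time.sleep(0.01)
    worker.join()
    print(master.table_state)
```

The point of the sketch is the separation: the caller returns right away and polls, so a short client timeout no longer decides whether the disable itself completes.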
