Looking at the log, I see that the last two entries are a COMPACTION_START for one RFile immediately followed by a COMPACTION_START for a separate RFile, which (I believe) would lead to the error. Would this necessarily be an issue if the compactions are for separate RFiles?
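For context, here is a sketch of how that ordering can be checked with the LogReader command Keith mentions below. The WAL path is a placeholder for one of ours, and the grep assumes LogReader prints the event names (COMPACTION_START, COMPACTION_FINISH) verbatim in its output:

  # Dump the walog and filter down to the compaction events
  accumulo org.apache.accumulo.tserver.logger.LogReader \
      hdfs://<namenode>/accumulo/wal/<tserver+port>/<wal-uuid> \
    | grep -E 'COMPACTION_(START|FINISH)'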
This is a dev cluster and I don't necessarily care about it, but is there a (good) means to do WAL log surgery? I imagine I can just chop off bytes until the log is parseable and missing the info about the compactions. (A rough sketch of the safer empty-file swap follows the quoted thread below.)

On Tue, Jun 12, 2018 at 2:32 PM, Keith Turner <ke...@deenlo.com> wrote:
> On Tue, Jun 12, 2018 at 12:10 PM, Adam J. Shook <adamjsh...@gmail.com>
> wrote:
> > Yes, that is the error. I'll inspect the logs and report back.
>
> Ok. The LogReader command has a mechanism to filter which tablet is
> displayed. If the walog has a lot of data in it, you may need to use
> this.
>
> Also, be aware that only 5 mutations are shown for a "many mutations"
> object in the walog. The -m option changes this. You may want to see
> more when deciding if the info in the log is important.
>
> > On Tue, Jun 12, 2018 at 10:14 AM, Keith Turner <ke...@deenlo.com> wrote:
> >>
> >> Is the message you are seeing "COMPACTION_FINISH (without preceding
> >> COMPACTION_START)"? That message indicates that the WALs are
> >> incomplete, probably as a result of the NN problems. Could do the
> >> following:
> >>
> >> 1) Run the following command to see what's in the log. Need to see
> >> what is there for the root tablet.
> >>
> >> accumulo org.apache.accumulo.tserver.logger.LogReader
> >>
> >> 2) Replace the log file with an empty file after seeing if there is
> >> anything important in it.
> >>
> >> I think the list of WALs for the root tablet is stored in ZK at
> >> /accumulo/<id>/walogs
> >>
> >> On Mon, Jun 11, 2018 at 5:26 PM, Adam J. Shook <adamjsh...@gmail.com>
> >> wrote:
> >> > Hey all,
> >> >
> >> > The root tablet on one of our dev systems isn't loading due to an
> >> > illegal state exception -- COMPACTION_FINISH preceding
> >> > COMPACTION_START. What'd be the best way to mitigate this issue?
> >> > This was likely caused by both of our NameNodes failing.
> >> >
> >> > Thank you,
> >> > --Adam
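Regarding the surgery sketch mentioned above: rather than byte-chopping, the lower-risk route seems to be Keith's step 2 -- back up the damaged walog and swap in an empty file. This is untested and all paths are placeholders; the ZK path is the one Keith mentions, with <id> being the instance id:

  # Back up the damaged walog, then replace it with an empty file
  hdfs dfs -cp /accumulo/wal/<tserver+port>/<wal-uuid> /tmp/<wal-uuid>.bak
  hdfs dfs -rm /accumulo/wal/<tserver+port>/<wal-uuid>
  hdfs dfs -touchz /accumulo/wal/<tserver+port>/<wal-uuid>

  # Confirm which walogs the root tablet references in ZooKeeper
  zkCli.sh -server <zookeeper:2181> ls /accumulo/<id>/walogs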