I just wanted to update this thread: after more investigation I've
figured out why my oldWALs folder wasn't being cleaned up.  I had a look
at the code of ReplicationLogCleaner and it makes this call:

    if (!config.getBoolean(HConstants.REPLICATION_ENABLE_KEY,
        HConstants.REPLICATION_ENABLE_DEFAULT)) {
      LOG.warn("Not configured - allowing all wals to be deleted");
      return;
    }

I searched through my logs and was never able to find that line of text.
So I wrote a quick program to run that piece of code, and sure enough it
came back as true.  getBoolean returns the configured value if one has
been defined and otherwise returns the default.  And after reading
HBASE-3489, replication is enabled by default these days, which I also
verified by looking at HConstants.REPLICATION_ENABLE_DEFAULT.
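
Here's a minimal sketch of the kind of quick check I ran (the class name
and the standalone main() are just for the example; HBaseConfiguration
and the HConstants fields are the standard client classes):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HConstants;

    public class CheckReplicationFlag {
      public static void main(String[] args) {
        // Loads hbase-default.xml and hbase-site.xml from the classpath.
        Configuration config = HBaseConfiguration.create();
        // The same call ReplicationLogCleaner makes: the configured value
        // of "hbase.replication", or the default (true) if it isn't set.
        boolean enabled = config.getBoolean(HConstants.REPLICATION_ENABLE_KEY,
            HConstants.REPLICATION_ENABLE_DEFAULT);
        System.out.println("hbase.replication = " + enabled);
      }
    }
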
I run Cloudera CDH 5.3, and even with hbase replication set to false in
the user interface, it wasn't putting "hbase.replication" = false into the
configuration file.  I manually added hbase.replication = false to the
advanced configuration for hbase-site.xml, restarted HBase, and sure
enough it deleted all the logs!

So this is probably a bug in CDH, at least in the version that I ran.  I'm
running Cloudera Manager 5.2.1 with CDH 5.3.0-1.cdh5.3.0.p0.30.

thanks,
liam


On Wed, Mar 4, 2015 at 5:18 PM, Liam Slusser <lslus...@gmail.com> wrote:

> So after removing all the replication peers hbase still doesn't want to
> clean up the oldWALs folder.  In the master logs I don't see any errors
> from ReplicationLogCleaner or LogCleaner.  I have my logging set to INFO so
> I'd think I would see something.
>
> Is there any way to run the ReplicationLogCleaner manually and see the
> output?  Can I write something that calls the right API functions?
>
> thanks,
> liam
>
>
> On Fri, Feb 27, 2015 at 1:50 PM, Nick Dimiduk <ndimi...@gmail.com> wrote:
>
>> I would let the cleaner chore handle the cleanup for you. You don't know
>> the state of all entries in that folder. To that extent, I'd avoid making
>> any direct changes to the content of HBase's working directory, especially
>> while HBase is running...
>>
>> On Fri, Feb 27, 2015 at 1:29 PM, Liam Slusser <lslus...@gmail.com> wrote:
>>
>> > Once I disable/remove the replication, can I just blow away the oldWALs
>> > folder safely?
>> >
>> > On Fri, Feb 27, 2015 at 3:10 AM, Madeleine Piffaretti <
>> > mpiffare...@powerspace.com> wrote:
>> >
>> > > Thanks a lot!
>> > >
>> > > Indeed, we had replication enabled in the past because we used the
>> > > hbase-indexer from NgData (used to replicate data from HBase to Solr).
>> > > Replication had been disabled for a long time, but the hbase-indexer
>> > > peer was still active and so, as you mentioned, the data was kept to
>> > > guarantee that no data is lost between disable and enable.
>> > >
>> > > I have removed the peer and emptied the oldWALs folder.
>> > >
>> > >
>> > >
>> > > 2015-02-27 1:42 GMT+01:00 Liam Slusser <lslus...@gmail.com>:
>> > >
>> > > > Huge thanks, Enis, that was the information I was looking for.
>> > > >
>> > > > Cheers!
>> > > > liam
>> > > >
>> > > >
>> > > > On Thu, Feb 26, 2015 at 3:48 PM, Enis Söztutar <enis....@gmail.com>
>> > > wrote:
>> > > >
>> > > > > @Madeleine,
>> > > > >
>> > > > > The folder gets cleaned regularly by a chore in the master. When a
>> > > > > WAL file is not needed any more for recovery purposes (when HBase
>> > > > > can guarantee it has flushed all the data in the WAL file), it is
>> > > > > moved to the oldWALs folder for archival. The log stays there until
>> > > > > all other references to the WAL file are finished. There are
>> > > > > currently two services which may keep the files in the archive dir.
>> > > > > The first is a TTL process, which ensures that the WAL files are
>> > > > > kept for at least 10 minutes. This is mainly for debugging. You can
>> > > > > reduce this time by setting the hbase.master.logcleaner.ttl
>> > > > > configuration property in the master. It is 600000 (ms) by default.
>> > > > > The other one is replication. If you have replication set up, the
>> > > > > replication processes will hang on to the WAL files until they are
>> > > > > replicated. Even if you disabled the replication, the files are
>> > > > > still referenced.
>> > > > >
>> > > > > You can look at the master logs from the classes (LogCleaner,
>> > > > > TimeToLiveLogCleaner, ReplicationLogCleaner) to see whether the
>> > > > > master is actually running this chore and whether it is getting any
>> > > > > exceptions.
>> > > > >
>> > > > > @Liam,
>> > > > > Disabled replication will still hold on to the WAL files, because
>> > > > > there is a guarantee not to lose data between disable and enable.
>> > > > > You can remove_peer, which frees up the WAL files to be eligible
>> > > > > for deletion. When you re-add the replication peer again,
>> > > > > replication will start from the current position, whereas if you
>> > > > > re-enable a peer, it will continue from where it left off.
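>> > > > >
>> > > > > For illustration, a minimal sketch of removing a peer
>> > > > > programmatically (the class name, main() wrapper, and the peer id
>> > > > > "1" are made up for the example; it assumes the 0.98-era
>> > > > > ReplicationAdmin client API, and the same can be done from the
>> > > > > HBase shell with remove_peer):
>> > > > >
>> > > > >     import org.apache.hadoop.conf.Configuration;
>> > > > >     import org.apache.hadoop.hbase.HBaseConfiguration;
>> > > > >     import org.apache.hadoop.hbase.client.replication.ReplicationAdmin;
>> > > > >
>> > > > >     public class RemoveStalePeer {
>> > > > >       public static void main(String[] args) throws Exception {
>> > > > >         Configuration conf = HBaseConfiguration.create();
>> > > > >         ReplicationAdmin admin = new ReplicationAdmin(conf);
>> > > > >         try {
>> > > > >           // Show the configured peers before removing anything.
>> > > > >           System.out.println("Peers: " + admin.listPeers());
>> > > > >           // Removing the peer (id "1" here) drops its references
>> > > > >           // to archived WALs, so the log cleaner can delete them.
>> > > > >           admin.removePeer("1");
>> > > > >         } finally {
>> > > > >           admin.close();
>> > > > >         }
>> > > > >       }
>> > > > >     }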
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Thu, Feb 26, 2015 at 12:56 AM, Madeleine Piffaretti <
>> > > > > mpiffare...@powerspace.com> wrote:
>> > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > Replication is not turned on in HBase...
>> > > > > > Should this folder be cleaned regularly? I ask because I have
>> > > > > > data in there from December 2014...
>> > > > > >
>> > > > > >
>> > > > > > 2015-02-26 1:40 GMT+01:00 Liam Slusser <lslus...@gmail.com>:
>> > > > > >
>> > > > > > > I'm having this same problem.  I had replication enabled, but
>> > > > > > > it has since been disabled.  However, oldWALs still grows.
>> > > > > > > There are so many files in there that running "hadoop fs -ls
>> > > > > > > /hbase/oldWALs" runs out of memory.
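>> > > > > > >
>> > > > > > > As a workaround sketch (the class name and main() are made up;
>> > > > > > > it just uses the standard Hadoop FileSystem iterator API), the
>> > > > > > > listing can be streamed instead of built in memory:
>> > > > > > >
>> > > > > > >     import org.apache.hadoop.conf.Configuration;
>> > > > > > >     import org.apache.hadoop.fs.FileSystem;
>> > > > > > >     import org.apache.hadoop.fs.LocatedFileStatus;
>> > > > > > >     import org.apache.hadoop.fs.Path;
>> > > > > > >     import org.apache.hadoop.fs.RemoteIterator;
>> > > > > > >
>> > > > > > >     public class CountOldWals {
>> > > > > > >       public static void main(String[] args) throws Exception {
>> > > > > > >         FileSystem fs = FileSystem.get(new Configuration());
>> > > > > > >         // Iterate the directory rather than holding the full
>> > > > > > >         // listing in memory at once.
>> > > > > > >         RemoteIterator<LocatedFileStatus> it =
>> > > > > > >             fs.listFiles(new Path("/hbase/oldWALs"), false);
>> > > > > > >         long count = 0, bytes = 0;
>> > > > > > >         while (it.hasNext()) {
>> > > > > > >           LocatedFileStatus f = it.next();
>> > > > > > >           count++;
>> > > > > > >           bytes += f.getLen();
>> > > > > > >         }
>> > > > > > >         System.out.println(count + " files, " + bytes + " bytes");
>> > > > > > >       }
>> > > > > > >     }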
>> > > > > > >
>> > > > > > > On Wed, Feb 25, 2015 at 9:27 AM, Nishanth S <
>> > > nishanth.2...@gmail.com
>> > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Do you have replication turned on in hbase, and if so, is
>> > > > > > > > your slave consuming the replicated data?
>> > > > > > > >
>> > > > > > > > -Nishanth
>> > > > > > > >
>> > > > > > > > On Wed, Feb 25, 2015 at 10:19 AM, Madeleine Piffaretti <
>> > > > > > > > mpiffare...@powerspace.com> wrote:
>> > > > > > > >
>> > > > > > > > > Hi all,
>> > > > > > > > >
>> > > > > > > > > We are running out of space in our small hadoop cluster, so
>> > > > > > > > > I was checking disk usage on HDFS and I saw that most of the
>> > > > > > > > > space was occupied by the */hbase/oldWALs* folder.
>> > > > > > > > >
>> > > > > > > > > I have checked the "HBase Definitive Book" and other books
>> > > > > > > > > and web-sites, and I have also searched for my issue on
>> > > > > > > > > Google, but I didn't find a proper answer...
>> > > > > > > > >
>> > > > > > > > > So I would like to know what this folder is, what it is
>> > > > > > > > > used for, and also how I can free space from it without
>> > > > > > > > > breaking everything...
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > In case it's related to a specific version: our cluster is
>> > > > > > > > > running 5.3.0-1.cdh5.3.0.p0.30 from Cloudera (HBase 0.98.6).
>> > > > > > > > >
>> > > > > > > > > Thx for your help!
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
