@Madeleine, the folder gets cleaned regularly by a chore in the master. When a WAL file is no longer needed for recovery (that is, when HBase can guarantee that it has flushed all the data in the WAL file), it is moved to the oldWALs folder for archival. The file stays there until nothing else references it. There are currently two services which may keep files in the archive dir. The first is a TTL process, which ensures that WAL files are kept for at least 10 minutes. This is mainly for debugging. You can reduce this time by setting the hbase.master.logcleaner.ttl configuration property on the master. The default is 600000 (milliseconds). The other one is replication. If you have replication set up, the replication processes will hang on to the WAL files until they have been replicated. Even if you have disabled replication, the files are still referenced.
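As a rough sketch of the TTL setting mentioned above, this is what the entry could look like in hbase-site.xml on the master. The 300000 value (5 minutes) is only an illustration, not a recommendation, and as far as I know the master has to be restarted to pick up the change on 0.98.

```xml
<!-- hbase-site.xml on the master; 300000 (5 min) is only an illustrative value -->
<property>
  <name>hbase.master.logcleaner.ttl</name>
  <!-- value is in milliseconds; the default is 600000 (10 min) -->
  <value>300000</value>
</property>
```

Note that lowering the TTL only helps for files that nothing else is holding on to; it does not release files that replication still references.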
You can look at the master logs for the classes LogCleaner, TimeToLiveLogCleaner and ReplicationLogCleaner to see whether the master is actually running this chore and whether it is getting any exceptions.

@Liam, a disabled replication peer will still hold on to the WAL files, because it guarantees that no data is lost between disable and enable. You can run remove_peer, which makes the WAL files eligible for deletion. When you add the replication peer again, replication will start from the current state, whereas if you re-enable a peer, it will continue from where it left off.

On Thu, Feb 26, 2015 at 12:56 AM, Madeleine Piffaretti <mpiffare...@powerspace.com> wrote:

> Hi,
>
> The replication is not turned on HBase...
> Does this folder should be clean regularly? Because I have data from
> december 2014...
>
>
> 2015-02-26 1:40 GMT+01:00 Liam Slusser <lslus...@gmail.com>:
>
> > I'm having this same problem. I had replication enabled but have since
> > been disabled. However oldWALs still grows. There are so many files in
> > there that running "hadoop fs -ls /hbase/oldWALs" runs out of memory.
> >
> > On Wed, Feb 25, 2015 at 9:27 AM, Nishanth S <nishanth.2...@gmail.com>
> > wrote:
> >
> > > Do you have replication turned on in hbase and if so is your slave
> > > consuming the replicated data?.
> > >
> > > -Nishanth
> > >
> > > On Wed, Feb 25, 2015 at 10:19 AM, Madeleine Piffaretti <
> > > mpiffare...@powerspace.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > We are running out of space in our small hadoop cluster so I was
> > > > checking disk usage on HDFS and I saw that most of the space was
> > > > occupied by the */hbase/oldWALs* folder.
> > > >
> > > > I have checked in the "HBase Definitive Book" and others books,
> > > > web-site and I have also search my issue on google but I didn't
> > > > find a proper response...
> > > >
> > > > So I would like to know what does this folder, what is use for and
> > > > also how can I free space from this folder without breaking
> > > > everything...
> > > >
> > > >
> > > > If it's related to a specific version... our cluster is under
> > > > 5.3.0-1.cdh5.3.0.p0.30 from cloudera (hbase 0.98.6).
> > > >
> > > > Thx for your help!
> > > >