(Apologies for hitting send too soon; here is more vendor-specific info.) Liam, FYI: CM 5.2.z running CDH 5.3.z isn't a supported configuration [1] and might be the source of your problem.
[1]: http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/pcm_cdh_cm.html

On Wed, Mar 11, 2015 at 4:02 PM, Sean Busbey <bus...@cloudera.com> wrote:
> Thanks for the follow-up, Liam! I'll add this as a bug against CDH.
>
> On Wed, Mar 11, 2015 at 3:58 PM, Liam Slusser <lslus...@gmail.com> wrote:
>> I just wanted to update this thread: after more investigation I've
>> figured out why my oldWALs folder wasn't being cleaned up. I had a look
>> at the code of ReplicationLogCleaner, and it makes this call:
>>
>>     if (!config.getBoolean(HConstants.REPLICATION_ENABLE_KEY,
>>         HConstants.REPLICATION_ENABLE_DEFAULT)) {
>>       LOG.warn("Not configured - allowing all wals to be deleted");
>>       return;
>>     }
>>
>> I searched through my logs and was never able to find that line of text.
>> So I wrote a quick program to run that piece of code, and sure enough it
>> came back as true. getBoolean returns the value if it's been defined and,
>> if not, returns the default. And after reading HBASE-3489, replication is
>> enabled by default these days, which I also verified by looking at
>> HConstants.REPLICATION_ENABLE_DEFAULT. I run Cloudera CDH 5.3, and the
>> user interface, even with HBase replication set to false, wasn't putting
>> "hbase.replication" = false in the configuration file. I manually set
>> hbase.replication to false in the advanced configuration for
>> hbase-site.xml and restarted HBase, and sure enough it deleted all the
>> logs!
>>
>> So this is probably a bug in CDH, at least in the version that I ran. I'm
>> running Cloudera Manager 5.2.1 with CDH5.3.0-1.cdh5.3.0.p0.30.
>>
>> thanks,
>> liam
>>
>> On Wed, Mar 4, 2015 at 5:18 PM, Liam Slusser <lslus...@gmail.com> wrote:
>>> So after removing all the replication peers, HBase still doesn't want
>>> to clean up the oldWALs folder. In the master logs I don't see any
>>> errors from ReplicationLogCleaner or LogCleaner. I have my logging set
>>> to INFO, so I'd think I would see something.
>>>
>>> Is there any way to run the ReplicationLogCleaner manually and see the
>>> output? Can I write something that calls the right API functions?
>>>
>>> thanks,
>>> liam
>>>
>>> On Fri, Feb 27, 2015 at 1:50 PM, Nick Dimiduk <ndimi...@gmail.com> wrote:
>>>> I would let the cleaner chore handle the cleanup for you. You don't
>>>> know the state of all entries in that folder. To that extent, I'd
>>>> avoid making any direct changes to the content of HBase's working
>>>> directory, especially while HBase is running...
>>>>
>>>> On Fri, Feb 27, 2015 at 1:29 PM, Liam Slusser <lslus...@gmail.com> wrote:
>>>>> Once I disable/remove the replication, can I just blow away the
>>>>> oldWALs folder safely?
>>>>>
>>>>> On Fri, Feb 27, 2015 at 3:10 AM, Madeleine Piffaretti <
>>>>> mpiffare...@powerspace.com> wrote:
>>>>>> Thanks a lot!
>>>>>>
>>>>>> Indeed, we had replication enabled in the past because we used the
>>>>>> hbase-indexer from NgData (used to replicate data from HBase to
>>>>>> Solr). Replication had been disabled for a long time, but the
>>>>>> hbase-indexer peer was still activated, and so, as you mentioned,
>>>>>> the data was kept to guarantee no data loss between disable and
>>>>>> enable.
>>>>>>
>>>>>> I have removed the peer and emptied the oldWALs folder.
>>>>>>
>>>>>> 2015-02-27 1:42 GMT+01:00 Liam Slusser <lslus...@gmail.com>:
>>>>>>> Huge thanks, Enis, that was the information I was looking for.
>>>>>>>
>>>>>>> Cheers!
>>>>>>> liam
>>>>>>>
>>>>>>> On Thu, Feb 26, 2015 at 3:48 PM, Enis Söztutar <enis....@gmail.com>
>>>>>>> wrote:
>>>>>>>> @Madeleine,
>>>>>>>>
>>>>>>>> The folder gets cleaned regularly by a chore in the master.
>>>>>>>> When a WAL file is no longer needed for recovery purposes (when
>>>>>>>> HBase can guarantee it has flushed all the data in the WAL file),
>>>>>>>> it is moved to the oldWALs folder for archival. The log stays
>>>>>>>> there until all other references to the WAL file are finished.
>>>>>>>> There are currently two services which may keep the files in the
>>>>>>>> archive dir. The first is a TTL process, which ensures that the
>>>>>>>> WAL files are kept for at least 10 minutes. This is mainly for
>>>>>>>> debugging. You can reduce this time by setting the
>>>>>>>> hbase.master.logcleaner.ttl configuration property in the master.
>>>>>>>> It is 600000 by default. The other one is replication. If you have
>>>>>>>> replication set up, the replication processes will hang on to the
>>>>>>>> WAL files until they are replicated. Even if you disabled the
>>>>>>>> replication, the files are still referenced.
>>>>>>>>
>>>>>>>> You can look at the master logs from the classes (LogCleaner,
>>>>>>>> TimeToLiveLogCleaner, ReplicationLogCleaner) to see whether the
>>>>>>>> master is actually running this chore and whether it is getting
>>>>>>>> any exceptions.
>>>>>>>>
>>>>>>>> @Liam,
>>>>>>>> Disabled replication will still hold on to the WAL files because
>>>>>>>> it has a guarantee to not lose data between disable and enable.
>>>>>>>> You can remove_peer, which frees up the WAL files to be eligible
>>>>>>>> for deletion. When you re-add the replication peer again,
>>>>>>>> replication will start from the current state, whereas if you
>>>>>>>> re-enable a peer, it will continue from where it left off.
>>>>>>>>
>>>>>>>> On Thu, Feb 26, 2015 at 12:56 AM, Madeleine Piffaretti <
>>>>>>>> mpiffare...@powerspace.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Replication is not turned on in HBase...
>>>>>>>>> Should this folder be cleaned regularly? Because I have data
>>>>>>>>> from December 2014...
>>>>>>>>>
>>>>>>>>> 2015-02-26 1:40 GMT+01:00 Liam Slusser <lslus...@gmail.com>:
>>>>>>>>>> I'm having this same problem. I had replication enabled, but it
>>>>>>>>>> has since been disabled. However, oldWALs still grows. There
>>>>>>>>>> are so many files in there that running
>>>>>>>>>> "hadoop fs -ls /hbase/oldWALs" runs out of memory.
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 25, 2015 at 9:27 AM, Nishanth S <
>>>>>>>>>> nishanth.2...@gmail.com> wrote:
>>>>>>>>>>> Do you have replication turned on in HBase, and if so, is your
>>>>>>>>>>> slave consuming the replicated data?
>>>>>>>>>>>
>>>>>>>>>>> -Nishanth
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 25, 2015 at 10:19 AM, Madeleine Piffaretti <
>>>>>>>>>>> mpiffare...@powerspace.com> wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> We are running out of space in our small Hadoop cluster, so I
>>>>>>>>>>>> was checking disk usage on HDFS and I saw that most of the
>>>>>>>>>>>> space was occupied by the */hbase/oldWALs* folder.
>>>>>>>>>>>>
>>>>>>>>>>>> I have checked in the "HBase Definitive Book" and other books
>>>>>>>>>>>> and web sites, and I have also searched for my issue on
>>>>>>>>>>>> Google, but I didn't find a proper response...
>>>>>>>>>>>>
>>>>>>>>>>>> So I would like to know what this folder is, what it is used
>>>>>>>>>>>> for, and also how I can free space from this folder without
>>>>>>>>>>>> breaking everything...
>>>>>>>>>>>>
>>>>>>>>>>>> If it's related to a specific version... our cluster is on
>>>>>>>>>>>> 5.3.0-1.cdh5.3.0.p0.30 from Cloudera (HBase 0.98.6).
>>>>>>>>>>>>
>>>>>>>>>>>> Thx for your help!
>
> --
> Sean

-- 
Sean
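For anyone landing on this thread with the same symptom, the workaround Liam describes upthread amounts to stating the property explicitly, so the cleaner's getBoolean call no longer falls back to the enabled-by-default value. The property name comes straight from the thread; where exactly you add it (e.g. the hbase-site.xml advanced configuration snippet / safety valve in Cloudera Manager) depends on your setup:

```xml
<!-- Explicitly disable replication so ReplicationLogCleaner will allow
     eligible files in /hbase/oldWALs to be deleted. Only do this if you
     genuinely have no replication peers. -->
<property>
  <name>hbase.replication</name>
  <value>false</value>
</property>
```

If you do have peers you no longer need, removing them (remove_peer, as Enis suggests) rather than merely disabling them is what releases the WAL references.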