Best way to compact a region after a move?

2012-12-30 Thread Jean-Marc Spaggiari
Hi, when I manually balance the regions on my cluster, I want to make sure they are local, so I want to major_compact them each time I move them. The balanceCluster method returns a list of regions to move, which means they have not been moved yet, so I can't compact them there.

CleanerChore exception

2012-12-30 Thread Jean-Marc Spaggiari
Hi, I get an IOException "/hbase/.archive/table_name is non empty" every minute in my logs. There are 30 directories under this directory. The main directory is from yesterday, but all subdirectories are from December 10th, all at the same time. What is this .archive directory used

Re: Best way to compact a region after a move?

2012-12-30 Thread Ted Yu
balanceCluster() executes on the master. Compaction is a region server activity. So they don't pair naturally. I answered the first part of the question in the thread titled 'How to know it's time for a major compaction?': In RegionObserver, we already have the following hook: /** * Called after

Re: CleanerChore exception

2012-12-30 Thread Ted Yu
Looks like you're using 0.94.3. The archiver is a backport of HBASE-5547, "Don't delete HFiles in backup mode". Can you provide more of the log where the IOE was reported, using pastebin? Thanks. On Sun, Dec 30, 2012 at 9:08 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi, I have a

Re: Best way to compact a region after a move?

2012-12-30 Thread Jean-Marc Spaggiari
Hi Ted, Thanks for your reply. I looked at the RegionObserver and I will dig in that direction. I think I found what I need in it. How can I attach it to HBase? Should I do that on all the servers? On the master only, and it will replicate? Should I attach it to each region? Or directly to the table?

Re: CleanerChore exception

2012-12-30 Thread Jean-Marc Spaggiari
I was going to move to 0.94.4 today ;) And yes I'm using 0.94.3. I might wait a bit in case some testing is required with my version. Is this what you are looking for? http://pastebin.com/N8Q0FMba I will keep the files for now since it seems it's not causing any major issue. That will allow some

Re: Best way to compact a region after a move?

2012-12-30 Thread Ted Yu
You can find how to dynamically load a coprocessor in hbase-server/src/main/ruby/shell/commands/alter.rb. There are ample test cases which show you how to use RegionObserver, e.g. src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java. Yes, you can attach your coprocessor

Re: CleanerChore exception

2012-12-30 Thread Ted Yu
The exception came from this line: if (file.isDir()) checkAndDeleteDirectory(file.getPath()); Looking at checkAndDeleteDirectory(), it recursively deletes files and directories under the specified path. Does /hbase/.archive/entry_duplicate only contain empty directories underneath it ?

Re: Best way to compact a region after a move?

2012-12-30 Thread Jean-Marc Spaggiari
Thanks for the hints. I will look there too. Is there a way to attach it to ALL the tables and not specifically to some tables? Or should I attach it to the tables one by one? 2012/12/30, Ted Yu yuzhih...@gmail.com: You can find how to dynamically load coprocessor in

Re: Best way to compact a region after a move?

2012-12-30 Thread Ted Yu
I guess you would want custom compaction only on user tables. Take a look at the following config param in http://hbase.apache.org/book.html: hbase.coprocessor.region.classes Cheers On Sun, Dec 30, 2012 at 10:25 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Thanks for the hints. I will

Re: CleanerChore exception

2012-12-30 Thread Jean-Marc Spaggiari
Regarding the logcleaner settings, I have not changed anything. It's what came with the initial install. So I don't have anything set up for this plugin in my configuration files. For the files on the FS, here is what I have: hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -ls

Re: Best way to compact a region after a move?

2012-12-30 Thread Jean-Marc Spaggiari
Exactly what I was looking for ;) Thanks a lot! JM 2012/12/30, Ted Yu yuzhih...@gmail.com: I guess you would want custom compaction only on user tables. Take a look at the following config param in http://hbase.apache.org/book.html: hbase.coprocessor.region.classes Cheers On Sun, Dec 30,

Re: CleanerChore exception

2012-12-30 Thread Jean-Marc Spaggiari
So, looking deeper, I found a few things. First, why is checkAndDeleteDirectory not simply calling FSUtils.delete(fs, toCheck, true)? I guess it's doing the same thing? Also, FSUtils.listStatus(fs, toCheck, null); will return null if there is no status. Not just an empty array. And it's returning
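
A note on the null-versus-empty point above: code that iterates over a directory listing has to treat a null result as "nothing to list" instead of dereferencing it. The sketch below illustrates this with the JDK's java.io.File rather than Hadoop's FileSystem or FSUtils, so it is an analogy only. The class and method names (NullSafeListing, listOrEmpty) are made up for illustration and are not HBase code. Also note that java.io.File returns null for a missing path or a non-directory and an empty array for an empty directory, which may not match exactly when FSUtils.listStatus returns null.

```java
import java.io.File;

final class NullSafeListing {

    /**
     * Returns the children of dir, or an empty array when the listing is null.
     * File.listFiles() returns null if dir does not exist, is not a directory,
     * or an I/O error occurs.
     */
    static File[] listOrEmpty(File dir) {
        File[] children = dir.listFiles();
        return children == null ? new File[0] : children;
    }

    public static void main(String[] args) {
        // Hypothetical scratch location under the JVM's temp directory.
        File dir = new File(System.getProperty("java.io.tmpdir"),
                "listing-demo-" + System.nanoTime());
        File missing = new File(dir, "does-not-exist");

        // Safe to loop over even though the path is missing.
        for (File child : listOrEmpty(missing)) {
            System.out.println(child);
        }
        System.out.println("children of missing path: " + listOrEmpty(missing).length);
    }
}
```

With a helper of this kind, the caller's loop does not need a separate null check before iterating.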

Re: CleanerChore exception

2012-12-30 Thread Ted Yu
Thanks for the digging. This concurs with my suspicion in the beginning. I am copying Jesse who wrote the code. He should have more insight on this. After his confirmation, you can log a JIRA. Cheers On Sun, Dec 30, 2012 at 10:59 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: So.

Re: CleanerChore exception

2012-12-30 Thread Ted Yu
Looking at this line in checkAndDeleteDirectory(): return canDeleteThis ? fs.delete(toCheck, false) : false; If fs.delete() returns false, meaning the deletion was unsuccessful, the parent directory tree wouldn't be deleted. I think this is inconsistent with the javadoc for

Re: CleanerChore exception

2012-12-30 Thread Jean-Marc Spaggiari
Thanks for the confirmation. Also, it seems that there is no test class related to checkAndDeleteDirectory. It might be good to add that too. I have extracted 0.94.3, 0.94.4RC0 and the trunk, and they are all identical for this method. I will try to do some modifications and see the results... So

Re: CleanerChore exception

2012-12-30 Thread Jean-Marc Spaggiari
The Javadoc says: @return <tt>true</tt> if the directory was deleted, <tt>false</tt> otherwise. So I think the line return canDeleteThis ? fs.delete(toCheck, false) : false; is still correct. It's returning false if the directory has not been deleted. There is no exception here. If the TTL for a

Re: CleanerChore exception

2012-12-30 Thread Ted
Thanks for your digging. Minor optimization would be to issue delete() on the parent directory so that there are fewer requests to namenode. Cheers On Dec 30, 2012, at 2:15 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: I did the change, pushed it and it cleaned my directories

Re: CleanerChore exception

2012-12-30 Thread Jean-Marc Spaggiari
I'm not sure I'm getting that. It's recursive. So when you are on the parent directory, you don't know yet if the child directory is empty or not. So you can't call delete() yet. If you call delete() giving true for recursive, then you might delete some files that were just created, which we
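
The argument above, that a recursive delete on the parent could remove files created after the check, is why the directory itself is removed with a non-recursive delete once its children have been handled. Here is a minimal sketch of that shape, using java.io.File as a local stand-in for Hadoop's FileSystem. File.delete() on a directory only succeeds when the directory is empty, which is similar in spirit to fs.delete(path, false), except that it returns false where HDFS appears to raise the "is non empty" IOException reported in this thread. The class name, the Predicate-based policy and the exact control flow are assumptions for illustration, not the actual CleanerChore implementation.

```java
import java.io.File;
import java.util.function.Predicate;

final class CheckAndDeleteSketch {

    /**
     * Deletes every deletable file under toCheck, then removes toCheck itself
     * only if it ended up empty. Returns true if toCheck was deleted.
     * isDeletable is a hypothetical stand-in for the cleaner's own checks.
     */
    static boolean checkAndDeleteDirectory(File toCheck, Predicate<File> isDeletable) {
        File[] children = toCheck.listFiles();
        if (children == null) {
            // Missing path, not a directory, or an I/O error: nothing was deleted.
            return false;
        }
        boolean canDeleteThis = true;
        for (File child : children) {
            boolean removed = child.isDirectory()
                    ? checkAndDeleteDirectory(child, isDeletable)
                    : isDeletable.test(child) && child.delete();
            if (!removed) {
                canDeleteThis = false; // something survives, so the parent must stay
            }
        }
        // Non-recursive: returns false if a file appeared after the loop above.
        return canDeleteThis && toCheck.delete();
    }

    public static void main(String[] args) {
        // Hypothetical scratch tree of empty directories under the temp directory.
        File root = new File(System.getProperty("java.io.tmpdir"),
                "archive-demo-" + System.nanoTime());
        new File(root, "table/region/family").mkdirs();
        System.out.println("deleted: " + checkAndDeleteDirectory(root, f -> true));
        System.out.println("still exists: " + root.exists());
    }
}
```

Because each level is deleted non-recursively, a file that shows up in a directory between the scan and the delete makes that delete fail instead of being silently removed with its parent.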

Re: CleanerChore exception

2012-12-30 Thread Jesse Yates
Hey, So the point of all the delete code in the cleaner is to try to delete each of the files in the directory and then delete the directory, assuming it's empty. It shouldn't leak the IOException if the directory is found to be empty and then gets a file added. This is really odd though, as