Everything is as you said (10.0.100.50 is also running the master), except that the daughter was still holding some reference...
http://pastebin.com/m57e2a14 here is the master log
http://pastebin.com/m3f4cc091 here is the region server from dev40 (same host as the master)
http://pastebin.com/d6a41d845 and this is the second region server, dev41 (10.0.100.51)

The NN log of course confirms what the master did: delete of the region dir @ 15:33:35,902.
The surprising fact is that when I restarted HBase everything started to work again.
Before the restart a scan on that table gave me 59 results, and after it a few thousand.
Maybe it's https://issues.apache.org/jira/browse/HBASE-1894 or similar?

Thanks,
Michal

2010/2/5 Stack <[email protected]>

> 2010/2/4 Michał Podsiadłowski <[email protected]>:
> >> $ ./bin/hadoop fs -get
> >> /hbase/filmContributors/1670715971/content/3783592739034234831 .
> >>
> > I did this. I checked from the web UI whether this file is there, and the
> > whole dir was missing.
> > No wonder, because it was deleted after the split by the node that was
> > hosting the parent region before the split.
> >
> > 2010-02-03 15:33:35,902 INFO org.apache.hadoop.hdfs.server.
> > namenode.FSNamesystem.audit: ugi=mpodsiadlowski,devel, [some privileges]
> > ip=/10.0.100.50 cmd=delete src=/hbase/filmContributors/1670715971
> > dst=null perm=null
> >
> > So the order of events was something like this:
> >
> > 15:32:35 split of region hosted by 10.0.100.50
> > 15:32:37 one of the new regions assigned to 10.0.100.51
> > 15:33:35 10.0.100.50 removes the whole dir with the file that is causing
> > problems
>
> This doesn't seem right.  It's the master that does the delete of the
> parent regions.  It does this after it determines that the daughters
> no longer have references to the parent region.  Running in DEBUG and
> pasting the master log would help debugging this (looking in the NN log to
> see the main events regarding files and dirs, as you did above, is another
> good means for figuring out stuff).
>
> St.Ack
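
For anyone who wants to repeat the NN log check on a larger log, here is a rough Python sketch that pulls the delete events for a given path out of audit lines shaped like the one quoted above. The function names (`parse_audit_line`, `events_under`) are made up for illustration, and the regexes only assume the `key=value` layout visible in that one line, so treat it as a starting point rather than a proper audit-log parser.

```python
import re

# Timestamp at the start of a log4j-style line, e.g. "2010-02-03 15:33:35,902".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")
# The key=value fields of interest; assumes the values contain no whitespace.
FIELD = re.compile(r"\b(ip|cmd|src|dst)=(\S+)")


def parse_audit_line(line):
    """Return a dict with time/ip/cmd/src/dst, or None if the line does not fit."""
    ts = TS.match(line)
    if ts is None:
        return None
    fields = dict(FIELD.findall(line))
    if "cmd" not in fields:
        return None
    fields["time"] = ts.group(1)
    return fields


def events_under(lines, path_prefix, cmds=("delete",)):
    """Yield parsed events whose cmd is in cmds and whose src starts with path_prefix."""
    for line in lines:
        event = parse_audit_line(line)
        if event is None:
            continue
        if event["cmd"] in cmds and event.get("src", "").startswith(path_prefix):
            yield event


if __name__ == "__main__":
    # The audit line from this thread, joined back onto a single line.
    sample = (
        "2010-02-03 15:33:35,902 INFO "
        "org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: "
        "ugi=mpodsiadlowski,devel, [some privileges] ip=/10.0.100.50 "
        "cmd=delete src=/hbase/filmContributors/1670715971 dst=null perm=null"
    )
    for ev in events_under([sample], "/hbase/filmContributors"):
        print(ev["time"], ev["ip"], ev["cmd"], ev["src"])
```

Feeding it the lines of the NN log and the table dir as `path_prefix` lists which host issued each delete and when, which is the same comparison against the split and assignment times done by hand above.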
