Somehow my HBase instance has gotten into an unstable state and will
not start. Previously, the log files looked like this:
2008-02-04 00:03:53,841 INFO org.....HMaster: HMaster.rootScanner
scanning meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020}
2008-02-04 00:03:53,853 INFO org.....HMaster: HMaster.rootScanner scan
of meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020} complete
2008-02-04 00:04:03,784 INFO org.....HMaster: HMaster.metaScanner
scanning meta region {regionname: .META.,,1, startKey: <>, server:
127.0.0.2:60020}
2008-02-04 00:04:03,809 INFO org.....HMaster: HMaster.metaScanner scan
of meta region {regionname: .META.,,1, startKey: <>, server:
127.0.0.2:60020} complete
2008-02-04 00:04:03,810 INFO org.....HMaster: all meta regions scanned
2008-02-04 00:04:53,847 INFO org.....HMaster: HMaster.rootScanner
scanning meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020}
2008-02-04 00:04:53,858 INFO org.....HMaster: HMaster.rootScanner scan
of meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020} complete
2008-02-04 00:05:03,787 INFO org.....HMaster: HMaster.metaScanner
scanning meta region {regionname: .META.,,1, startKey: <>, server:
127.0.0.2:60020}
2008-02-04 00:05:03,817 INFO org.....HMaster: HMaster.metaScanner scan
of meta region {regionname: .META.,,1, startKey: <>, server:
127.0.0.2:60020} complete
2008-02-04 00:05:03,817 INFO org.....HMaster: all meta regions scanned
Now they look like this:
2008-02-04 22:25:15,516 INFO org.....HMaster: HMaster.rootScanner
scanning meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020}
2008-02-04 22:25:15,657 INFO org.....HMaster: HMaster.rootScanner scan
of meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020} complete
2008-02-04 22:25:15,657 INFO org.....HMaster: HMaster.rootScanner
scanning meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020}
2008-02-04 22:25:15,688 INFO org.....HMaster: HMaster.rootScanner scan
of meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020} complete
2008-02-04 22:26:15,663 INFO org.....HMaster: HMaster.rootScanner
scanning meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020}
2008-02-04 22:26:15,717 INFO org.....HMaster: HMaster.rootScanner scan
of meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020} complete
2008-02-04 22:27:59,213 INFO org.....HMaster: HMaster.rootScanner
scanning meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020}
2008-02-04 22:27:59,382 INFO org.....HMaster: HMaster.rootScanner scan
of meta region {regionname: -ROOT-,,0, startKey: <>, server:
127.0.0.2:60020} complete
There is no "all meta regions scanned" message, and no
HMaster.metaScanner entries appear at all. Clients, including the web
monitor at port 60010 and the hbase shell, are unable to connect.
Based on some brief debugging, it seems that the master thinks it is
unable to "scan root" (HMaster.scanRoot never returns true). No amount
of restarting fixes this.
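
For anyone who wants to check their own master log for the same symptom, here is a minimal sketch in Python. It only counts the marker strings that appear in the excerpts above. The function names and the "stuck" heuristic are my own assumptions for illustration, not part of HBase, and the check says nothing about the underlying cause.

```python
# Marker strings copied from the log excerpts in this message.
ROOT_SCANNER = "HMaster.rootScanner"
META_SCANNER = "HMaster.metaScanner"
ALL_SCANNED = "all meta regions scanned"


def summarize_master_log(text):
    """Count scanner-related markers in HMaster log text.

    Whitespace is collapsed first so that a marker split across a
    wrapped line (as in a pasted email) is still found.
    """
    flat = " ".join(text.split())
    return {
        "root_scanner_lines": flat.count(ROOT_SCANNER),
        "meta_scanner_lines": flat.count(META_SCANNER),
        "all_scanned": flat.count(ALL_SCANNED),
    }


def looks_stuck(summary):
    """Heuristic (an assumption): root scans are logged, but the
    "all meta regions scanned" message never shows up."""
    return summary["root_scanner_lines"] > 0 and summary["all_scanned"] == 0
```

To use it, read the master log into a string and pass it to summarize_master_log, then hand the result to looks_stuck. The location of the log file depends on the installation, so it is deliberately left out here.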
I was able to restore from a complete backup of my Hadoop folder, so I
do not need any help. If anyone would like my logs and data to try to
reproduce this potential bug, they are welcome to them. The logs are
183K zipped, and the data is 639M zipped.
- Marc