Thanks for the response, Jim. I've tried this and I'm getting the same behavior. Anything else I should be looking for?
I also found empty files in the compaction directory which I deleted - hope that was ok. On Wed, May 21, 2008 at 9:29 AM, Jim Kellerman <[EMAIL PROTECTED]> wrote: > What is happening is that the region server is trying to open the meta, and > failing because there is an empty file somewhere in the region or in the > recovery log. I would advise the following: > > 1. shut down HBase (as cleanly as possible) > 2. find and delete any zero length files > 3. there may be empty, but not zero length, MapFiles. MapFiles are > stored in a directory that looks like: > <hbase-root>/<tablename>/<a number>/<columnname>/mapfiles/<a long > number>/ > > In this directory there are two files: 'data' and 'index'. An empty > MapFile (one that has been opened and closed with no data written to > it) 'data' will have size 110 (bytes) and 'info' will have size 137. > > If you find one of these, you should delete the directory that contains > them (in the example above <a long number>). > > > --- > Jim Kellerman, Senior Engineer; Powerset > > > > -----Original Message----- > > From: Daniel Leffel [mailto:[EMAIL PROTECTED] > > Sent: Wednesday, May 21, 2008 8:13 AM > > To: [email protected] > > Subject: Re: Problem caused by HBASE-617 - Blocks missing > > from hlog.dat > > > > After letting the hbase attempt to assign regions all night > > long, I awoke to an unassigned meta region. Here is the master log: > > > > Desc: {name: .META., families: {info:={name: info, max versions: 1, > > compression: NONE, in memory: false, max length: 2147483647, > > bloom filter: > > none}}} > > 2008-05-21 11:10:29,560 DEBUG > > org.apache.hadoop.hbase.HMaster: Main processing loop: > > ProcessRegionClose of .META.,,1, true > > 2008-05-21 11:10:29,560 INFO org.apache.hadoop.hbase.HMaster: > > region closed: > > .META.,,1 > > 2008-05-21 11:10:29,560 INFO org.apache.hadoop.hbase.HMaster: reassign > > region: .META.,,1 > > 2008-05-21 11:10:29,729 INFO org.apache.hadoop.hbase.HMaster: > > assigning region .META.,,1 to server 10.252.242.159:60020 > > 2008-05-21 11:10:32,740 DEBUG > > org.apache.hadoop.hbase.HMaster: Received > > MSG_REPORT_PROCESS_OPEN : .META.,,1 from 10.252.242.159:60020 > > 2008-05-21 11:10:32,740 DEBUG > > org.apache.hadoop.hbase.HMaster: Received MSG_REPORT_CLOSE : > > .META.,,1 from 10.252.242.159:60020 > > 2008-05-21 11:10:32,740 INFO org.apache.hadoop.hbase.HMaster: > > 10.252.242.159:60020 no longer serving regionname: .META.,,1, > > startKey: <>, > > endKey: <>, encodedName: 1028785192, tableDesc: {name: > > .META., families: > > {info:={name: info, max versions: 1, compression: NONE, in > > memory: false, max length: 2147483647, bloom filter: none}}} > > 2008-05-21 11:10:32,740 DEBUG > > org.apache.hadoop.hbase.HMaster: Main processing loop: > > ProcessRegionClose of .META.,,1, true > > 2008-05-21 11:10:32,741 INFO org.apache.hadoop.hbase.HMaster: > > region closed: > > .META.,,1 > > 2008-05-21 11:10:32,741 INFO org.apache.hadoop.hbase.HMaster: reassign > > region: .META.,,1 > > 2008-05-21 11:10:34,585 INFO org.apache.hadoop.hbase.HMaster: > > assigning region .META.,,1 to server 10.254.30.79:60020 > > 2008-05-21 11:10:37,595 DEBUG > > org.apache.hadoop.hbase.HMaster: Received > > MSG_REPORT_PROCESS_OPEN : .META.,,1 from 10.254.30.79:60020 > > 2008-05-21 11:10:37,596 DEBUG > > org.apache.hadoop.hbase.HMaster: Received MSG_REPORT_CLOSE : > > .META.,,1 from 10.254.30.79:60020 > > 2008-05-21 11:10:37,596 INFO org.apache.hadoop.hbase.HMaster: > > 10.254.30.79:60020 no longer serving regionname: .META.,,1, > > startKey: <>, > > endKey: <>, encodedName: 1028785192, tableDesc: {name: > > .META., families: > > {info:={name: info, max versions: 1, compression: NONE, in > > memory: false, max length: 2147483647, bloom filter: none}}} > > 2008-05-21 11:10:37,596 DEBUG > > org.apache.hadoop.hbase.HMaster: Main processing loop: > > ProcessRegionClose of .META.,,1, true > > > > > > and here is the regin server side: > > 2008-05-21 11:12:38,027 INFO org.apache.hadoop.hbase.HRegionServer: > > MSG_REGION_OPEN : .META.,,1 > > 2008-05-21 11:12:38,027 DEBUG > > org.apache.hadoop.hbase.HRegion: Opening region .META.,,1/1028785192 > > 2008-05-21 11:12:38,035 DEBUG org.apache.hadoop.hbase.HStore: > > loaded > > /hbase/.META./1028785192/info/info/2509658022189995817, > > isReference=false > > 2008-05-21 11:12:38,036 DEBUG org.apache.hadoop.hbase.HStore: > > loaded > > /hbase/.META./1028785192/info/info/8183182393002383771, > > isReference=false > > 2008-05-21 11:12:38,049 DEBUG org.apache.hadoop.hbase.HStore: Loaded 2 > > file(s) in hstore 1028785192/info, max sequence id 917793587 > > 2008-05-21 11:12:38,116 ERROR > > org.apache.hadoop.hbase.HRegionServer: error opening region > > .META.,,1 java.io.EOFException > > at java.io.DataInputStream.readFully(DataInputStream.java:180) > > at java.io.DataInputStream.readFully(DataInputStream.java:152) > > at > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1434) > > at > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.j > > ava:1411) > > at > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.j > > ava:1400) > > at > > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.j > > ava:1395) > > at > > org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:263) > > at > > org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242) > > at > > org.apache.hadoop.hbase.HStoreFile$HbaseMapFile$HbaseReader.<i > nit>(HStoreFile.java:554) > > at > > org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.< > init>(HStoreFile.java:609) > > at > > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:382) > > at org.apache.hadoop.hbase.HStore.<init>(HStore.java:849) > > at org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:431) > > at > > org.apache.hadoop.hbase.HRegionServer.openRegion(HRegionServer > > .java:1258) > > at > > org.apache.hadoop.hbase.HRegionServer$Worker.run(HRegionServer > > .java:1204) > > at java.lang.Thread.run(Thread.java:619) > > > > > > > > On Tue, May 20, 2008 at 4:19 PM, stack <[EMAIL PROTECTED]> wrote: > > > > > Daniel Leffel wrote: > > > > > >> After experiencing a region server that would not exit > > (HBASE-617), I > > >> tried to bring back up hbase (after first having shut down and > > >> bringing back up DFS). > > >> > > >> There are around 370 regions. The first 250 were assigned > > to region > > >> servers within 5 minutes of startup. The rest of the > > regions took the > > >> better part of the day to become assigned to a region > > server. A quick > > >> inspection of the regionserver logs were showing messages like the > > >> following: > > >> > > >> 2008-05-20 18:33:46,964 DEBUG org.apache.hadoop.hbase.HMaster: > > >> Received MSG_REPORT_PROCESS_OPEN : > > categories,2864153,1211005494348 > > >> from 10.254.26.31:60020 > > >> > > >> > > > > > > These messages are sent over to the master by the regionserver as a > > > kind of ping saying "I'm still alive and working on > > whatever it was you gave me". > > > > > > Can you tell what was happening by looking in regionserver logs? > > > > > > Was it that all regions had been given to a single > > regionserver and it > > > was busy replaying edits before bringing the regions online > > (There is > > > a single worker thread per regionserver. If lots of edits > > to replay, > > > can take seconds to minutes to bring on a region). > > > > > > Did the regions come online gradually or all in a lump? > > > > > >> After waiting for all the regions to be assigned (and an > > absence of > > >> the above message appearing in the log), I started a MapReduce job > > >> that iterates over all regions. Immediately, the above mentioned > > >> region began to show up in the logs again with the above > > message and > > >> the job failed with an IOException because it couldn't > > locate blocks. > > >> > > >> I ran fsck on /hbase and sure enough, blocks are missing from the > > >> following file (although it reports a size of 0 as what's > > missing - I > > >> presume it just doesn't know): > > >> > > >> /hbase/log_10.254.30.79_1211300015031_60020/hlog.dat.000 > > >> > > >> > > > The above looks like the innocuous messages described in > > > https://issues.apache.org/jira/browse/HBASE-509. > > > > > > St.Ack > > > > > > > > > > No virus found in this incoming message. > > Checked by AVG. > > Version: 8.0.100 / Virus Database: 269.23.21/1456 - Release > > Date: 5/20/2008 6:45 AM > > > No virus found in this outgoing message. > Checked by AVG. > Version: 8.0.100 / Virus Database: 269.23.21/1456 - Release Date: 5/20/2008 > 6:45 AM >
