[hbase] Stuck replaying the edits of crashed machine
----------------------------------------------------

                 Key: HADOOP-2613
                 URL: https://issues.apache.org/jira/browse/HADOOP-2613
             Project: Hadoop
          Issue Type: Bug
            Reporter: stack


The Rapleaf master got stuck trying to replay the logs of the server that was
holding the .META. region. Here are the pertinent log excerpts:

{code}
2008-01-12 02:17:42,621 DEBUG org.apache.hadoop.hbase.HLog: Creating new log 
file writer for path /data/hbase1/hregion_1679905157/oldlogfile.log; map 
content {spider_pages,25_530417241,[EMAIL PROTECTED], 
spider_pages,6_74488371,1200029312876=org.apache.had
[EMAIL PROTECTED], spider_pages,2_561473281,[EMAIL PROTECTED], .META.,,[EMAIL 
PROTECTED], 
spider_pages,5_544278041,1199025825074=org.apache.hadoop.io.SequenceFile$RecordCompress
[EMAIL PROTECTED], spider_pages,49_567090611,[EMAIL PROTECTED], 
spider_pages,5_566039401,[EMAIL PROTECTED], 
spider_pages,59_360738971,1200073647952=org.apache.hadoop.io.SequenceFile$RecordCompressWr
[EMAIL PROTECTED], spider_pages,59_302628011,[EMAIL PROTECTED]
2008-01-12 02:17:44,124 DEBUG org.apache.hadoop.hbase.HLog: Applied 20000 edits
2008-01-12 02:17:49,076 DEBUG org.apache.hadoop.hbase.HLog: Applied 30000 edits
2008-01-12 02:17:49,078 DEBUG org.apache.hadoop.hbase.HLog: Applied 30003 total edits
2008-01-12 02:17:49,078 DEBUG org.apache.hadoop.hbase.HLog: Splitting 1 of 2: hdfs://tf1:7276/data/hbase1/log_XX.XX.XX.32_1200011947645_60020/hlog.dat.003
2008-01-12 02:17:52,574 DEBUG org.apache.hadoop.hbase.HLog: Applied 10000 edits
2008-01-12 02:17:59,822 WARN org.apache.hadoop.hbase.HMaster: Processing pending operations: ProcessServerShutdown of XX.XX.XX.32:60020
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:56)
        at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1763)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1663)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1709)
        at org.apache.hadoop.hbase.HLog.splitLog(HLog.java:168)
        at org.apache.hadoop.hbase.HMaster$ProcessServerShutdown.process(HMaster.java:2144)
        at org.apache.hadoop.hbase.HMaster.run(HMaster.java:1056)
2008-01-12 02:17:59,822 DEBUG org.apache.hadoop.hbase.HMaster: Main processing loop: ProcessServerShutdown of XX.XX.XX.32:60020
{code}

The master keeps doing the above over and over again: it retries
ProcessServerShutdown and trips over the same EOFException.

I suppose we could skip bad logs, or just shut down the master with a reason why.
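
A minimal sketch of the "skip bad logs" option, to make the idea concrete. It is
not HBase code: the class TolerantLogReplay, its Result type, and the
length-prefixed record format are all hypothetical stand-ins for HLog.splitLog
and the real SequenceFile layout. The point is only that an EOFException in the
middle of a record can be treated as the end of the usable log, keeping the
edits read so far, rather than as a failure that gets retried indefinitely.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not HBase code. Records are [4-byte length][payload],
// an assumed stand-in for the real SequenceFile layout.
final class TolerantLogReplay {

    /** Outcome of one replay: the edits recovered and whether the log ended early. */
    static final class Result {
        final List<byte[]> edits;
        final boolean truncated;

        Result(List<byte[]> edits, boolean truncated) {
            this.edits = edits;
            this.truncated = truncated;
        }
    }

    /** Reads records until a clean end of stream or the first incomplete record. */
    static Result replay(DataInputStream in) throws IOException {
        List<byte[]> edits = new ArrayList<>();
        while (true) {
            int first = in.read();
            if (first < 0) {
                return new Result(edits, false); // clean end: no partial record
            }
            try {
                byte[] rest = new byte[3];
                in.readFully(rest);
                int length = (first << 24) | ((rest[0] & 0xff) << 16)
                        | ((rest[1] & 0xff) << 8) | (rest[2] & 0xff);
                if (length < 0) {
                    return new Result(edits, true); // corrupt length: stop here
                }
                byte[] payload = new byte[length];
                in.readFully(payload);
                edits.add(payload);
            } catch (EOFException e) {
                // Incomplete trailing record: keep what was read and report it,
                // so the caller can move on instead of retrying forever.
                return new Result(edits, true);
            }
        }
    }

    /** Encodes payloads in the assumed record format (used only for the demo). */
    static byte[] encode(String... payloads) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String p : payloads) {
            byte[] b = p.getBytes(StandardCharsets.UTF_8);
            out.writeInt(b.length);
            out.write(b);
        }
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] full = encode("a", "bb", "ccc");
        // Drop the last two bytes to imitate a log cut short by a crash.
        byte[] cut = Arrays.copyOf(full, full.length - 2);
        Result r = replay(new DataInputStream(new ByteArrayInputStream(cut)));
        System.out.println("recovered " + r.edits.size()
                + " edits, truncated=" + r.truncated);
    }
}
```

The other option, shutting the master down with a reason, would amount to
checking the truncated flag and aborting with a clear message instead of
continuing.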

What is odd is that we seem to be well into the file (we have applied over
10000 edits) before we trip over the EOF.

I've asked for an fsck to see what that says.

