Andrew Purtell created HBASE-12830:
--------------------------------------

             Summary: Unreadable HLogs stuck in replication queues
                 Key: HBASE-12830
                 URL: https://issues.apache.org/jira/browse/HBASE-12830
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.98.9
            Reporter: Andrew Purtell


We had an incident where underlying infrastructure issues caused HDFS namenodes 
to go down at different times, leading to periods of HDFS service outage and 
recovery as the namenodes failed over. These clusters had replication enabled. 
Some RegionServers rolled logs during the partial HDFS availability. Namespace 
entries for these HLogs were created, but the first block-being-written was lost 
or never completed, leaving dead 0-length HLogs in the queues of active 
sources. When this happens the queue becomes stuck on the dead 0-length HLog, 
reporting EOFExceptions whenever the source wakes up and tries to (re)open the 
current file, like so:

{noformat}
2015-01-08 18:50:40,956 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 1-<hostname removed>,60020,1418764167084 Got: 
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:197)
        at java.io.DataInputStream.readFully(DataInputStream.java:169)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1845)
        at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1810)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1759)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1773)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:70)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:175)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.initReader(SequenceFileLogReader.java:184)
        at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:70)
        at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:128)
        at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:91)
        at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:79)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:68)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:506)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:309)
{noformat}

This exception is thrown where SequenceFile tries to read the 4-byte version 
header at position 0, which a 0-length file cannot supply.
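
For illustration, here is a minimal standalone sketch (not HBase code; the path 
and setup are made up) that reproduces the same failure mode by opening a 
0-length file with SequenceFile.Reader:

{code}
// Minimal standalone sketch (not HBase code) reproducing the failure mode:
// a 0-length file opened as a SequenceFile throws EOFException from
// DataInputStream.readFully() because there is no 4-byte version header.
import java.io.EOFException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class EmptySequenceFileRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.getLocal(conf);
    Path empty = new Path("/tmp/empty-hlog");   // hypothetical 0-length "HLog"
    fs.create(empty, true).close();             // create and close: 0 bytes written
    try {
      new SequenceFile.Reader(fs, empty, conf); // same code path as the WAL reader
    } catch (EOFException e) {
      System.out.println("Got EOFException, as in the stack trace above");
    }
  }
}
{code}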

In ReplicationSource#run we have an active loop:

{code}
    // Loop until we close down
    while (isActive()) {
    ...
    }
{code}

Within this loop we iterate over paths in the replication queue. For each path, 
we attempt to open it:

{code}
      // Open a reader on it
      if (!openReader(sleepMultiplier)) {
        // Reset the sleep multiplier, else it'd be reused for the next file
        sleepMultiplier = 1;
        continue;
      }
{code}

When we have a zero-length file, openReader returns TRUE but this.reader is left 
set to NULL (see the catch of the outer try block), and we fall through the 
conditional to:

{code}
      // If we got a null reader but didn't continue, then sleep and continue
      if (this.reader == null) {
        if (sleepForRetries("Unable to open a reader", sleepMultiplier)) {
          sleepMultiplier++;
        }
        continue;
      }
{code}
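
For context, the control flow described above is roughly shaped like this (a 
rough paraphrase of ReplicationSource#openReader, not the exact 0.98 source):

{code}
// Rough paraphrase (not the exact 0.98 source) of the control flow in
// ReplicationSource#openReader that produces this situation: the
// EOFException from the 0-length HLog is caught by the outer catch,
// this.reader stays null, and the method still returns true.
protected boolean openReader(int sleepMultiplier) {
  try {
    // per the stack trace: ReplicationHLogReaderManager.openReader(...)
    this.reader = repLogReader.openReader(this.currentPath);
  } catch (IOException ioe) {
    LOG.warn(peerClusterZnode + " Got: ", ioe);
    this.reader = null;   // left null; no special handling for 0-length files
  }
  return true;            // caller cannot tell this apart from a successful open
}
{code}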

We will keep trying to open the current file for a long time. The queue stays 
stuck until sleepMultiplier reaches maxRetriesMultiplier (conf 
"replication.source.maxretriesmultiplier", default 10), sleeping a base time of 
"replication.source.sleepforretries" (default 1000) ms multiplied by the current 
sleepMultiplier on each attempt; only then do we call 
ReplicationSource#processEndOfFile(). 

By default we will spin on opening the dead 0-length HLog for (1000*1) + 
(1000*2) + ... + (1000*10) = 55,000 milliseconds, about 55 seconds, before 
processing the file out of the queue. In HBASE-11964 we recommend increasing 
"replication.source.maxretriesmultiplier" to 300. With that configuration we 
will wait for (1000*1) + (1000*2) + ... + (1000*300) = 45,150,000 milliseconds, 
roughly 12.5 hours, before processing the file out of the queue. 
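
The arithmetic, as a small standalone check (plain Java, nothing HBase-specific):

{code}
// Total time spent retrying the dead 0-length HLog before processEndOfFile()
// is reached: the source sleeps base * multiplier ms on each failed open,
// with the multiplier climbing from 1 up to the configured maximum.
public class RetryWaitMath {
  static long totalRetryMillis(long sleepForRetries, int maxRetriesMultiplier) {
    long total = 0;
    for (int multiplier = 1; multiplier <= maxRetriesMultiplier; multiplier++) {
      total += sleepForRetries * multiplier;
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(totalRetryMillis(1000, 10));   // 55000 ms, ~55 seconds
    System.out.println(totalRetryMillis(1000, 300));  // 45150000 ms, ~12.5 hours
  }
}
{code}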

There should be some way to break out of this very long wait when a 0-length or 
corrupt file is blocking the queue. Perhaps we can check whether there are 
later files in the queue and, if so, not wait for the full maxRetriesMultiplier; 
a separate threshold could be set for that (a rough sketch of such a check is at 
the end of this description). Or: today we can't short-circuit this manually by 
removing the 0-length log files, but we could enable that. Currently a FNFE 
triggers an attempt to find where the log has gone. If it can't be found, we 
fall through to

{code}
          // TODO What happens the log is missing in both places?
{code}

and leave openReader() returning TRUE as above; the sleep multiplier is 
increased, and we continue to spin on the file with ever-increasing wait times.

(And having that TODO unimplemented is also not good.)
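
One possible shape for the short-circuit suggested above, as a rough fragment 
meant to sit inside the ReplicationSource#run loop (names here are 
approximations, not a patch):

{code}
// Rough sketch of the suggested short-circuit (not a patch; field and method
// names are approximate): if the current log is 0-length and newer logs are
// queued behind it, give up on it early rather than waiting out the full
// maxRetriesMultiplier backoff.
FileStatus stat = this.fs.getFileStatus(this.currentPath);
if (stat.getLen() == 0 && this.queue.size() > 0) {
  LOG.warn("Giving up on 0-length log " + this.currentPath
      + " that is blocking the replication queue");
  if (processEndOfFile()) {  // drop it from the queue and move to the next log
    continue;
  }
}
{code}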


