Jan Van Besien created HBASE-29774:
--------------------------------------

             Summary: incremental backup fails on empty WAL files
                 Key: HBASE-29774
                 URL: https://issues.apache.org/jira/browse/HBASE-29774
             Project: HBase
          Issue Type: Bug
          Components: backup&restore
    Affects Versions: 2.6.4
            Reporter: Jan Van Besien


Incremental backup fails during 
{{IncrementalTableBackupClient.convertWALsToHFiles}} when one of the WAL 
files being converted is empty (zero bytes). 

The map tasks fail like this:

{code:java}
2025-12-12 06:22:21,785 INFO [main] org.apache.hadoop.hbase.mapreduce.WALInputFormat: Opening hdfs://hdfsns/hbase/hbase/oldWALs/hbase-region-0.hbase-region.qa03-shared.svc.cluster.local%2C16020%2C1764740248035.hbase-region-0.hbase-region.qa03-shared.svc.cluster.local%2C16020%2C1764740248035.regiongroup-0.1764754214409 for hdfs://hdfsns/hbase/hbase/oldWALs/hbase-region-0.hbase-region.qa03-shared.svc.cluster.local%2C16020%2C1764740248035.hbase-region-0.hbase-region.qa03-shared.svc.cluster.local%2C16020%2C1764740248035.regiongroup-0.1764754214409 (-9223372036854775808:9223372036854775807) length:0
2025-12-12 06:22:21,810 INFO [main] org.apache.hadoop.hbase.mapreduce.WALInputFormat: Closing reader
2025-12-12 06:22:21,811 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2025-12-12 06:22:21,815 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2025-12-12 06:22:21,904 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.hbase.regionserver.wal.WALHeaderEOFException: EOF while reading PB WAL magic
        at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufWALReader.readHeader(AbstractProtobufWALReader.java:221)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufWALReader.init(AbstractProtobufWALReader.java:147)
        at org.apache.hadoop.hbase.wal.WALFactory.createStreamReader(WALFactory.java:360)
        at org.apache.hadoop.hbase.wal.WALFactory.createStreamReader(WALFactory.java:481)
        at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.openReader(WALInputFormat.java:162)
        at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.openReader(WALInputFormat.java:204)
        at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.initialize(WALInputFormat.java:197)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:561)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:525)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: java.io.EOFException
        at java.base/java.io.DataInputStream.readFully(DataInputStream.java:210)
        at java.base/java.io.DataInputStream.readFully(DataInputStream.java:179)
        at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufWALReader.readHeader(AbstractProtobufWALReader.java:219)
        ... 14 more
{code}

The file mentioned in the above log snippet is indeed zero bytes.

The calling code fails like this:
{code}
2025-12-12 06:22:45,140 ERROR org.apache.hadoop.hbase.backup.impl.TableBackupClient: Unexpected exception in incremental-backup: incremental copy backup_1765519365442WAL Player failed
java.io.IOException: WAL Player failed
        at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.walToHFiles(IncrementalTableBackupClient.java:448)
        at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.convertWALsToHFiles(IncrementalTableBackupClient.java:414)
        at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:311)
        at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:594)
{code}

The javadoc of {{WALHeaderEOFException}} says: "This usually means the WAL file 
just contains nothing and we are safe to skip over it." So perhaps the backup 
should indeed skip such empty WAL files instead of failing?
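
To make the idea of skipping concrete, here is a minimal, JDK-only sketch of the pattern. It is not HBase code and not a proposed patch. {{HeaderEOFException}} is a hypothetical stand-in for {{WALHeaderEOFException}}, the 4-byte {{MAGIC}} value is an assumption used only for illustration, and {{readHeader}} / {{readableWals}} are invented names. It only shows that an EOF while reading the header can be caught per file, so that the empty file is skipped and the remaining files are still processed.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EmptyWalSkipSketch {
    // Assumption: a 4-byte marker standing in for the protobuf WAL magic.
    static final byte[] MAGIC = {'P', 'W', 'A', 'L'};

    /** Hypothetical stand-in for HBase's WALHeaderEOFException. */
    static class HeaderEOFException extends IOException {
        HeaderEOFException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Reads the magic; throws HeaderEOFException if the file ends before it is complete. */
    static void readHeader(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            byte[] magic = new byte[MAGIC.length];
            try {
                in.readFully(magic);
            } catch (EOFException e) {
                throw new HeaderEOFException("EOF while reading magic of " + file, e);
            }
            if (!Arrays.equals(magic, MAGIC)) {
                throw new IOException("Unexpected magic in " + file);
            }
        }
    }

    /** Returns the files whose header is readable, skipping those that hit EOF in the header. */
    static List<Path> readableWals(List<Path> candidates) throws IOException {
        List<Path> readable = new ArrayList<>();
        for (Path candidate : candidates) {
            try {
                readHeader(candidate);
                readable.add(candidate);
            } catch (HeaderEOFException e) {
                // The file holds no complete header, so there is nothing to replay from it.
                System.err.println("Skipping empty file: " + candidate);
            }
        }
        return readable;
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempFile("wal-empty", ".log"); // zero bytes
        Path valid = Files.createTempFile("wal-valid", ".log");
        Files.write(valid, MAGIC);
        System.out.println(readableWals(List.of(empty, valid)));
    }
}
```

Only the header-EOF case is swallowed here; any other {{IOException}} (for example a wrong magic) still propagates, so real corruption would not be skipped silently.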



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
