Viraj Jasani created HBASE-27398:
------------------------------------

             Summary: Remove dumping of EOFException while reading WAL with ProtobufLogReader
                 Key: HBASE-27398
                 URL: https://issues.apache.org/jira/browse/HBASE-27398
             Project: HBase
          Issue Type: Improvement
            Reporter: Viraj Jasani
            Assignee: Viraj Jasani
             Fix For: 2.6.0, 2.5.1, 3.0.0-alpha-4, 2.4.15

The log processing tooling that helps extract and analyze exceptions from regionserver logs can pick up the EOFException that is logged while the reader resets its seek to the original position as part of the ProtobufLogReader implementation.

Common logs:

{code:java}
2022-09-28 17:02:00,288 DEBUG [20%2C1664323516467,1] wal.ProtobufLogReader - Encountered a malformed edit, seeking back to last good position in file, from 187159 to 187158
java.io.EOFException: Partial PB while reading WAL, probably an unexpected EOF, ignoring. current offset=187159
	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:390)
	at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:104)
	at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:92)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:258)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:172)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.tryAdvanceStreamAndCreateWALBatch(ReplicationSourceWALReader.java:251)
	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:148)
{code}

{code:java}
2022-09-28 11:02:10,792 DEBUG [20%2C1664323193648,1] wal.ProtobufLogReader - Encountered a malformed edit, seeking back to last good position in file, from 112026775 to 112026303
java.io.EOFException: EOF while reading 296 WAL KVs; started reading at 112026367 and read up to 112026775
	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:418)
	at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:104)
	at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:92)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:258)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:172)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:222)
	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:157)
Caused by: java.io.EOFException: Only read 6
	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:406)
	... 7 more
{code}

Judging from these logs, dumping the EOFException stack trace, even at DEBUG level, does not help much, because resetting the seek to a different position is expected to happen often. We should stop dumping the EOFException in these cases.
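
As a rough illustration of what "not dumping" the exception could look like, here is a minimal, self-contained sketch. It is not the actual patch and does not use HBase or SLF4J classes: `WalEofLogging`, `seekBackMessage` and `shouldDumpStackTrace` are hypothetical names made up for this example. The idea is to keep the one-line "seeking back" message, fold the exception class and message into it, and only ask for a full stack trace when the cause is something other than an EOFException. Whether other exception types should still be dumped is an assumption of this sketch, not something the issue states.

{code:java}
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper, for illustration only; not part of HBase.
final class WalEofLogging {

  private WalEofLogging() {
  }

  // Builds the single-line DEBUG message. The cause is summarized as
  // "(SimpleClassName: message)" instead of being dumped as a stack trace.
  static String seekBackMessage(long from, long to, IOException cause) {
    String base = "Encountered a malformed edit, seeking back to last good position in file, from "
        + from + " to " + to;
    if (cause == null) {
      return base;
    }
    return base + " (" + cause.getClass().getSimpleName() + ": " + cause.getMessage() + ")";
  }

  // Assumption of this sketch: only non-EOF causes are worth a full stack trace.
  static boolean shouldDumpStackTrace(IOException cause) {
    return cause != null && !(cause instanceof EOFException);
  }

  public static void main(String[] args) {
    IOException eof = new EOFException(
        "Partial PB while reading WAL, probably an unexpected EOF, ignoring. current offset=187159");
    System.out.println(seekBackMessage(187159L, 187158L, eof));
    System.out.println("dump stack trace: " + shouldDumpStackTrace(eof));
  }
}
{code}

With a real logger, the caller would pass the Throwable to the logging call only when `shouldDumpStackTrace` returns true, and log just the message string otherwise. Log-scraping tools that match on `java.io.EOFException` at the start of a line would then no longer pick up these routine seek-backs.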