munish1789 opened a new issue, #3949:
URL: https://github.com/apache/bookkeeper/issues/3949
**BUG REPORT**
***Describe the bug***
One of the three bookie replicas is broken, apparently because the journal it
replays during recovery is corrupted.
***To Reproduce***
Steps to reproduce the behavior:
This was observed during a longevity run of more than 48 hours under some
load. One bookie instance was unable to recover after a pod restart, possibly
because of a corrupted BookKeeper journal file. The last journal it opened was
"/bk/journal/j0/current/18795c6b58c.txn".
In the stack trace below, while replaying that journal, it tried to read
ledger 0 from a negative position.
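As a cross-check that the journal file named above is the one being replayed, the log that follows is consistent with file names being the lowercase hexadecimal form of the numeric IDs it prints (journal 1681845040524 and `18795c6b58c.txn`, logId 58837 and `e5d5.log`). A minimal sketch of that mapping; the class and method names are hypothetical, and the naming rule is inferred from these log lines only, not taken from the BookKeeper source:

```java
public class JournalNames {
    // Assumption: a journal file is named <journal id in lowercase hex>.txn,
    // as the "Replaying journal" / "Opening journal" log lines suggest.
    static String journalFileName(long journalId) {
        return Long.toHexString(journalId) + ".txn";
    }

    // Assumption: an entry log file is named <log id in lowercase hex>.log,
    // as the "Created new entry log file ... for logId ..." lines suggest.
    static String entryLogFileName(long logId) {
        return Long.toHexString(logId) + ".log";
    }

    public static void main(String[] args) {
        System.out.println(journalFileName(1681845040524L)); // 18795c6b58c.txn
        System.out.println(entryLogFileName(58837L));        // e5d5.log
    }
}
```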
`2023-05-01T00:28:52,129 - INFO - [main:ComponentStarter@84] - Starting
component bookie-server.
2023-05-01T00:28:52,132 - INFO - [main:Bookie@995] - Replaying journal
1681845040524 from position 1329111040
2023-05-01T00:28:52,134 - INFO - [main:JournalChannel@157] - Opening
journal /bk/journal/j0/current/18795c6b58c.txn
2023-05-01T00:28:52,142 - INFO - [main:EntryLogManagerBase@144] - Creating
a new entry log file for ledger '111680' : diskFull = false, allDisksFull =
false, reachEntryLogLimit = false, logChannel = null
2023-05-01T00:28:52,156 - INFO - [main:EntryLoggerAllocator@181] - Created
new entry log file /bk/ledgers/l3/current/e5d5.log for logId 58837.
2023-05-01T00:28:52,163 - INFO - [main:EntryLogManagerBase@144] - Creating
a new entry log file for ledger '109556' : diskFull = false, allDisksFull =
false, reachEntryLogLimit = false, logChannel = null
2023-05-01T00:28:52,168 - INFO - [pool-5-thread-1:EntryLoggerAllocator@181]
- Created new entry log file /bk/ledgers/l3/current/e5d6.log for logId 58838.
2023-05-01T00:28:52,171 - INFO - [pool-5-thread-1:EntryLoggerAllocator@181]
- Created new entry log file /bk/ledgers/l2/current/e5d7.log for logId 58839.
2023-05-01T00:28:52,174 - INFO - [main:EntryLogManagerBase@144] - Creating
a new entry log file for ledger '109536' : diskFull = false, allDisksFull =
false, reachEntryLogLimit = false, logChannel = null
2023-05-01T00:28:52,176 - INFO - [pool-5-thread-1:EntryLoggerAllocator@181]
- Created new entry log file /bk/ledgers/l2/current/e5d8.log for logId 58840.
2023-05-01T00:28:52,177 - INFO - [main:EntryLogManagerBase@144] - Creating
a new entry log file for ledger '111708' : diskFull = false, allDisksFull =
false, reachEntryLogLimit = false, logChannel = null
2023-05-01T00:28:52,179 - INFO - [pool-5-thread-1:EntryLoggerAllocator@181]
- Created new entry log file /bk/ledgers/l0/current/e5d9.log for logId 58841.
2023-05-01T00:28:52,394 - INFO - [main:EntryLogManagerBase@144] - Creating
a new entry log file for ledger '0' : diskFull = false, allDisksFull = false,
reachEntryLogLimit = false, logChannel = null
2023-05-01T00:28:52,396 - INFO - [pool-5-thread-1:EntryLoggerAllocator@181]
- Created new entry log file /bk/ledgers/l0/current/e5da.log for logId 58842.
2023-05-01T00:28:52,395 - ERROR - [main:LedgerEntryPage@202] -
IllegalArgumentException when trying to read ledger 0 from position
-541506176668401664
java.lang.IllegalArgumentException: Negative position
at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:785) ~[?:?]
at
org.apache.bookkeeper.bookie.FileInfo.readAbsolute(FileInfo.java:426)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.FileInfo.read(FileInfo.java:396)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.LedgerEntryPage.readPage(LedgerEntryPage.java:196)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.IndexPersistenceMgr.updatePage(IndexPersistenceMgr.java:646)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.IndexInMemPageMgr.grabLedgerEntryPage(IndexInMemPageMgr.java:447)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.IndexInMemPageMgr.getLedgerEntryPage(IndexInMemPageMgr.java:412)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.IndexInMemPageMgr.putEntryOffset(IndexInMemPageMgr.java:571)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.LedgerCacheImpl.putEntryOffset(LedgerCacheImpl.java:108)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.InterleavedLedgerStorage.processEntry(InterleavedLedgerStorage.java:539)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.InterleavedLedgerStorage.processEntry(InterleavedLedgerStorage.java:521)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.InterleavedLedgerStorage.addEntry(InterleavedLedgerStorage.java:375)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.LedgerDescriptorImpl.addEntry(LedgerDescriptorImpl.java:155)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Bookie$6.process(Bookie.java:949)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Journal.scanJournal(Journal.java:840)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Bookie.replay(Bookie.java:996)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Bookie.readJournal(Bookie.java:962)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Bookie.start(Bookie.java:1016)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.proto.BookieServer.start(BookieServer.java:156)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.server.service.BookieService.doStart(BookieService.java:68)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:83)
~[org.apache.bookkeeper-bookkeeper-common-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.common.component.LifecycleComponentStack.lambda$start$4(LifecycleComponentStack.java:144)
~[org.apache.bookkeeper-bookkeeper-common-4.14.3-build-437.jar:4.14.3-build-437]
at
com.google.common.collect.ImmutableList.forEach(ImmutableList.java:405)
~[com.google.guava-guava-30.0-jre.jar:?]
at
org.apache.bookkeeper.common.component.LifecycleComponentStack.start(LifecycleComponentStack.java:144)
~[org.apache.bookkeeper-bookkeeper-common-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.common.component.ComponentStarter.startComponent(ComponentStarter.java:85)
~[org.apache.bookkeeper-bookkeeper-common-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.server.Main.doMain(Main.java:234)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.server.Main.main(Main.java:208)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
2023-05-01T00:28:52,410 - ERROR - [main:AbstractLifecycleComponent@85] -
Failed to start Component: bookie-server
java.lang.IllegalArgumentException: Negative position
at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:785) ~[?:?]
at
org.apache.bookkeeper.bookie.FileInfo.readAbsolute(FileInfo.java:426)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.FileInfo.read(FileInfo.java:396)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.LedgerEntryPage.readPage(LedgerEntryPage.java:196)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.IndexPersistenceMgr.updatePage(IndexPersistenceMgr.java:646)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.IndexInMemPageMgr.grabLedgerEntryPage(IndexInMemPageMgr.java:447)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.IndexInMemPageMgr.getLedgerEntryPage(IndexInMemPageMgr.java:412)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.IndexInMemPageMgr.putEntryOffset(IndexInMemPageMgr.java:571)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.LedgerCacheImpl.putEntryOffset(LedgerCacheImpl.java:108)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.InterleavedLedgerStorage.processEntry(InterleavedLedgerStorage.java:539)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.InterleavedLedgerStorage.processEntry(InterleavedLedgerStorage.java:521)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.InterleavedLedgerStorage.addEntry(InterleavedLedgerStorage.java:375)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.bookie.LedgerDescriptorImpl.addEntry(LedgerDescriptorImpl.java:155)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Bookie$6.process(Bookie.java:949)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Journal.scanJournal(Journal.java:840)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Bookie.replay(Bookie.java:996)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Bookie.readJournal(Bookie.java:962)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.bookie.Bookie.start(Bookie.java:1016)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.proto.BookieServer.start(BookieServer.java:156)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.server.service.BookieService.doStart(BookieService.java:68)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:83)
~[org.apache.bookkeeper-bookkeeper-common-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.common.component.LifecycleComponentStack.lambda$start$4(LifecycleComponentStack.java:144)
~[org.apache.bookkeeper-bookkeeper-common-4.14.3-build-437.jar:4.14.3-build-437]
at
com.google.common.collect.ImmutableList.forEach(ImmutableList.java:405)
~[com.google.guava-guava-30.0-jre.jar:?]
at
org.apache.bookkeeper.common.component.LifecycleComponentStack.start(LifecycleComponentStack.java:144)
~[org.apache.bookkeeper-bookkeeper-common-4.14.3-build-437.jar:4.14.3-build-437]
at
org.apache.bookkeeper.common.component.ComponentStarter.startComponent(ComponentStarter.java:85)
~[org.apache.bookkeeper-bookkeeper-common-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.server.Main.doMain(Main.java:234)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
at org.apache.bookkeeper.server.Main.main(Main.java:208)
~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-437.jar:4.14.3-build-437]
2023-05-01T00:28:52,411 - ERROR - [main:AbstractLifecycleComponent@87] -
Calling uncaughtExceptionHandler
2023-05-01T00:28:52,411 - ERROR - [main:ComponentStarter@76] - Triggered
exceptionHandler of Component: bookie-server because of Exception in Thread:
Thread[main,5,main]`
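For context on the exception itself: the top frame is the JDK's positional `FileChannel.read(ByteBuffer, long)`, which is documented to throw `IllegalArgumentException` when the position is negative. The sketch below reproduces only that JDK behavior on an empty temporary file, using the position from the log. It is not BookKeeper code and does not show how the bookie arrived at that position; the class name, method name, and temp-file prefix are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NegativePositionDemo {
    // Returns true if a positional read at `position` is rejected with
    // IllegalArgumentException, false if the read call completes normally.
    static boolean readRejectsPosition(long position) {
        try {
            Path tmp = Files.createTempFile("index-demo", ".idx");
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                ch.read(ByteBuffer.allocate(8), position);
                return false;
            } catch (IllegalArgumentException e) {
                return true;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Position copied from the ERROR line in the log above.
        System.out.println(readRejectsPosition(-541506176668401664L));
        // A non-negative position on an empty file simply hits end-of-file.
        System.out.println(readRejectsPosition(0L));
    }
}
```

Because the JDK rejects the offset before touching the file, the failure says the computed offset was invalid, not that the read itself hit bad bytes on disk.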
***Expected behavior***
***Screenshots***
NAME                READY   STATUS    RESTARTS         AGE
nautilus-bookie-0   0/1     Running   1306 (32h ago)   13d
***Additional context***
The BookKeeper version used is 4.14.3.