> In flushing
> the write cache, it will trigger a checkpoint to mark the journal's
> lastMark position (100MB's offset)


A flush only persists the current lastMark value to the file; it does not
advance lastMark itself.
The lastMark value is advanced only when a ForceWriteRequest completes. So
when the flush is triggered here, the lastMark position is not 100MB's offset.
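
To make the distinction concrete, here is a minimal sketch of the behaviour I am describing. The class and method names are hypothetical and only model the idea; they are not BookKeeper's actual classes, and the 60MB figure is an assumed example value.

```java
// Hypothetical model of the behaviour described above; not BookKeeper's real code.
final class LastMarkSketch {
    // In-memory journal offset covered by a completed force write (fsync).
    private long lastMark = 0L;
    // Stand-in for the value stored in a ledger directory's lastMark file.
    private long persistedLastMark = 0L;

    // Models a ForceWriteRequest completing: the only place lastMark advances.
    void onForceWriteComplete(long journalOffset) {
        lastMark = Math.max(lastMark, journalOffset);
    }

    // Models the checkpoint taken during a flush: it copies the current
    // lastMark to the file but never moves lastMark forward.
    void checkpoint() {
        persistedLastMark = lastMark;
    }

    long persistedLastMark() {
        return persistedLastMark;
    }

    public static void main(String[] args) {
        LastMarkSketch sketch = new LastMarkSketch();
        sketch.onForceWriteComplete(60L); // assumed: only 60MB has been force-written
        sketch.checkpoint();              // flush persists 60, not 100
        System.out.println("persisted lastMark (MB): " + sketch.persistedLastMark());
    }
}
```

In this model, a checkpoint taken before any force write completes persists 0, and a checkpoint taken after a 60MB force write persists 60, regardless of how much data sits in the write caches.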


I'm not sure whether I missed some logic.





------------------ Original Message ------------------
From: "Hang Chen"<chenh...@apache.org>;
Sent: May 30, 2022 (Monday) 9:21
To: "dev"<dev@bookkeeper.apache.org>;
Subject: [Discuss] Bookie may lose data even though we turn on fsync for the
journal



We found one place where the bookie may lose data even though we turn
on fsync for the journal.
Condition:
- One journal disk, and turn on fsync for the journal
- Configure two ledger disks, ledger1, and ledger2

Assume we write 100MB data into one bookie, 70MB data written into
ledger1's write cache, and 30 MB data written into ledger2's write
cache. Ledger1's write cache is full and triggers flush. In flushing
the write cache, it will trigger a checkpoint to mark the journal's
lastMark position (100MB's offset) and write the lastMark position
into both ledger1 and ledger2's lastMark file.

At this point, the bookie shuts down without flushing the write cache,
for example when killed by the `kill -9` command, so ledger2's write cache
(30MB) is not flushed to the ledger disk. But ledger2's lastMark position,
which was persisted into the lastMark file, has already been updated to
100MB's offset.

When the bookie starts up, the journal replay position will be
`min(ledger1's lastMark, ledger2's lastMark)`, which is 100MB's
offset. Ledger2's 30MB of data won't be replayed, and that data will be
lost.
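
The scenario above can be sketched as a small model. This is only an illustration under the assumptions stated in this mail: the class, the method names, and the sample entry offsets are hypothetical and are not taken from the BookKeeper code base.

```java
// Hypothetical model of the replay decision described above; not BookKeeper's real code.
final class ReplaySketch {
    // Replay starts at the smallest persisted lastMark across all ledger directories.
    static long replayStart(long... persistedLastMarks) {
        if (persistedLastMarks.length == 0) {
            throw new IllegalArgumentException("at least one lastMark is required");
        }
        long min = Long.MAX_VALUE;
        for (long mark : persistedLastMarks) {
            min = Math.min(min, mark);
        }
        return min;
    }

    // Counts unflushed entries whose journal offset lies before the replay start.
    // Those entries are skipped on startup, so they exist nowhere after a crash.
    static int lostEntries(long[] unflushedEntryOffsets, long replayStart) {
        int lost = 0;
        for (long offset : unflushedEntryOffsets) {
            if (offset < replayStart) {
                lost++;
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // Offsets in MB. Both lastMark files were written with 100 during ledger1's flush.
        long start = replayStart(100L, 100L);
        // Assumed journal offsets of entries still sitting only in ledger2's write cache.
        long[] ledger2Unflushed = {10L, 40L, 90L};
        System.out.println("replay starts at (MB): " + start);
        System.out.println("entries never replayed: " + lostEntries(ledger2Unflushed, start));
    }
}
```

If ledger2's lastMark file had kept an older value instead, `replayStart` would return that older offset and the same entries would be replayed rather than skipped.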

Please help take a look. I'm not sure whether I missed some logic.

Thanks,
Hang
