[
https://issues.apache.org/jira/browse/BOOKKEEPER-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15618319#comment-15618319
]
Venkateswararao Jujjuri (JV) commented on BOOKKEEPER-934:
---------------------------------------------------------
I gave this more thought. The good path is pretty easy and straightforward.
I believe the error path has several things we need to worry about.
Scenario-1:
#1 - Non-durable write: L1-E1 goes to bookies b1, b2, b3.
#2 - Some failover event happens and the new ensemble is now b3, b4, b5.
#3 - Durable write: L1-E2 comes in, is written to b3, b4, b5, and is persisted.
#4 - Now we have only 1 copy of L1-E1, and it stays that way until we detect and
re-replicate it.
To make things worse:
In #2, all 3 bookies (b1, b2, b3) go down and a completely new ensemble comes
up: b4, b5, b6.
In this scenario, we persist L1-E2 and assume everything is fine, even though
L1-E1 may never have been persisted anywhere.
This scenario becomes a silent data loss issue. With the current code, it is
just a data availability issue.
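
To make the two cases above concrete, here is a rough toy simulation. It is
not BookKeeper code: the Bookie class, copies() and run_scenario() are
hypothetical names, and it assumes that a bookie that goes down loses anything
it had not yet flushed.

```python
# Toy model only, not BookKeeper code. All names here are hypothetical.

class Bookie:
    def __init__(self, name):
        self.name = name
        self.buffered = {}   # entries acked but not yet flushed to disk
        self.durable = {}    # entries that have been flushed to disk
        self.alive = True

    def add(self, entry_id, data, sync):
        self.buffered[entry_id] = data
        if sync:
            # A sync write flushes everything buffered on THIS bookie only.
            self.durable.update(self.buffered)
            self.buffered.clear()

    def crash(self):
        # Assumption: anything not flushed is lost when the bookie goes down.
        self.buffered.clear()
        self.alive = False


def copies(bookies, entry_id):
    """Count live bookies that still hold the entry in any form."""
    return sum(
        1 for b in bookies
        if b.alive and (entry_id in b.durable or entry_id in b.buffered)
    )


def run_scenario(failed):
    b = {n: Bookie(n) for n in ("b1", "b2", "b3", "b4", "b5", "b6")}
    # #1: non-durable write of L1-E1 to b1, b2, b3
    for n in ("b1", "b2", "b3"):
        b[n].add("L1-E1", b"payload-1", sync=False)
    # #2: failover event, the bookies in `failed` go down
    for n in failed:
        b[n].crash()
    new_ensemble = [n for n in b if b[n].alive][:3]
    # #3: durable write of L1-E2 to the new ensemble
    for n in new_ensemble:
        b[n].add("L1-E2", b"payload-2", sync=True)
    everyone = list(b.values())
    return new_ensemble, copies(everyone, "L1-E1"), copies(everyone, "L1-E2")
```

With failed=("b1", "b2") the model ends up with one copy of L1-E1 on b3; with
failed=("b1", "b2", "b3") it ends up with none, while the sync write of L1-E2
still succeeds on all three new bookies and nothing reports an error.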
To make this feature work:
We need to make sure that every sync write is more like a recovery operation,
i.e. make sure that *all* previous entries are persisted before persisting the
current write and acking back. This becomes extremely expensive.
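
A minimal sketch of why the cost grows, again as a toy model with hypothetical
names (sync_add, entries, durable). It assumes the writer still holds every
earlier entry and can tell which ones are already persisted; in a real system
each step would also involve round trips to the bookies.

```python
# Toy model only: hypothetical names, not the BookKeeper client API.

def sync_add(entries, durable, entry_id):
    """Persist entry_id, first persisting every earlier entry that is not
    yet known to be durable.

    entries: list of entry ids written so far, in order (non-durable included)
    durable: set of entry ids already known to be persisted
    Returns the entry ids this call had to persist, in order.
    """
    entries.append(entry_id)
    persisted_now = []
    for eid in entries:          # walk the whole backlog, oldest first
        if eid not in durable:
            durable.add(eid)     # stands in for "write and flush on the bookies"
            persisted_now.append(eid)
    return persisted_now         # only now would the sync write be acked
```

The work done by a single sync write is proportional to the number of
non-durable entries that came before it, so a long run of non-durable writes
makes the next sync write correspondingly slow.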
[~zhaijia] [~ayegorov]
> Relax durability
> ----------------
>
> Key: BOOKKEEPER-934
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-934
> Project: Bookkeeper
> Issue Type: Improvement
> Reporter: Jia Zhai
> Assignee: Jia Zhai
>
> I am thinking of adding a new flag to bookkeeper#addEntry(..., Boolean sync),
> so the application can control whether to sync or not for individual entries.
> - On the write protocol, add a flag to indicate whether this write should
> sync to disk or not.
> - On the bookie side, if the addEntry request is sync, go through the original
> pipeline. If the addEntry disables sync, complete the add callbacks after
> writing to the journal file and before flushing the journal.
> - Those add entries (with sync disabled) will be flushed to disk with
> subsequent sync add entries.
> There is already a discussion in the mail thread; this ticket can gather
> ideas and provide the discussion materials.
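
For reference, the bookie-side behaviour in the quoted description could be
sketched roughly as below. ToyJournal and its fields are hypothetical and only
model the ordering of write, flush, and callback; this is not the actual
bookie journal.

```python
# Toy model only: hypothetical names, not the real bookie journal.

class ToyJournal:
    def __init__(self):
        self.written = []   # in the journal file, not yet flushed
        self.flushed = []   # flushed to disk
        self.acked = []     # add callbacks already completed

    def flush(self):
        # Flushes everything written so far, including earlier non-sync adds.
        self.flushed.extend(self.written)
        self.written.clear()

    def add_entry(self, entry_id, sync):
        self.written.append(entry_id)
        if sync:
            # Original pipeline: flush before completing the callback.
            self.flush()
        # Non-sync adds reach this point without a flush.
        self.acked.append(entry_id)
```

In this model a non-sync add is acked while it sits only in `written`, and it
moves to `flushed` when the next sync add arrives on the same bookie.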
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)