Just to follow up on Radai's idea of pushing the buffering logic to the
broker. It may be possible to do this efficiently if we assume aborted
transactions are rare. The following is a draft proposal. For each
partition, the broker maintains the last stable offset (LSO) as described
in the document, and only exposes messages up to this point if the reader
is in the read-committed mode. When a new stable offset (NSO) is
determined, if there is no aborted message in this window, the broker
simply advances the LSO to the NSO. If there is at least one aborted
message, the broker first replaces the current log segment with new log
segments excluding the aborted messages and then advances the LSO. To make
the replacement efficient, we can replace the current log segment with 3
new segments: (1) a new "shadow" log segment that simply references the
portion of the current log segment from the beginning to the LSO, (2) a log
segment created by copying only committed messages between the LSO and the
NSO, (3) a new "shadow" log segment that references the portion of the
current log segment from the NSO (open ended). Note that only (2) involves
real data copying. If aborted transactions are rare, this overhead will be
insignificant. Assuming that applications typically don't abort
transactions, transactions will only be aborted by transaction coordinators
during hard failure of the producers, which should be rare.

This way, the consumer library's logic will be simplified. We can still
expose uncommitted messages to readers in the read-uncommitted mode and
therefore leave the door open for speculative reader in the future.

Thanks,

Jun


On Wed, Dec 21, 2016 at 10:44 AM, Apurva Mehta <apu...@confluent.io> wrote:

> Hi Joel,
>
> The alternatives are embedded in the 'discussion' sections which are spread
> throughout the google doc.
>
> Admittedly, we have not covered high level alternatives like those which
> have been brought up in this thread. In particular, having a separate log
> for transactional mesages and also having multiple producers participate in
> a single transaction.
>
> This is an omission which we will correct.
>
> Thanks,
> Apurva
>
> On Wed, Dec 21, 2016 at 10:34 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
>
> > >
> > >
> > > @Joel,
> > >
> > > I read over your wiki, and apart from the introduction of the notion of
> > > journal partitions --whose pros and cons are already being discussed--
> > you
> > > also introduce the notion of a 'producer group' which enables multiple
> > > producers to participate in a single transaction. This is completely
> > > opposite of the model in the KIP where a transaction is defined by a
> > > producer id, and hence there is a 1-1 mapping between producers and
> > > transactions. Further, each producer can have exactly one in-flight
> > > transaction at a time in the KIP.
> > >
> >
> > Hi Apurva - yes I did notice those differences among other things :)
> BTW, I
> > haven't yet gone through the google-doc carefully but on a skim it does
> not
> > seem to contain any rejected alternatives as the wiki states.
> >
>

Reply via email to