Hi, Jason,

Thanks for the updated doc. Looks good to me overall. Just a few more minor
comments.

210. PID snapshots: Is the number of PID snapshots configurable or hardcoded
to 2? When do we decide to roll a new snapshot: based on time, bytes, or
offset? Is that configurable too?

211. I am wondering if we should store ExpirationTime in the producer
transactionalId mapping message as we do in the producer transaction status
message. If a producer only calls initTransactions(), but never publishes
any data, we still want to be able to expire and remove the producer
transactionalId mapping message.

212. The doc says "The LSO is always equal to one less than the minimum of
the initial offsets across all active transactions". This implies that LSO
is inclusive. However, both the high watermark and the log end offset are
currently exclusive. For consistency, it seems that we should make LSO exclusive
as well.
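
To make the two conventions concrete, here is a minimal sketch in Python. It
is illustrative only: the function names and the sample offsets are
hypothetical and not taken from the doc, and it assumes at least one
transaction is active (the case with no active transactions is left out).

```python
def lso_inclusive(first_offsets):
    """LSO as the doc words it: one less than the minimum first offset."""
    return min(first_offsets) - 1


def lso_exclusive(first_offsets):
    """LSO as an exclusive bound, like the high watermark and log end offset."""
    return min(first_offsets)


# First offsets of three hypothetical active transactions.
open_txn_first_offsets = [42, 17, 30]

inclusive = lso_inclusive(open_txn_first_offsets)  # the last stable offset itself
exclusive = lso_exclusive(open_txn_first_offsets)  # first offset that is not stable

# Both conventions describe the same stable range; only the bound differs.
stable_inclusive = list(range(0, inclusive + 1))
stable_exclusive = list(range(0, exclusive))
```

With the exclusive form, a reader can treat the LSO exactly like the high
watermark: everything strictly below it is readable.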

213. The doc says "If the topic is configured for compaction and deletion,
we will use the topic’s own retention limit. Otherwise, we will use the
default topic retention limit. Once the last message set produced by a
given PID has aged beyond the retention time, the PID will be expired." For
topics configured with just compaction, wouldn't it be more intuitive to
expire the PID based on transactional.id.expiration.ms?

214. In the Coordinator-Broker request handling section, the doc says "If
the broker has a corresponding PID, verify that the received epoch is
greater than or equal to the current epoch." Is this epoch the coordinator
epoch or the producer epoch?

215. The doc says "Append to the offset topic, but skip updating the offset
cache in the delayed produce callback, until a WriteTxnMarkerRequest from
the transaction coordinator is received including the offset topic
partitions." How do we do this efficiently? Do we need to cache pending
offsets per PID?
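
One possible shape for such a cache, as a rough sketch in Python. The class
and method names are hypothetical and this is not the design in the doc. It
assumes offsets appended inside a transaction are staged per PID and moved
into the visible offset cache only when a commit marker arrives for that PID.

```python
class PendingOffsetCache:
    """Stages transactional offset commits per PID until a marker arrives."""

    def __init__(self):
        self.committed = {}  # (group, topic_partition) -> offset, visible to fetches
        self.pending = {}    # pid -> {(group, topic_partition): offset}

    def stage(self, pid, key, offset):
        # Called after the append to the offset topic succeeds; the visible
        # cache is deliberately not touched here.
        self.pending.setdefault(pid, {})[key] = offset

    def on_marker(self, pid, commit):
        # Called when the coordinator's transaction marker for this PID arrives.
        staged = self.pending.pop(pid, {})
        if commit:
            self.committed.update(staged)
        # On abort the staged offsets are simply dropped.


cache = PendingOffsetCache()
cache.stage(pid=1, key=("group-a", "topic-0"), offset=5)
cache.stage(pid=2, key=("group-a", "topic-1"), offset=9)
visible_before_marker = dict(cache.committed)

cache.on_marker(pid=1, commit=True)   # PID 1 commits
cache.on_marker(pid=2, commit=False)  # PID 2 aborts
```

Handling a marker costs time proportional to the number of offsets staged by
that PID, and memory is bounded by the offsets of open transactions.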

216. Do we need to add any new JMX metrics? For example, on the broker and
transaction coordinator side, it would be useful to know the number of live
PIDs.

Thanks,

Jun

On Wed, Feb 15, 2017 at 11:04 AM, Jason Gustafson <ja...@confluent.io>
wrote:

> Thanks everyone who has voted so far!
>
> Jun brought up a good point offline that the BeginTxnRequest was not
> strictly needed since there is no state to recover until a partition has
> been added to the transaction. Instead we can start the transaction
> implicitly upon receiving the first AddPartitionsToTxn request. This
> results in a slight change of behavior since the transaction timeout will
> be enforced only after the first send() instead of the beginTransaction().
> However, the main point of the timeout is to avoid blocking downstream
> consumers, which is only possible once you've added a partition to the
> transaction, so we feel the simplification is justified. I've updated the
> document accordingly.
>
> Thanks,
> Jason
>
> On Tue, Feb 14, 2017 at 2:03 PM, Jay Kreps <j...@confluent.io> wrote:
>
> > +1
> >
> > Super happy with how this turned out. It's been a long journey since we
> > started thinking about this 3+ years ago. Can't wait to see it in
> > code---this is a big one! :-)
> >
> > -Jay
> >
> > On Wed, Feb 1, 2017 at 8:13 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > We would like to start the voting process for KIP-98. The KIP can be
> > > found at
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging
> > >
> > > Discussion thread can be found here:
> > >
> > > http://search-hadoop.com/m/Kafka/uyzND1jwZrr7HRHf?subj=+DISCUSS+KIP+98+Exactly+Once+Delivery+and+Transactional+Messaging
> > >
> > > Thanks,
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
