Re: [DISCUSS] KIP-98: Exactly Once Delivery and Transactional Messaging

Jason Gustafson Fri, 03 Feb 2017 14:36:46 -0800

Hi Tom,

I said this in the voting thread, but can the authors include a section
> about new ACLs if there are going to be ACLs for TransactionalId. It's
> mentioned in the google doc, but I think new ACLs should be in a KIP
> directly.



We've updated the wiki. Can you take a look and let us know if you have
additional concerns?

Thanks,
Jason

On Fri, Feb 3, 2017 at 1:52 PM, Rajini Sivaram <[email protected]>
wrote:

> Hi Jason,
>
> Thank you for the responses. Agree that authorizing transactional.id in
> the
> producer requests will be good enough for version 1. And making it tighter
> in future based on delegation tokens sounds good too.
>
> Regards,
>
> Rajini
>
>
> On Fri, Feb 3, 2017 at 8:04 PM, Jason Gustafson <[email protected]>
> wrote:
>
> > Hey Rajini,
> >
> > Thanks for the questions. Responses below:
> >
> >
> > > 1. Will the transaction coordinator check topic ACLs based on the
> > > requesting client's credentials? Access to transaction logs, topics
> being
> > > added for transaction etc?
> >
> >
> > Good question. I think it makes sense to check topic Write permission
> when
> > adding partitions to the transaction. I'll add this to the document.
> > Perhaps authorization to the transaction log itself, however, can be
> > assumed from having access to the ProducerTransactionalId resource? This
> > would be similar to how access to __consumer_offsets is assumed if the
> > client has access to the Group resource.
> >
> > 2. If I create a transactional produce request (by hand, not using the
> > > producer API) with a random PID (random, hence unlikely to be in use),
> > will
> > > the broker append a transactional message to the logs, preventing LSO
> > from
> > > moving forward? What validation will broker do for PIDs?
> >
> >
> > Yes, that is correct. Validation of the TransactionalId to PID binding
> is a
> > known gap in the current proposal, and is discussed in the design
> document.
> > Now that I'm thinking about it a bit more, I think there is a good case
> for
> > including the TransactionalId in the ProduceRequest (I think Jun
> suggested
> > this previously). Verifying it does not ensure that the included PID is
> > correct, but it does ensure that the client is authorized to use
> > transactions. If the client wanted to do an "endless transaction attack,"
> > having Write access to the topic and an authorized transactionalID is all
> > they would need anyway even if we could authorize the PID itself. This
> > seems like a worthwhile improvement.
> >
> > For future work, my half-baked idea to authorize the PID binding is to
> > leverage the delegation work in KIP-48. When the PID is generated, we can
> > give the producer a token which is then used in produce requests (say an
> > hmac covering the TransactionalId and PID).
> >
> >
> > > 3. Will every broker check that a client sending transactional produce
> > > requests at least has write access to transaction log topic since it is
> > not
> > > validating transactional.id (for every produce request)?
> >
> >  4. I understand that brokers cannot authorize the transactional id for
> > each
> > > produce request since requests contain only the PID. But since there
> is a
> > > one-to-one mapping between PID and transactional.id, and a connection
> is
> > > never expected to change its transactional.id, perhaps it is feasible
> to
> > > add authorization and cache the results in the Session? Perhaps not for
> > > version 1, but feels like it will be good to close the security gap
> here.
> > > Obviously it would be simpler if transactional.id was in the produce
> > > request if the overhead was acceptable.
> >
> >
> > I think my response above addresses both of these. We should include the
> > TransactionalId in the ProduceRequest. Of course it need not be included
> in
> > the message format, so I'm not too concerned about the additional
> overhead
> > it adds.
> >
> > Thanks,
> > Jason
> >
> >
> > On Fri, Feb 3, 2017 at 6:52 AM, Ismael Juma <[email protected]> wrote:
> >
> > > Comments inline.
> > >
> > > On Thu, Feb 2, 2017 at 6:28 PM, Jason Gustafson <[email protected]>
> > > wrote:
> > >
> > > > Took me a while to remember why we didn't do this. The timestamp that
> > is
> > > > included at the message set level is the max timestamp of all
> messages
> > in
> > > > the message set as is the case in the current message format (I will
> > > update
> > > > the document to make this explicit). We could make the message
> > timestamps
> > > > relative to the max timestamp, but that makes serialization a bit
> > awkward
> > > > since the timestamps are not assumed to be increasing sequentially or
> > > > monotonically. Once the messages in the message set had been
> > determined,
> > > we
> > > > would need to go back and adjust the relative timestamps.
> > > >
> > >
> > > Yes, I thought this would be a bit tricky and hence why I mentioned the
> > > option of adding a new field at the message set level for the first
> > > timestamp even though that's not ideal either.
> > >
> > > Here's one idea. We let the timestamps in the messages be varints, but
> we
> > > > make their values be relative to the timestamp of the previous
> message,
> > > > with the timestamp of the first message being absolute. For example,
> if
> > > we
> > > > had timestamps 500, 501, 499, then we would write 500 for the first
> > > > message, 1 for the next, and -2 for the final message. Would that
> work?
> > > Let
> > > > me think a bit about it and see if there are any problems.
> > > >
> > >
> > > It's an interesting idea. Comparing to the option of having the first
> > > timestamp in the message set, It's a little more space efficient as we
> > > don't have both a full timestamp in the message set _and_ a varint in
> the
> > > first message (which would always be 0, so we avoid the extra byte) and
> > > also the deltas could be a little smaller in the common case. The main
> > > downside is that it introduces a semantics inconsistency between the
> > first
> > > message and the rest. Not ideal, but maybe we can live with that.
> > >
> > > Ismael
> > >
> >
>

Re: [DISCUSS] KIP-98: Exactly Once Delivery and Transactional Messaging

Reply via email to