Can I get a review for a documentation update (KAFKA-10473)?

2020-09-24 Thread James Cheng
Hi,

Can I get a review from one of the committers for this documentation update?

I am adding docs for the following JMX metrics:
kafka.log:type=Log,name=Size
kafka.log:type=Log,name=NumLogSegments
kafka.log:type=Log,name=LogStartOffset
kafka.log:type=Log,name=LogEndOffset
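For anyone who wants to check these metrics locally, here is a minimal sketch of
reading one of them over JMX. It assumes a broker started with JMX enabled on
port 9999, a hypothetical topic "my-topic", and the per-topic/partition tags
these Log metrics carry; "Value" is the attribute Kafka's gauges expose:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class LogSizeProbe {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                // Per-partition log size in bytes for a hypothetical topic/partition.
                ObjectName size = new ObjectName(
                    "kafka.log:type=Log,name=Size,topic=my-topic,partition=0");
                System.out.println("Log size: " + mbs.getAttribute(size, "Value"));
            }
        }
    }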


https://issues.apache.org/jira/browse/KAFKA-10473 

https://github.com/apache/kafka/pull/9276 


The pull request page lists lots of failed checks. However, this pull request 
only modifies an HTML file, and the test failures don't seem related to my 
changes.

Thanks,
-James



Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-24 Thread Colin McCabe
On Thu, Sep 24, 2020, at 16:24, Jun Rao wrote:
> Hi, Colin,
> 
> Thanks for the reply and the updated KIP. A few more comments below.
> 

Hi Jun,

>
> 53. It seems that you already incorporated the changes in KIP-516. With
> topic ids, we don't need to wait for the topic's data to be deleted before
> removing the topic metadata. If the topic is recreated, we can still delete
> the data properly based on the topic id. So, it seems that we can remove
> TopicRecord.Deleting.
> 

Thanks for the reply.  What I was thinking of doing here was using topic IDs 
internally, but still using names externally.  So the topic UUIDs are only for 
the purpose of associating topics with partitions -- from the user's point of 
view, topics are still identified by names.

You're right that KIP-516 will simplify things, but I'm not sure when that will 
land, so I wanted to avoid blocking the initial implementation of this KIP on 
it.

>
> 55. It seems to me that the current behavior where we favor the current
> broker registration is better. This is because uncontrolled broker shutdown
> is rare and its impact is less severe since one just needs to wait for the
> session timeout before restarting the broker. If a mis-configured broker
> replaces an existing broker, the consequence is more severe since it can
> cause the leader to be unavailable, a replica to be out of ISR, and add
> more load on the leaders etc.
> 

Hmm, that's a good point.  Let me check this a bit more before I change it, 
though.

>
> 60. controller.connect=0@controller0.example.com:9093,
> 1@controller1.example.com:9093,2@controller2.example.com : Do we need to
> include the controller id before @? It seems that the host/port is enough
> for establishing the connection. It would also be useful to make it clear
> that controller.connect replaces quorum.voters in KIP-595.
> 

I discussed this with Jason earlier, and he felt that the controller IDs were 
needed in this configuration key.  It is certainly needed when configuring the 
controllers themselves, since they need to know each other's IDs.

>
> 61. I am not sure that we need both controller.listeners and
> controller.connect.security.protocol since the former implies the security
> protocol. The reason that we have both inter.broker.listener.name and
> inter.broker.security.protocol is because we had the latter first and later
> realized that the former is more general.
>

That's a good point.  I've removed this from the KIP. 

>
> 62. I am still not sure that you need controller.listeners to be a list.
> All listeners are already defined in listeners. To migrate from plaintext
> to SSL, one can configure listeners to have both plaintext and SSL. After
> that, one can just change controller.listeners from plaintext to SSL. This
> is similar to how to change the listener for inter broker connections.
> Also, controller.listener.name may be a more accurate name?
> 

The issue that I see here is that if you are running the controller and broker 
in the same JVM, any listeners you define in "listeners" will be used as regular 
broker listeners unless you also name them in controller.listeners.  Therefore, 
controller.listeners needs to be a list.

controller.listener.names does sound like a better name, though... I've updated 
it to that.
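
For illustration, a combined controller+broker process might then be configured 
roughly like this (listener names, hosts, and ports are only examples):

    listeners=CONTROLLER://broker0.example.com:9093,PLAINTEXT://broker0.example.com:9092
    controller.listener.names=CONTROLLER
    listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT

Here the CONTROLLER listener is reserved for the controller, while PLAINTEXT 
remains an ordinary broker listener; naming the controller listeners explicitly 
is what keeps the two sets apart.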

>
> 63. Regarding controller.id, I am trying to understand whether it's a
> required configuration or an optional one. From the KIP, it sounds like
> controller.id is optional. Then, if this is unset, it's not clear how the
> user will obtain the controller id for setting controller.connect.
>

If you specify broker.id but not controller.id, then controller.id will just be 
set to broker.id + 3000.  This was intended to make some configurations a 
little bit more concise to write.  You still do have to know the controller IDs 
when configuring the brokers, though.  If this seems confusing then I can just 
make it mandatory.
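
In other words, under that convention a broker config such as the following 
would be enough (the value is just an example):

    broker.id=2
    # controller.id is then implied to be 3002 (broker.id + 3000) unless set explicitly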
 
>
> 64. KIP-516 adds a flag in LeaderAndIsrRequest to indicate whether it
> includes all partitions or not. This will be used to clean up strayed logs.
> I was thinking how we will do the same thing once the controller moves to
> Raft based. One way to do that is on broker startup, it gets the HWM in the
> metadata log from the Raft leader and waits until its metadata catches up
> to HWM. At that point, any local log not seen in the metadata can be
> removed. Since the Fetch response returns the HWM, there seems to be enough
> APIs to achieve this.
> 

That's a very good point.  I added a note about this under Broker Startup.
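
For reference, a rough sketch of that startup check, with all names purely 
illustrative rather than taken from the actual implementation:

    // Sketch: on startup, wait until the local metadata log has caught up to the
    // high watermark reported by the Raft leader, then remove any local partition
    // directory that the metadata does not account for (a "stray" log).
    long highWatermark = raftClient.highWatermark();
    metadataListener.awaitOffset(highWatermark);
    for (TopicPartition partition : logManager.allLocalPartitions()) {
        if (!metadataCache.contains(partition)) {
            logManager.asyncDelete(partition);  // stray log from an older assignment
        }
    }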

best,
Colin

>
> Jun
> 
> 
> 
> On Thu, Sep 24, 2020 at 1:07 PM Colin McCabe  wrote:
> 
> > On Mon, Sep 21, 2020, at 18:13, Jun Rao wrote:
> > > Hi, Colin,
> > >
> > > Sorry for the late reply. A few more comments below.
> > >
> >
> > Hi Jun,
> >
> > Thanks for taking another look.
> >
> > >
> > > 50. Configurations
> > > 50.1 controller.listeners: It seems that a controller just needs one
> > > listener. 

Jenkins build is back to normal : Kafka » kafka-trunk-jdk8 #86

2020-09-24 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-09-24 Thread Boyang Chen
Hey Jason and Jun,

thanks for the reply. After some offline discussion, we have seen hassles around
upgrading and downgrading RPCs during redirection; it is error-prone to
coordinate all parties on choosing the correct version to handle. Instead, we
propose to bring back the EnvelopeRequest to solve this problem, embedding the
admin request data inside the envelope with a consistent version. The complete
workflow looks like:

1. broker authorizes all accesses and strips out rejected resources
2. broker forwards the authorized actions to the controller in an EnvelopeRequest
3. controller checks CLUSTER_ACTION permission on the envelope request
4. if the check passes, all actions in the request are assumed to be authorized
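
As a rough sketch of steps 3 and 4 on the controller side (the method and class
names below are hypothetical, not the proposed API):

    // Sketch of controller-side handling of an incoming EnvelopeRequest.
    if (!authorizer.authorize(requestContext, CLUSTER_ACTION, CLUSTER_RESOURCE)) {
        // Only brokers forwarding on behalf of clients may send envelopes.
        return envelopeErrorResponse(Errors.CLUSTER_AUTHORIZATION_FAILED);
    }
    // The forwarding broker already authorized the embedded request, so the
    // controller unwraps it and processes it without re-checking each resource.
    AbstractRequest embedded = parseEmbedded(envelope.requestData(),
                                             envelope.requestApiKey(),
                                             envelope.requestApiVersion());
    return handle(embedded);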

Also, we need to point out that we are not going backwards here. This workflow
restricts the Envelope RPC to the CLUSTER_ACTION permission to reduce the risk
of impersonation on a best-effort basis. Additionally, we are not proposing any
incompatible changes such as principal serialization. We shall still use the
split-and-join semantics we have already built to forward only the authorized
resources.

Let me know if this makes sense.

Boyang

On Thu, Sep 24, 2020 at 4:53 PM Jun Rao  wrote:

> Hi, Jason,
>
> Yes, the most important thing is to be able to avoid two rolling restarts
> in the future. If we have a path to achieve that down the road, the changes
> here are fine.
>
> Thanks,
>
> Jun
>
> On Thu, Sep 24, 2020 at 3:20 PM Jason Gustafson 
> wrote:
>
> > > One of the goals of KIP-584 (feature versioning) is that we can get rid
> > of
> > IBP in the future. So does this change prevent us from removing IBP in
> the
> > future?
> >
> > That is a good question. I think the problem here is that request
> > forwarding puts an expectation on api version support which covers more
> > than one broker. This is why the normal ApiVersions behavior doesn't
> work.
> > I thought about this a bit and haven't come up with a good alternative.
> One
> > thought I've been considering is letting the controller in the
> post-kip-500
> > world set the maximum range of api support for the cluster. However, even
> > then we would need some way to tell when the controller quorum itself is
> > ready to enable support for a new api version. My feeling is that we will
> > probably always need something like the IBP to control when it is safe to
> > expose versions of APIs which have a cross-broker dependence. However,
> > KIP-584 would still allow us to manage the IBP at the level of a feature
> so
> > that we don't need two rolling restarts anymore.
> >
> > Best,
> > Jason
> >
> >
> >
> >
> > On Thu, Sep 24, 2020 at 1:59 PM Jun Rao  wrote:
> >
> > > Hi, Boyang,
> > >
> > > One of the goals of KIP-584 (feature versioning) is that we can get rid
> > of
> > > IBP in the future. So does this change prevent us from removing IBP in
> > the
> > > future?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Sep 24, 2020 at 12:46 PM Jason Gustafson 
> > > wrote:
> > >
> > > > Hey Boyang,
> > > >
> > > > Thanks for the update. This seems like the best thing we can do. The
> > > > alternative would be to always ensure that the forwarded APIs are
> safe
> > > for
> > > > conversion between versions, but that would restrict the flexibility
> > that
> > > > the versioning is providing. It would also be a large effort to avoid
> > > > introducing regressions through conversion. Sadly this broadens the
> > scope
> > > > of the IBP, but in fact forwarded APIs are inter-broker APIs.
> > > >
> > > > Thanks,
> > > > Jason
> > > >
> > > > On Thu, Sep 24, 2020 at 9:23 AM Boyang Chen <
> > reluctanthero...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hey there,
> > > > >
> > > > > we spotted a necessary case to handle the redirect request
> > versioning,
> > > > and
> > > > > proposed the following changes:
> > > > >
> > > > > 1. For redirection RPCs (AlterConfig, Acl, Token etc), the
> > > corresponding
> > > > > allowed versions in the ApiVersionResponse will be affected by the
> > > entire
> > > > > cluster's versioning, not just the receiving broker, since we need
> to
> > > > > ensure the chosen version get properly handled by all parties. Thus
> > > from
> > > > > now on, RPC with redirection will be treated as inter-broker RPC,
> and
> > > any
> > > > > version bump for these RPCs has to go through IBP bump as well.
> > > > > ApiVersionResponse will take IBP into considerations for the
> > > redirection
> > > > > RPCs allowable versions.
> > > > >
> > > > > 2. We would do the best effort to maintain the same request version
> > for
> > > > > the entire admin client -> receiving broker -> controller broker
> > path,
> > > > but
> > > > > for old RPC versions, they may not have flexible fields introduced
> > yet.
> > > > > Thus, we would have to upgrade the RPC to the minimum version which
> > > > > supports flexible fields
> > > > > and add another tagged field in the header called
> > > > `OriginalRequestVersion`
> > > > > to help the

[jira] [Created] (KAFKA-10527) Voters should always initialize as followers

2020-09-24 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10527:
---

 Summary: Voters should always initialize as followers
 Key: KAFKA-10527
 URL: https://issues.apache.org/jira/browse/KAFKA-10527
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jason Gustafson
Assignee: Jason Gustafson


The current state initialization logic preserves whatever state the broker was 
in when it was shut down. In particular, if the node was previously a leader, it 
will remain a leader. This can be dangerous if we want to consider 
optimizations such as in KAFKA-10526 since the leader might lose unflushed data 
following the restart. It would be safer to always initialize as a follower so 
that a leader's tenure never crosses process restarts. This helps to guarantee 
the uniqueness of the (offset, epoch) tuple which the replication protocol 
depends on.
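
A minimal sketch of what such initialization could look like (names are 
illustrative only):

    // Sketch: on startup, keep the persisted epoch and vote but always start in
    // the follower role, so a leader's tenure can never span a process restart.
    ElectionState persisted = quorumStateStore.readElectionState();
    QuorumState initial = QuorumState.initializeAsFollower(
        persisted.epoch(), persisted.votedId());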



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10526) Explore performance impact of leader fsync deferral

2020-09-24 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10526:
---

 Summary: Explore performance impact of leader fsync deferral
 Key: KAFKA-10526
 URL: https://issues.apache.org/jira/browse/KAFKA-10526
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jason Gustafson


To commit a write, a majority of nodes must call fsync to ensure the data has 
been written to disk. An interesting optimization option to 
consider is letting the leader defer fsync until the high watermark is ready to 
be advanced. This potentially allows us to reduce the number of flushes on the 
leader.
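
A rough sketch of the idea (not actual Kafka code):

    // Sketch: instead of flushing on every append, the leader defers the fsync
    // and only forces the log to disk right before advancing the high watermark,
    // so one flush can cover many appended batches.
    log.append(records);                        // no flush here
    long candidateHighWatermark = majorityAckedOffset();
    if (candidateHighWatermark > highWatermark) {
        log.flush(candidateHighWatermark);      // single fsync for the whole range
        highWatermark = candidateHighWatermark;
    }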



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-09-24 Thread Jun Rao
Hi, Jason,

Yes, the most important thing is to be able to avoid two rolling restarts
in the future. If we have a path to achieve that down the road, the changes
here are fine.

Thanks,

Jun

On Thu, Sep 24, 2020 at 3:20 PM Jason Gustafson  wrote:

> > One of the goals of KIP-584 (feature versioning) is that we can get rid
> of
> IBP in the future. So does this change prevent us from removing IBP in the
> future?
>
> That is a good question. I think the problem here is that request
> forwarding puts an expectation on api version support which covers more
> than one broker. This is why the normal ApiVersions behavior doesn't work.
> I thought about this a bit and haven't come up with a good alternative. One
> thought I've been considering is letting the controller in the post-kip-500
> world set the maximum range of api support for the cluster. However, even
> then we would need some way to tell when the controller quorum itself is
> ready to enable support for a new api version. My feeling is that we will
> probably always need something like the IBP to control when it is safe to
> expose versions of APIs which have a cross-broker dependence. However,
> KIP-584 would still allow us to manage the IBP at the level of a feature so
> that we don't need two rolling restarts anymore.
>
> Best,
> Jason
>
>
>
>
> On Thu, Sep 24, 2020 at 1:59 PM Jun Rao  wrote:
>
> > Hi, Boyang,
> >
> > One of the goals of KIP-584 (feature versioning) is that we can get rid
> of
> > IBP in the future. So does this change prevent us from removing IBP in
> the
> > future?
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Sep 24, 2020 at 12:46 PM Jason Gustafson 
> > wrote:
> >
> > > Hey Boyang,
> > >
> > > Thanks for the update. This seems like the best thing we can do. The
> > > alternative would be to always ensure that the forwarded APIs are safe
> > for
> > > conversion between versions, but that would restrict the flexibility
> that
> > > the versioning is providing. It would also be a large effort to avoid
> > > introducing regressions through conversion. Sadly this broadens the
> scope
> > > of the IBP, but in fact forwarded APIs are inter-broker APIs.
> > >
> > > Thanks,
> > > Jason
> > >
> > > On Thu, Sep 24, 2020 at 9:23 AM Boyang Chen <
> reluctanthero...@gmail.com>
> > > wrote:
> > >
> > > > Hey there,
> > > >
> > > > we spotted a necessary case to handle the redirect request
> versioning,
> > > and
> > > > proposed the following changes:
> > > >
> > > > 1. For redirection RPCs (AlterConfig, Acl, Token etc), the
> > corresponding
> > > > allowed versions in the ApiVersionResponse will be affected by the
> > entire
> > > > cluster's versioning, not just the receiving broker, since we need to
> > > > ensure the chosen version get properly handled by all parties. Thus
> > from
> > > > now on, RPC with redirection will be treated as inter-broker RPC, and
> > any
> > > > version bump for these RPCs has to go through IBP bump as well.
> > > > ApiVersionResponse will take IBP into considerations for the
> > redirection
> > > > RPCs allowable versions.
> > > >
> > > > 2. We would do the best effort to maintain the same request version
> for
> > > > the entire admin client -> receiving broker -> controller broker
> path,
> > > but
> > > > for old RPC versions, they may not have flexible fields introduced
> yet.
> > > > Thus, we would have to upgrade the RPC to the minimum version which
> > > > supports flexible fields
> > > > and add another tagged field in the header called
> > > `OriginalRequestVersion`
> > > > to help the controller broker correctly deserialize the request with
> > the
> > > > original admin client sent out version. We would not downgrade the
> > > original
> > > > request in any circumstance, since the flexible field support is
> > required
> > > > to be open-ended on the high side.
> > > >
> > > > Let me know if you have any questions.
> > > >
> > > > Best,
> > > > Boyang
> > > >
> > > > On Thu, Aug 6, 2020 at 6:11 PM Boyang Chen <
> reluctanthero...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hey there,
> > > > >
> > > > > we are going to introduce a minor change to bump the version of
> > several
> > > > > RPCs which are currently not supporting flexible versions. It is
> > > > necessary
> > > > > because they need to be able to construct request header with
> initial
> > > > > principal name and client id as optional fields for redirection.
> The
> > > are
> > > > > only two of them:
> > > > >
> > > > > 1. AlterConfig
> > > > > 2. AlterClientQuotas
> > > > >
> > > > > Let me know if you have any questions.
> > > > >
> > > > > Boyang
> > > > >
> > > > > On Fri, Jul 31, 2020 at 11:42 AM Boyang Chen <
> > > reluctanthero...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > >> Hey David,
> > > > >>
> > > > >> After discussing with Colin offline, I would like to correct one
> > case
> > > in
> > > > >> the described workflow, where the CLUSTER_ACTION authorization
> would
> > > > not be
> > > > >> based on 

Re: [VOTE] KIP-671: Add method to Shutdown entire Streams Application

2020-09-24 Thread John Roesler
Thanks for the KIP, Walker!

I’m +1 (binding)

-John

On Mon, Sep 21, 2020, at 17:04, Guozhang Wang wrote:
> Thanks for finalizing the KIP. +1 (binding)
> 
> 
> Guozhang
> 
> On Mon, Sep 21, 2020 at 1:38 PM Walker Carlson 
> wrote:
> 
> > Hello all,
> >
> > I would like to start a thread to vote for KIP-671 to add a method to close
> > all clients in a kafka streams application.
> >
> > KIP:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-671%3A+Shutdown+Streams+Application+when+appropriate+exception+is+thrown
> >
> > Discussion thread: *here
> > <
> > https://mail-archives.apache.org/mod_mbox/kafka-dev/202009.mbox/%3CCAC55fuh3HAGCxz-PzxTJraczy6T-os2oiCV328PBeuJQSVYASg%40mail.gmail.com%3E
> > >*
> >
> > Thanks,
> > -Walker
> >
> 
> 
> -- 
> -- Guozhang
>


[jira] [Resolved] (KAFKA-8836) Add inter-broker protocol to alter ISR

2020-09-24 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-8836.

Fix Version/s: 2.7.0
 Assignee: David Arthur  (was: Jason Gustafson)
   Resolution: Fixed

> Add inter-broker protocol to alter ISR
> --
>
> Key: KAFKA-8836
> URL: https://issues.apache.org/jira/browse/KAFKA-8836
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: David Arthur
>Priority: Major
> Fix For: 2.7.0
>
>
> This tracks the implementation of KIP-497: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-497%3A+Add+inter-broker+API+to+alter+ISR].
>  Likely this will be broken down into sub-tasks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread Lucas Bradstreet
> 2. Part of the usage of the file is to have persistent storage of the
topic
ID and use it to compare with the ID supplied in the LeaderAndIsr Request.
There is some discussion in the KIP about changes to the directory
structure, but I believe directory changes were considered to be out of
scope when the KIP was written.


Yeah, I was hoping to get a better understanding of why it was taken out of
scope. Perhaps Lucas Bradstreet might have more insight about the decision.
Basically my point is that we have to create additional infrastructure here
to support the name/id mapping, so I wanted to understand if we just
consider this a sort of tech debt. It would be useful to cover how we
handle the case when this file gets corrupted. Seems like we just have to
trust that it matches whatever the controller tells us and rewrite it?


Hi Jason, Justine,

My thought process is that we will likely need the metadata file whether we
rename the directories or not.
Linux supports filenames of up to 255 bytes and that would not be enough to
support a directory name
 including both the name and topic ID. Given that fact, it seemed
reasonable to add the metadata file
and leave the directory rename until some time in the future (possibly
never).
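
To make the arithmetic concrete: assuming the current 249-character limit on
topic names, a directory name of the form <topic>-<uuid>-<partition> could need
roughly 249 + 1 + 36 + 1 + a few digits, around 290 bytes for the longest topic
names, which is already past the 255-byte limit.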

If the file does get corrupted I think you're right. We would either have
to trust it matches what the controller tells us
 or error out and let an administrator resolve it by checking across
replicas for consistency.

Lucas


On Thu, Sep 24, 2020 at 3:41 PM Jason Gustafson  wrote:

> Thanks Justine. Responses below:
>
> > 1. Yes, the directory will still be based on the topic names.
> LeaderAndIsrRequest is one of the few requests that will still contain the
> topic name. So I think we have this covered. Sorry for confusion.
>
> Ah, you're right. My eyes passed right over the field.
>
> > 2. Part of the usage of the file is to have persistent storage of the
> topic
> ID and use it to compare with the ID supplied in the LeaderAndIsr Request.
> There is some discussion in the KIP about changes to the directory
> structure, but I believe directory changes were considered to be out of
> scope when the KIP was written.
>
> Yeah, I was hoping to get a better understanding of why it was taken out of
> scope. Perhaps Lucas Bradstreet might have more insight about the decision.
> Basically my point is that we have to create additional infrastructure here
> to support the name/id mapping, so I wanted to understand if we just
> consider this a sort of tech debt. It would be useful to cover how we
> handle the case when this file gets corrupted. Seems like we just have to
> trust that it matches whatever the controller tells us and rewrite it?
>
> > 3. I think this is a good point, but I again I wonder about the scope of
> the KIP. Most of the changes mentioned in the KIP are for supporting topic
> deletion and I believe that is why the produce request was listed under
> future work.
>
> That's fair. I brought it up since `Fetch` is already included. If we've
> got `Metadata` and `Fetch`, seems we may as well do `Produce` and save an
> extra kip. No strong objection though if you want to leave it out.
>
>
> -Jason
>
>
> On Thu, Sep 24, 2020 at 3:26 PM Justine Olshan 
> wrote:
>
> > Hi Jason,
> >
> > Thanks for your comments.
> >
> > 1. Yes, the directory will still be based on the topic names.
> > LeaderAndIsrRequest is one of the few requests that will still contain
> the
> > topic name. So I think we have this covered. Sorry for confusion.
> >
> > 2. Part of the usage of the file is to have persistent storage of the
> topic
> > ID and use it to compare with the ID supplied in the LeaderAndIsr
> Request.
> > There is some discussion in the KIP about changes to the directory
> > structure, but I believe directory changes were considered to be out of
> > scope when the KIP was written.
> >
> > 3. I think this is a good point, but I again I wonder about the scope of
> > the KIP. Most of the changes mentioned in the KIP are for supporting
> topic
> > deletion and I believe that is why the produce request was listed under
> > future work.
> >
> > 4. This sounds like it might be a good solution, but I will need to
> discuss
> > more with KIP-500 folks to get the details right.
> >
> > Thanks,
> > Justine
> >
> > On Thu, Sep 24, 2020 at 12:30 PM Jason Gustafson 
> > wrote:
> >
> > > Hi Justine,
> > >
> > > Thanks for picking up this work. I have a few questions/comments:
> > >
> > > 1. It sounds like the directory structure is still going to be based on
> > > topic names. Do I have that right? One complication is that the
> > > LeaderAndIsr request does not include the topic name any longer. This
> > means
> > > that a replica must wait for the UpdateMetadata request to arrive with
> > the
> > > topic name mapping before it can create the directory. I wonder if it
> > would
> > > be simpler to avoid assumptions on the ordering of UpdateMetadata and
> let
> > > LeaderAndIsr include the topic name a

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-24 Thread Jun Rao
Hi, Colin,

Thanks for the reply and the updated KIP. A few more comments below.

53. It seems that you already incorporated the changes in KIP-516. With
topic ids, we don't need to wait for the topic's data to be deleted before
removing the topic metadata. If the topic is recreated, we can still delete
the data properly based on the topic id. So, it seems that we can remove
TopicRecord.Deleting.

55. It seems to me that the current behavior where we favor the current
broker registration is better. This is because uncontrolled broker shutdown
is rare and its impact is less severe since one just needs to wait for the
session timeout before restarting the broker. If a mis-configured broker
replaces an existing broker, the consequence is more severe since it can
cause the leader to be unavailable, a replica to be out of ISR, and add
more load on the leaders etc.

60. controller.connect=0@controller0.example.com:9093,
1@controller1.example.com:9093,2@controller2.example.com : Do we need to
include the controller id before @? It seems that the host/port is enough
for establishing the connection. It would also be useful to make it clear
that controller.connect replaces quorum.voters in KIP-595.

61. I am not sure that we need both controller.listeners and
controller.connect.security.protocol since the former implies the security
protocol. The reason that we have both inter.broker.listener.name and
inter.broker.security.protocol is because we had the latter first and later
realized that the former is more general.

62. I am still not sure that you need controller.listeners to be a list.
All listeners are already defined in listeners. To migrate from plaintext
to SSL, one can configure listeners to have both plaintext and SSL. After
that, one can just change controller.listeners from plaintext to SSL. This
is similar to how to change the listener for inter broker connections.
Also, controller.listener.name may be a more accurate name?

63. Regarding controller.id, I am trying to understand whether it's a
required configuration or an optional one. From the KIP, it sounds like
controller.id is optional. Then, if this is unset, it's not clear how the
user will obtain the controller id for setting controller.connect.

64. KIP-516 adds a flag in LeaderAndIsrRequest to indicate whether it
includes all partitions or not. This will be used to clean up strayed logs.
I was thinking how we will do the same thing once the controller moves to
Raft based. One way to do that is on broker startup, it gets the HWM in the
metadata log from the Raft leader and waits until its metadata catches up
to HWM. At that point, any local log not seen in the metadata can be
removed. Since the Fetch response returns the HWM, there seems to be enough
APIs to achieve this.

Jun



On Thu, Sep 24, 2020 at 1:07 PM Colin McCabe  wrote:

> On Mon, Sep 21, 2020, at 18:13, Jun Rao wrote:
> > Hi, Colin,
> >
> > Sorry for the late reply. A few more comments below.
> >
>
> Hi Jun,
>
> Thanks for taking another look.
>
> >
> > 50. Configurations
> > 50.1 controller.listeners: It seems that a controller just needs one
> > listener. Why do we need to have a list here? Also, could you provide an
> > example of how this is set and what's its relationship with existing
> > configs such as "security.inter.broker.protocol" and "
> > inter.broker.listener.name"?
> >
>
> I agree that most administrators will want to run with only one controller
> listener.  However, just as with brokers, it is nice to have the option to
> expose multiple ports.
>
> One reason why you might want multiple ports is if you were doing a
> migration from plaintext to SSL.  You could add new SSL listeners but
> continue to expose the PLAINTEXT listeners.  Then you could add the new SSL
> controller config to each broker and roll each broker.  Then you could
> migrate from PLAINTEXT to SSL with no downtime.
>
> Here's an example configuration for the controller:
> controller.connect=0@controller0.example.com:9093,
> 1@controller1.example.com:9093,2@controller2.example.com
> controller.listeners=CONTROLLER
> listeners=CONTROLLER://controller0.example.com:9093
> listener.security.protocol.map=CONTROLLER:SSL
>
> Here's an example configuration for the broker:
> controller.connect=0@controller0.example.com:9093,
> 1@controller1.example.com:9093,2@controller2.example.com
> controller.connect.security.protocol=SSL
>
> security.inter.broker.protocol or inter.broker.listener.name do not
> affect how the broker communicates with the controller.  Those
> configurations relate to how brokers communicate with each other, but the
> controller is not a broker.
>
> (Note that I just added controller.connect.security.protocol to the KIP --
> I had forgotten to put it in earlier)
>
> >
> > 50.2 registration.heartbeat.interval.ms and
> registration.lease.timeout.ms.
> > Should we match their default value with the corresponding default for
> ZK?
> >
>
> Fair enough.  I'll set them

[DISCUSS] KIP-673: Emit JSONs with new auto-generated schema

2020-09-24 Thread Anastasia Vela
Hi all,

I'd like to discuss KIP-673:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-673%3A+Emit+JSONs+with+new+auto-generated+schema

This is a proposal to change the format of request and response traces to
JSON, which would be easier to load and parse, because the current format
is only JSON-like and not easily parsable.

Let me know what you think,
Anastasia


Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread Jason Gustafson
Thanks Justine. Responses below:

> 1. Yes, the directory will still be based on the topic names.
LeaderAndIsrRequest is one of the few requests that will still contain the
topic name. So I think we have this covered. Sorry for confusion.

Ah, you're right. My eyes passed right over the field.

> 2. Part of the usage of the file is to have persistent storage of the
topic
ID and use it to compare with the ID supplied in the LeaderAndIsr Request.
There is some discussion in the KIP about changes to the directory
structure, but I believe directory changes were considered to be out of
scope when the KIP was written.

Yeah, I was hoping to get a better understanding of why it was taken out of
scope. Perhaps Lucas Bradstreet might have more insight about the decision.
Basically my point is that we have to create additional infrastructure here
to support the name/id mapping, so I wanted to understand if we just
consider this a sort of tech debt. It would be useful to cover how we
handle the case when this file gets corrupted. Seems like we just have to
trust that it matches whatever the controller tells us and rewrite it?

> 3. I think this is a good point, but I again I wonder about the scope of
the KIP. Most of the changes mentioned in the KIP are for supporting topic
deletion and I believe that is why the produce request was listed under
future work.

That's fair. I brought it up since `Fetch` is already included. If we've
got `Metadata` and `Fetch`, seems we may as well do `Produce` and save an
extra kip. No strong objection though if you want to leave it out.


-Jason


On Thu, Sep 24, 2020 at 3:26 PM Justine Olshan  wrote:

> Hi Jason,
>
> Thanks for your comments.
>
> 1. Yes, the directory will still be based on the topic names.
> LeaderAndIsrRequest is one of the few requests that will still contain the
> topic name. So I think we have this covered. Sorry for confusion.
>
> 2. Part of the usage of the file is to have persistent storage of the topic
> ID and use it to compare with the ID supplied in the LeaderAndIsr Request.
> There is some discussion in the KIP about changes to the directory
> structure, but I believe directory changes were considered to be out of
> scope when the KIP was written.
>
> 3. I think this is a good point, but I again I wonder about the scope of
> the KIP. Most of the changes mentioned in the KIP are for supporting topic
> deletion and I believe that is why the produce request was listed under
> future work.
>
> 4. This sounds like it might be a good solution, but I will need to discuss
> more with KIP-500 folks to get the details right.
>
> Thanks,
> Justine
>
> On Thu, Sep 24, 2020 at 12:30 PM Jason Gustafson 
> wrote:
>
> > Hi Justine,
> >
> > Thanks for picking up this work. I have a few questions/comments:
> >
> > 1. It sounds like the directory structure is still going to be based on
> > topic names. Do I have that right? One complication is that the
> > LeaderAndIsr request does not include the topic name any longer. This
> means
> > that a replica must wait for the UpdateMetadata request to arrive with
> the
> > topic name mapping before it can create the directory. I wonder if it
> would
> > be simpler to avoid assumptions on the ordering of UpdateMetadata and let
> > LeaderAndIsr include the topic name as well. Feels like we are not saving
> > that much by excluding it.
> >
> > 2. On a related note, it seems that the reason we have the partition
> > metadata file is because we are not changing the directory structure. We
> > need it so that we remember which directories map to which topic id. I
> > think it would be useful to add some detail about why changing the
> > directory layout was rejected. Basically trying to understand how much
> > effort we are saving by skipping this step if we have to implement this
> > additional bookkeeping. One concern I have for example is that the
> > partition metadata file gets corrupt and then all bets are off as far as
> > topic consistency.
> >
> > 3. Is it worth adding support now for topic ids in the `Produce` request
> > now? Overhead is mentioned as one of the motivations and the best APIs to
> > get savings from are `Produce` and `Fetch`.
> >
> > 4. When it comes to bootstrapping the metadata topic with respect to
> > KIP-500, one suggestion would be to reserve a sentinel topic ID which can
> > be used initially by new brokers. When the first controller is elected,
> it
> > can assign a topicId to the metadata topic. This is a bit similar to how
> we
> > were planning to generate the clusterId.
> >
> > Thanks,
> > Jason
> >
> >
> > On Thu, Sep 24, 2020 at 11:10 AM Jun Rao  wrote:
> >
> > > Hi, Justine,
> > >
> > > Thanks for the updated KIP. A few more comments below.
> > >
> > > 30.  {"name": "id", "type": "string", "doc": "version id"}}: The doc
> > should
> > > say UUID.
> > >
> > > 31. LeaderAndIsrResponse v5 and StopReplicaResponse v4 : It seems there
> > is
> > > no need to add topic_id at partitions level.
> > >
> > 

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread Justine Olshan
Hi Jason,

Thanks for your comments.

1. Yes, the directory will still be based on the topic names.
LeaderAndIsrRequest is one of the few requests that will still contain the
topic name. So I think we have this covered. Sorry for confusion.

2. Part of the usage of the file is to have persistent storage of the topic
ID and use it to compare with the ID supplied in the LeaderAndIsr Request.
There is some discussion in the KIP about changes to the directory
structure, but I believe directory changes were considered to be out of
scope when the KIP was written.

3. I think this is a good point, but again I wonder about the scope of
the KIP. Most of the changes mentioned in the KIP are for supporting topic
deletion and I believe that is why the produce request was listed under
future work.

4. This sounds like it might be a good solution, but I will need to discuss
more with KIP-500 folks to get the details right.

Thanks,
Justine

On Thu, Sep 24, 2020 at 12:30 PM Jason Gustafson  wrote:

> Hi Justine,
>
> Thanks for picking up this work. I have a few questions/comments:
>
> 1. It sounds like the directory structure is still going to be based on
> topic names. Do I have that right? One complication is that the
> LeaderAndIsr request does not include the topic name any longer. This means
> that a replica must wait for the UpdateMetadata request to arrive with the
> topic name mapping before it can create the directory. I wonder if it would
> be simpler to avoid assumptions on the ordering of UpdateMetadata and let
> LeaderAndIsr include the topic name as well. Feels like we are not saving
> that much by excluding it.
>
> 2. On a related note, it seems that the reason we have the partition
> metadata file is because we are not changing the directory structure. We
> need it so that we remember which directories map to which topic id. I
> think it would be useful to add some detail about why changing the
> directory layout was rejected. Basically trying to understand how much
> effort we are saving by skipping this step if we have to implement this
> additional bookkeeping. One concern I have for example is that the
> partition metadata file gets corrupt and then all bets are off as far as
> topic consistency.
>
> 3. Is it worth adding support now for topic ids in the `Produce` request
> now? Overhead is mentioned as one of the motivations and the best APIs to
> get savings from are `Produce` and `Fetch`.
>
> 4. When it comes to bootstrapping the metadata topic with respect to
> KIP-500, one suggestion would be to reserve a sentinel topic ID which can
> be used initially by new brokers. When the first controller is elected, it
> can assign a topicId to the metadata topic. This is a bit similar to how we
> were planning to generate the clusterId.
>
> Thanks,
> Jason
>
>
> On Thu, Sep 24, 2020 at 11:10 AM Jun Rao  wrote:
>
> > Hi, Justine,
> >
> > Thanks for the updated KIP. A few more comments below.
> >
> > 30.  {"name": "id", "type": "string", "doc": "version id"}}: The doc
> should
> > say UUID.
> >
> > 31. LeaderAndIsrResponse v5 and StopReplicaResponse v4 : It seems there
> is
> > no need to add topic_id at partitions level.
> >
> > 32. Regarding partition metadata file. Perhaps the key can be a single
> > word, sth like the following.
> > version: 0
> > topic_id: 46bdb63f-9e8d-4a38-bf7b-ee4eb2a794e4
> >
> > 33. Another tricky thing that I realized is how to support the metadata
> > topic introduced in KIP-595. It's a bit tricky to assign a UUID to the
> > metadata topic since we have a chicken-and-egg problem. The controller
> > needs to persist the UUID in the metadata topic in order to assign one
> > successfully, but the metadata topic is needed to elect a controller
> > (KIP-631). So, this probably needs a bit more thought.
> >
> > Jun
> >
> > On Thu, Sep 24, 2020 at 3:04 AM Ismael Juma  wrote:
> >
> > > Also, can we provide more details on how the Partition Metadata file
> will
> > > be used?
> > >
> > > Ismael
> > >
> > > On Thu, Sep 24, 2020 at 3:01 AM Ismael Juma  wrote:
> > >
> > > > Hi Justine,
> > > >
> > > > I think we need to update the "Rejected Alternatives" section to take
> > > into
> > > > account that the proposal now removes the topic name from the fetch
> > > > request. Also, if we are removing it from the Fetch request, does it
> > make
> > > > sense not to remove it from similar requests like ListOffsetRequest?
> > > >
> > > > Ismael
> > > >
> > > > On Thu, Sep 24, 2020 at 2:46 AM David Jacot 
> > wrote:
> > > >
> > > >> Hi Justine,
> > > >>
> > > >> Thanks for the KIP. I finally had time to read it :). It is a great
> > > >> improvement.
> > > >>
> > > >> I have a few comments/questions:
> > > >>
> > > >> 1. It seems that the schema of the StopReplicaRequest is slightly
> > > >> outdated.
> > > >> We
> > > >> did some changes as part of KIP-570. V3 is already organized by
> > topics.
> > > >>
> > > >> 2. I just want to make sure that I understand the reconciliation
> > > >> logic corr

Re: [VOTE] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-09-24 Thread Jason Gustafson
> One of the goals of KIP-584 (feature versioning) is that we can get rid of
IBP in the future. So does this change prevent us from removing IBP in the
future?

That is a good question. I think the problem here is that request
forwarding puts an expectation on api version support which covers more
than one broker. This is why the normal ApiVersions behavior doesn't work.
I thought about this a bit and haven't come up with a good alternative. One
thought I've been considering is letting the controller in the post-kip-500
world set the maximum range of api support for the cluster. However, even
then we would need some way to tell when the controller quorum itself is
ready to enable support for a new api version. My feeling is that we will
probably always need something like the IBP to control when it is safe to
expose versions of APIs which have a cross-broker dependence. However,
KIP-584 would still allow us to manage the IBP at the level of a feature so
that we don't need two rolling restarts anymore.

Best,
Jason




On Thu, Sep 24, 2020 at 1:59 PM Jun Rao  wrote:

> Hi, Boyang,
>
> One of the goals of KIP-584 (feature versioning) is that we can get rid of
> IBP in the future. So does this change prevent us from removing IBP in the
> future?
>
> Thanks,
>
> Jun
>
> On Thu, Sep 24, 2020 at 12:46 PM Jason Gustafson 
> wrote:
>
> > Hey Boyang,
> >
> > Thanks for the update. This seems like the best thing we can do. The
> > alternative would be to always ensure that the forwarded APIs are safe
> for
> > conversion between versions, but that would restrict the flexibility that
> > the versioning is providing. It would also be a large effort to avoid
> > introducing regressions through conversion. Sadly this broadens the scope
> > of the IBP, but in fact forwarded APIs are inter-broker APIs.
> >
> > Thanks,
> > Jason
> >
> > On Thu, Sep 24, 2020 at 9:23 AM Boyang Chen 
> > wrote:
> >
> > > Hey there,
> > >
> > > we spotted a necessary case to handle the redirect request versioning,
> > and
> > > proposed the following changes:
> > >
> > > 1. For redirection RPCs (AlterConfig, Acl, Token etc), the
> corresponding
> > > allowed versions in the ApiVersionResponse will be affected by the
> entire
> > > cluster's versioning, not just the receiving broker, since we need to
> > > ensure the chosen version get properly handled by all parties. Thus
> from
> > > now on, RPC with redirection will be treated as inter-broker RPC, and
> any
> > > version bump for these RPCs has to go through IBP bump as well.
> > > ApiVersionResponse will take IBP into considerations for the
> redirection
> > > RPCs allowable versions.
> > >
> > > 2. We would do the best effort to maintain the same request version for
> > > the entire admin client -> receiving broker -> controller broker path,
> > but
> > > for old RPC versions, they may not have flexible fields introduced yet.
> > > Thus, we would have to upgrade the RPC to the minimum version which
> > > supports flexible fields
> > > and add another tagged field in the header called
> > `OriginalRequestVersion`
> > > to help the controller broker correctly deserialize the request with
> the
> > > original admin client sent out version. We would not downgrade the
> > original
> > > request in any circumstance, since the flexible field support is
> required
> > > to be open-ended on the high side.
> > >
> > > Let me know if you have any questions.
> > >
> > > Best,
> > > Boyang
> > >
> > > On Thu, Aug 6, 2020 at 6:11 PM Boyang Chen  >
> > > wrote:
> > >
> > > > Hey there,
> > > >
> > > > we are going to introduce a minor change to bump the version of
> several
> > > > RPCs which are currently not supporting flexible versions. It is
> > > necessary
> > > > because they need to be able to construct request header with initial
> > > > principal name and client id as optional fields for redirection. The
> > are
> > > > only two of them:
> > > >
> > > > 1. AlterConfig
> > > > 2. AlterClientQuotas
> > > >
> > > > Let me know if you have any questions.
> > > >
> > > > Boyang
> > > >
> > > > On Fri, Jul 31, 2020 at 11:42 AM Boyang Chen <
> > reluctanthero...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > >> Hey David,
> > > >>
> > > >> After discussing with Colin offline, I would like to correct one
> case
> > in
> > > >> the described workflow, where the CLUSTER_ACTION authorization would
> > > not be
> > > >> based on the initial principal field check, because it is not a
> > secured
> > > >> condition which anyone could forge. The revised workflow shall be:
> > > >>
> > > >> Step 1. Filter out resources that are authorized
> > > >>  1.1 Use traditional principals to verify first. If
> > authorized,
> > > >> continue
> > > >>  1.2 If not authorized, check whether the request is from
> the
> > > >> control plane. Note that this is a best-effort to verify whether the
> > > >> request is internal.
> > > >>  1.3 If the request is not from the control plane, return
> > > >> aut

[jira] [Created] (KAFKA-10525) Emit JSONs with new auto-generated schema

2020-09-24 Thread Anastasia Vela (Jira)
Anastasia Vela created KAFKA-10525:
--

 Summary: Emit JSONs with new auto-generated schema
 Key: KAFKA-10525
 URL: https://issues.apache.org/jira/browse/KAFKA-10525
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Anastasia Vela
Assignee: Anastasia Vela


Kafka's request and response traces are currently output in a format that is 
only JSON-like and not easily parsable. These are currently emitted by 
RequestChannel when logging is turned on at DEBUG level. Structured logs will 
be easier to load and parse with other tools like jq, Elasticsearch, Druid or 
Presto. 

There is a new auto-generated schema for each request type that supports 
outputting request and response payloads as JSON. These can be adapted to 
provide structured request tracing.
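
Purely for illustration, a structured trace for one request/response pair might 
look something like the following; the field names here are only a sketch, not 
what the KIP specifies:

    {"requestHeader": {"requestApiKey": 1, "requestApiVersion": 12,
                       "clientId": "consumer-1", "correlationId": 42},
     "request": {"replicaId": -1, "maxWaitMs": 500},
     "response": {"throttleTimeMs": 0},
     "connection": "127.0.0.1:9092-127.0.0.1:53514",
     "totalTimeMs": 1.23}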



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-09-24 Thread Jun Rao
Hi, Boyang,

One of the goals of KIP-584 (feature versioning) is that we can get rid of
IBP in the future. So does this change prevent us from removing IBP in the
future?

Thanks,

Jun

On Thu, Sep 24, 2020 at 12:46 PM Jason Gustafson  wrote:

> Hey Boyang,
>
> Thanks for the update. This seems like the best thing we can do. The
> alternative would be to always ensure that the forwarded APIs are safe for
> conversion between versions, but that would restrict the flexibility that
> the versioning is providing. It would also be a large effort to avoid
> introducing regressions through conversion. Sadly this broadens the scope
> of the IBP, but in fact forwarded APIs are inter-broker APIs.
>
> Thanks,
> Jason
>
> On Thu, Sep 24, 2020 at 9:23 AM Boyang Chen 
> wrote:
>
> > Hey there,
> >
> > we spotted a necessary case to handle the redirect request versioning,
> and
> > proposed the following changes:
> >
> > 1. For redirection RPCs (AlterConfig, Acl, Token etc), the corresponding
> > allowed versions in the ApiVersionResponse will be affected by the entire
> > cluster's versioning, not just the receiving broker, since we need to
> > ensure the chosen version get properly handled by all parties. Thus from
> > now on, RPC with redirection will be treated as inter-broker RPC, and any
> > version bump for these RPCs has to go through IBP bump as well.
> > ApiVersionResponse will take IBP into considerations for the redirection
> > RPCs allowable versions.
> >
> > 2. We would do the best effort to maintain the same request version for
> > the entire admin client -> receiving broker -> controller broker path,
> but
> > for old RPC versions, they may not have flexible fields introduced yet.
> > Thus, we would have to upgrade the RPC to the minimum version which
> > supports flexible fields
> > and add another tagged field in the header called
> `OriginalRequestVersion`
> > to help the controller broker correctly deserialize the request with the
> > original admin client sent out version. We would not downgrade the
> original
> > request in any circumstance, since the flexible field support is required
> > to be open-ended on the high side.
> >
> > Let me know if you have any questions.
> >
> > Best,
> > Boyang
> >
> > On Thu, Aug 6, 2020 at 6:11 PM Boyang Chen 
> > wrote:
> >
> > > Hey there,
> > >
> > > we are going to introduce a minor change to bump the version of several
> > > RPCs which are currently not supporting flexible versions. It is
> > necessary
> > > because they need to be able to construct request header with initial
> > > principal name and client id as optional fields for redirection. The
> are
> > > only two of them:
> > >
> > > 1. AlterConfig
> > > 2. AlterClientQuotas
> > >
> > > Let me know if you have any questions.
> > >
> > > Boyang
> > >
> > > On Fri, Jul 31, 2020 at 11:42 AM Boyang Chen <
> reluctanthero...@gmail.com
> > >
> > > wrote:
> > >
> > >> Hey David,
> > >>
> > >> After discussing with Colin offline, I would like to correct one case
> in
> > >> the described workflow, where the CLUSTER_ACTION authorization would
> > not be
> > >> based on the initial principal field check, because it is not a
> secured
> > >> condition which anyone could forge. The revised workflow shall be:
> > >>
> > >> Step 1. Filter out resources that are authorized
> > >>  1.1 Use traditional principals to verify first. If
> authorized,
> > >> continue
> > >>  1.2 If not authorized, check whether the request is from the
> > >> control plane. Note that this is a best-effort to verify whether the
> > >> request is internal.
> > >>  1.3 If the request is not from the control plane, return
> > >> authorization failure
> > >>  1.4 If the request is from the control plane, use
> > CLUSTER_ACTION
> > >> to verify and determine the result
> > >>
> > >> Step 2. Check the request context to see if this is a forwarding
> > request,
> > >> by checking whether it is from control plane and uses extra header
> > fields
> > >> 2.1 if the resource is authorized, and if this is the active
> > >> controller, process it
> > >> 2.2 if the resource is authorized but this is not the active
> > >> controller, return NOT_CONTROLLER to the sender (forwarding broker)
> for
> > >> retry
> > >> 2.3 if the resource is not authorized, return
> > >> CLUSTER_AUTHORIZATION_FAILURE to propagate back to the original client
> > >> through forwarding broker
> > >> Step 3. If the request is not a forwarding request
> > >> 3.1 If the resource is authorized, and this is the active
> > >> controller, process it
> > >> 3.2 If the resource is authorized, but this is not active
> > >> controller, put the resource into the preparation for a new
> AlterConfig
> > >> request for forwarding
> > >> 3.3 If the resource is not authorized, reply the original
> client
> > >> AUTHORIZATION_FAILURE when the forwarding request is returned
> > >>
> > >> On Thu, Jul 30, 2020 at 3:47 PM B

Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-09-24 Thread Jason Gustafson
Thanks Jose. Makes sense overall. A few specific responses below:

> Generally the number of snapshots on disk will be one. I suspect that
users will want some control over this. We can add a configuration
option that doesn't delete, or advances the log begin offset past, the
N latest snapshots. We can set the default value for this
configuration to two. What do you think?

I know Zookeeper has a config like this, but I'm not sure how frequently it
is used. I would probably suggest we pick a good number of snapshots (maybe
just 1-2) and leave it out of the configs.

> We could use the same configuration we have for Fetch but to avoid
confusion let's add two more configurations for
"replica.fetch.snapshot.max.bytes" and
"replica.fetch.snapshot.response.max.bytes".

My vote would probably be to reuse the existing configs. We can add new
configs in the future if the need emerges, but I can't think of a good
reason why a user would want these to be different.

By the way, it looks like the FetchSnapshot schema now has both a partition
level and a top level max bytes. Do we need both?

> The snapshot epoch will be used when ordering snapshots and more
importantly when setting the LastFetchedEpoch in the Fetch request. It
is possible for a follower to have a snapshot and an empty log. In
this case the follower will use the epoch of the snapshot when setting
the LastFetchEpoch in the Fetch request.

Just to be clear, I think it is important to include the snapshot epoch so
that we /can/ reason about the snapshot state in the presence of data loss.
However, if we excluded data loss, then this would strictly speaking be
unnecessary because a snapshot offset would always be uniquely defined
(since we do not snapshot above the high watermark). Hence it would be safe
to leave LastFetchedEpoch undefined. Anyway, I think we're on the same page
about the behavior, just thought it might be useful to clarify the
reasoning.

Thanks,
Jason



On Thu, Sep 24, 2020 at 1:19 PM Jose Garcia Sancio 
wrote:

> Thanks for the feedback Jason.
>
> I have made the following changes to the KIP:
> 1. Better explanation of how followers will manage snapshots and the
> replicated log. This includes the necessary changes when granting or
> requesting votes.
> 2. How the Snapshot's epoch will be used for the LastFetchEpoch in the
> Fetch request.
> 3. New configuration options.
> 4. Changed the Fetch response to match the latest changes in KIP-595.
> 5. Changed the FetchSnapshot request to include total response max bytes.
> 6. Changed the FetchSnapshot response to return the snapshot size
> instead of the "Continue" field.
>
> Diff:
>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158864763&selectedPageVersions=24&selectedPageVersions=23
>
> > 1. There is a comment in the proposal which suggests that we will
> maintain
> > multiple snapshots:
> >
> > > Having multiple snapshots is useful for minimizing re-fetching of the
> > snapshot when a new snapshot is generated.
> >
> > However, the document later says that snapshots get deleted as the LBO
> > advances. Just wanted to clarify the intent. Will we generally only have
> > one snapshot?
>
> Generally the number of snapshots on disk will be one. I suspect that
> users will want some control over this. We can add a configuration
> option that doesn't delete, or advances the log begin offset past, the
> N latest snapshots. We can set the default value for this
> configuration to two. What do you think?
>
> > 2. The proposal says the following:
> >
> > > During leader election, followers with incomplete or missing snapshot
> > will send a vote request and response as if they had an empty log.
> >
> > Maybe you can help me understand the scenario we're talking about since
> I'm
> > not sure I understand the point of this. If the intent is to not allow
> such
> > a follower to become leader, why would it ever become a candidate? On the
> > other hand, if the intent is to still allow it to become leader in some
> > disaster scenario, then why would it not use its latest log state? For
> > inbound Vote requests, I think it should definitely still consider its
> > latest log state when deciding whether to grant a vote.
>
> Conceptually followers will implement this algorithm:
> 1. Follower sends fetch request
> 2. Leader replies with snapshot epoch and offset
> 3. Follower pauses fetch
> 4. Follower fetches the snapshot
> 5. Follower resume fetch by
> A. Setting the LBO to the snapshot offset plus one
> B. Setting the LEO or fetch offset in the fetch request to the
> snapshot offset plus one
> C. Uses the snapshot epoch as the last fetched epoch in the fetch
> request.
>
> The problem I was trying to address is what is the state of the
> follower between bullet 4 and 5? Let's assume that the snapshot fetch
> in bullet 4 has an epoch of E and an offset of O. The follower can
> have the following state on disk after bullet 4:
>
> 1. A snapshot with offset O a

Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-09-24 Thread Jose Garcia Sancio
Thanks for the feedback Jason.

I have made the following changes to the KIP:
1. Better explanation of how followers will manage snapshots and the
replicated log. This includes the necessary changes when granting or
requesting votes.
2. How the Snapshot's epoch will be used for the LastFetchEpoch in the
Fetch request.
3. New configuration options.
4. Changed the Fetch response to match the latest changes in KIP-595.
5. Changed the FetchSnapshot request to include total response max bytes.
6. Changed the FetchSnapshot response to return the snapshot size
instead of the "Continue" field.

Diff:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158864763&selectedPageVersions=24&selectedPageVersions=23

> 1. There is a comment in the proposal which suggests that we will maintain
> multiple snapshots:
>
> > Having multiple snapshots is useful for minimizing re-fetching of the
> snapshot when a new snapshot is generated.
>
> However, the document later says that snapshots get deleted as the LBO
> advances. Just wanted to clarify the intent. Will we generally only have
> one snapshot?

Generally the number of snapshots on disk will be one. I suspect that
users will want some control over this. We can add a configuration
option that doesn't delete, or advance the log begin offset past, the
N latest snapshots. We can set the default value for this
configuration to two. What do you think?

> 2. The proposal says the following:
>
> > During leader election, followers with incomplete or missing snapshot
> will send a vote request and response as if they had an empty log.
>
> Maybe you can help me understand the scenario we're talking about since I'm
> not sure I understand the point of this. If the intent is to not allow such
> a follower to become leader, why would it ever become a candidate? On the
> other hand, if the intent is to still allow it to become leader in some
> disaster scenario, then why would it not use its latest log state? For
> inbound Vote requests, I think it should definitely still consider its
> latest log state when deciding whether to grant a vote.

Conceptually followers will implement this algorithm:
1. Follower sends fetch request
2. Leader replies with snapshot epoch and offset
3. Follower pauses fetch
4. Follower fetches the snapshot
5. Follower resumes fetch by
A. Setting the LBO to the snapshot offset plus one
B. Setting the LEO or fetch offset in the fetch request to the
snapshot offset plus one
C. Using the snapshot epoch as the last fetched epoch in the fetch request.

The problem I was trying to address is what is the state of the
follower between bullet 4 and 5? Let's assume that the snapshot fetch
in bullet 4 has an epoch of E and an offset of O. The follower can
have the following state on disk after bullet 4:

1. A snapshot with offset O and epoch E.
2. Many snapshots older/less than offset O and epoch E.
3. A replicated log with LEO older/less than offset O and epoch E.

In this case, when the follower grants a vote or becomes a candidate, it
should use the latest of all of these, which is (1.), the fetched
snapshot with offset O and epoch E.
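
To make the ordering concrete, here is a minimal Java sketch (all names are
hypothetical, not from the KIP) of how a follower could pick the state it
advertises when granting votes or becoming a candidate: the newer of its
fetched snapshot end and its log end, comparing epochs first and offsets only
as a tie-breaker.

final class ElectionStateSketch {

    static final class OffsetAndEpoch {
        final long offset;
        final int epoch;

        OffsetAndEpoch(long offset, int epoch) {
            this.offset = offset;
            this.epoch = epoch;
        }

        boolean isNewerThan(OffsetAndEpoch other) {
            // Epochs are compared first; the offset only breaks ties within the same epoch.
            if (epoch != other.epoch) {
                return epoch > other.epoch;
            }
            return offset > other.offset;
        }
    }

    // After step 4 above, the fetched snapshot (offset O, epoch E) can be ahead of the
    // local replicated log, so votes are granted/requested based on whichever is newer.
    static OffsetAndEpoch latestForElection(OffsetAndEpoch snapshotEnd, OffsetAndEpoch logEnd) {
        return snapshotEnd.isNewerThan(logEnd) ? snapshotEnd : logEnd;
    }
}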

I updated the KIP to include this description.

> 3. Are we overloading `replica.fetch.max.bytes` for snapshot fetches as
> well? It looks like we are specifying this at the partition level, but it
> might be more useful to track the maximum bytes at the request level. On a
> related note, it might be useful to think through heuristics for balancing
> between the requests in a partition. Unlike fetches, it seems like we'd
> want to complete snapshot loading partition by partition. I wonder if it
> would be simpler for FetchSnapshot to handle just one partition.

We could use the same configuration we have for Fetch but to avoid
confusion let's add two more configurations for
"replica.fetch.snapshot.max.bytes" and
"replica.fetch.snapshot.response.max.bytes".

> 4. It would help if the document motivated the need to track the snapshot
> epoch. Since we are only snapshotting below the high watermark, are you
> thinking about recovering from data loss scenarios?

I added the following paragraph to the KIP:

The snapshot epoch will be used when ordering snapshots and more
importantly when setting the LastFetchedEpoch in the Fetch request. It
is possible for a follower to have a snapshot and an empty log. In
this case, the follower will use the epoch of the snapshot when setting
the LastFetchedEpoch in the Fetch request.
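
As an illustration only (hypothetical names, not the KIP's code), the
follower's next Fetch position could be derived like this, falling back to the
snapshot's end offset and epoch when the log is empty:

final class FetchPositionSketch {

    static final class FetchPosition {
        final long fetchOffset;       // offset to request in the next Fetch
        final int lastFetchedEpoch;   // value reported as LastFetchedEpoch

        FetchPosition(long fetchOffset, int lastFetchedEpoch) {
            this.fetchOffset = fetchOffset;
            this.lastFetchedEpoch = lastFetchedEpoch;
        }
    }

    static FetchPosition nextFetchPosition(long logEndOffset, int logLastEpoch,
                                           long snapshotEndOffset, int snapshotEpoch) {
        if (logEndOffset <= snapshotEndOffset) {
            // Snapshot with an empty (or fully covered) log: resume fetching right after
            // the snapshot and report the snapshot's epoch as the last fetched epoch.
            return new FetchPosition(snapshotEndOffset + 1, snapshotEpoch);
        }
        // Otherwise the log extends past the snapshot and its own end offset/epoch are used.
        return new FetchPosition(logEndOffset, logLastEpoch);
    }
}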

>
> 5. Might need to fix the following:
>
> > Otherwise, the leader will respond with the offset and epoch of the
> latest snapshot (y, c) and with the next fetch offset and epoch (y + 1, d)
>
> We ended up renaming the next fetch offset and epoch. I think we should
> just leave it empty in this case. The snapshot offset and epoch seem
> sufficient.

Done. I made some changes to the "Handling Fetch Response" section too.


Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-24 Thread Colin McCabe
On Mon, Sep 21, 2020, at 18:13, Jun Rao wrote:
> Hi, Colin,
> 
> Sorry for the late reply. A few more comments below.
> 

Hi Jun,

Thanks for taking another look.

>
> 50. Configurations
> 50.1 controller.listeners: It seems that a controller just needs one
> listener. Why do we need to have a list here? Also, could you provide an
> example of how this is set and what's its relationship with existing
> configs such as "security.inter.broker.protocol" and "
> inter.broker.listener.name"?
>

I agree that most administrators will want to run with only one controller 
listener.  However, just as with brokers, it is nice to have the option to 
expose multiple ports.

One reason why you might want multiple ports is if you were doing a migration 
from plaintext to SSL.  You could add new SSL listeners but continue to expose 
the PLAINTEXT listeners.  Then you could add the new SSL controller config to 
each broker and roll each broker.  Then you could migrate from PLAINTEXT to SSL 
with no downtime.

Here's an example configuration for the controller:
controller.connect=0@controller0.example.com:9093,1@controller1.example.com:9093,2@controller2.example.com
controller.listeners=CONTROLLER
listeners=CONTROLLER://controller0.example.com:9093
listener.security.protocol.map=CONTROLLER:SSL

Here's an example configuration for the broker:
controller.connect=0@controller0.example.com:9093,1@controller1.example.com:9093,2@controller2.example.com
controller.connect.security.protocol=SSL

security.inter.broker.protocol or inter.broker.listener.name do not affect how 
the broker communicates with the controller.  Those configurations relate to 
how brokers communicate with each other, but the controller is not a broker.

(Note that I just added controller.connect.security.protocol to the KIP -- I 
had forgotten to put it in earlier)

>
> 50.2 registration.heartbeat.interval.ms and registration.lease.timeout.ms.
> Should we match their default value with the corresponding default for ZK?
>

Fair enough.  I'll set them to the values of the zookeeper.sync.time.ms and 
zookeeper.connection.timeout.ms configurations.  I do think we should 
experiment later on to see what works well here, but the ZK values are at least 
a starting point.

>
> 50.3 controller.connect: Could you provide an example? I am wondering how
> it differs from quorum.voters=1@kafka-1:9092, 2@kafka-2:9092, 3@kafka-3
> :9092.
>

controller.connect is intended to be the new name for quorum.voters.  During 
the vote for KIP-595, we sort of agreed to defer the discussion about what this 
configuration should be called.  I proposed this new name because it makes it 
clear what the configuration is for (how to connect to the controllers).

>
> 50.4 controller.id: I am still not sure how this is being used. Could you
> explain this in more detail?
>

The controller ID needs to be set on each controller.  It also appears in 
controller.connect as the thing before the "at" sign.  Its function is pretty 
similar to broker.id.

Broker IDs and controller IDs exist in the same ID space, so you can't have a 
broker and a controller that use the same ID.

>
> 51. BrokerHeartbeat: It seems a bit wasteful to include Listeners in every
> heartbeat request since it typically doesn't change. Could we make that an
> optional field?
>

Ok.

>
> 52. KIP-584 adds a new ZK node /features. Should we add a corresponding
> metadata record?
> 

Good point.  I added FeatureLevelRecord.

>
> 53. TopicRecord and DeleteTopic: Both DeleteTopic and TopicRecord.Deleting
> indicate topic deletion. Could we outline the flow when each will be set?
> In particular, which one indicates the intention to delete and which one
> indicates the completion of the deletion.
>

TopicRecord.Deleting is set when we intend to delete the topic.

DeleteTopic removes the topic completely.  I will rename DeleteTopic to 
RemoveTopic to make this clearer.

> 
> 54. "The controller can generate a new broker epoch by using the latest log
> offset." Which offset is that? Is it the offset of the metadata topic for
> the corresponding BrokerRecord? Is it guaranteed to be different on each
> broker restart?
> 

Yes a new broker epoch implies that there has been a new record created in 
the metadata log.  Therefore the last committed offset must be different.
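
As a rough sketch of that idea (not Kafka's actual code; the names below are
made up), the registration path could simply reuse the offset of the newly
appended record as the broker epoch:

final class BrokerEpochSketch {

    interface MetadataLog {
        // Appends a record to the metadata log and returns the offset it was assigned.
        long append(byte[] record);
    }

    static long newBrokerEpoch(MetadataLog log, byte[] brokerRegistrationRecord) {
        // Each registration appends a new record, so the returned offset, and therefore
        // the broker epoch, is strictly greater than any epoch handed out before it.
        return log.append(brokerRegistrationRecord);
    }
}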

>
> 55. "Thereafter, it may lose subsequent conflicts if its broker epoch is
> stale.  (See KIP-380 for some background on broker epoch.)  The reason for
> favoring new processes is to accommodate the common case where a process is
> killed with kill -9 and then restarted. " Are you saying that if there is
> an active broker registered in the controller, a new broker heartbeat
> request with the INITIAL state will fence the current broker session? Not
> sure about this. For example, this will allow a broker with incorrectly
> configured broker id to kill an existing valid broker.
> 

Yes, a new broker with an incorrectly configured broker id coul

Re: [VOTE] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-09-24 Thread Jason Gustafson
Hey Boyang,

Thanks for the update. This seems like the best thing we can do. The
alternative would be to always ensure that the forwarded APIs are safe for
conversion between versions, but that would restrict the flexibility that
the versioning is providing. It would also be a large effort to avoid
introducing regressions through conversion. Sadly this broadens the scope
of the IBP, but in fact forwarded APIs are inter-broker APIs.

Thanks,
Jason

On Thu, Sep 24, 2020 at 9:23 AM Boyang Chen 
wrote:

> Hey there,
>
> we spotted a necessary case to handle the redirect request versioning, and
> proposed the following changes:
>
> 1. For redirection RPCs (AlterConfig, Acl, Token etc), the corresponding
> allowed versions in the ApiVersionResponse will be affected by the entire
> cluster's versioning, not just the receiving broker, since we need to
> ensure the chosen version get properly handled by all parties. Thus from
> now on, RPC with redirection will be treated as inter-broker RPC, and any
> version bump for these RPCs has to go through IBP bump as well.
> ApiVersionResponse will take IBP into considerations for the redirection
> RPCs allowable versions.
>
> 2. We would do the best effort to maintain the same request version for
> the entire admin client -> receiving broker -> controller broker path, but
> for old RPC versions, they may not have flexible fields introduced yet.
> Thus, we would have to upgrade the RPC to the minimum version which
> supports flexible fields
> and add another tagged field in the header called `OriginalRequestVersion`
> to help the controller broker correctly deserialize the request with the
> original admin client sent out version. We would not downgrade the original
> request in any circumstance, since the flexible field support is required
> to be open-ended on the high side.
>
> Let me know if you have any questions.
>
> Best,
> Boyang
>
> On Thu, Aug 6, 2020 at 6:11 PM Boyang Chen 
> wrote:
>
> > Hey there,
> >
> > we are going to introduce a minor change to bump the version of several
> > RPCs which are currently not supporting flexible versions. It is
> necessary
> > because they need to be able to construct request header with initial
> > principal name and client id as optional fields for redirection. There are
> > only two of them:
> >
> > 1. AlterConfig
> > 2. AlterClientQuotas
> >
> > Let me know if you have any questions.
> >
> > Boyang
> >
> > On Fri, Jul 31, 2020 at 11:42 AM Boyang Chen  >
> > wrote:
> >
> >> Hey David,
> >>
> >> After discussing with Colin offline, I would like to correct one case in
> >> the described workflow, where the CLUSTER_ACTION authorization would
> not be
> >> based on the initial principal field check, because it is not a secured
> >> condition which anyone could forge. The revised workflow shall be:
> >>
> >> Step 1. Filter out resources that are authorized
> >>  1.1 Use traditional principals to verify first. If authorized,
> >> continue
> >>  1.2 If not authorized, check whether the request is from the
> >> control plane. Note that this is a best-effort to verify whether the
> >> request is internal.
> >>  1.3 If the request is not from the control plane, return
> >> authorization failure
> >>  1.4 If the request is from the control plane, use
> CLUSTER_ACTION
> >> to verify and determine the result
> >>
> >> Step 2. Check the request context to see if this is a forwarding
> request,
> >> by checking whether it is from control plane and uses extra header
> fields
> >> 2.1 if the resource is authorized, and if this is the active
> >> controller, process it
> >> 2.2 if the resource is authorized but this is not the active
> >> controller, return NOT_CONTROLLER to the sender (forwarding broker) for
> >> retry
> >> 2.3 if the resource is not authorized, return
> >> CLUSTER_AUTHORIZATION_FAILURE to propagate back to the original client
> >> through forwarding broker
> >> Step 3. If the request is not a forwarding request
> >> 3.1 If the resource is authorized, and this is the active
> >> controller, process it
> >> 3.2 If the resource is authorized, but this is not active
> >> controller, put the resource into the preparation for a new AlterConfig
> >> request for forwarding
> >> 3.3 If the resource is not authorized, reply the original client
> >> AUTHORIZATION_FAILURE when the forwarding request is returned
> >>
> >> On Thu, Jul 30, 2020 at 3:47 PM Boyang Chen  >
> >> wrote:
> >>
> >>>
> >>>
> >>> On Thu, Jul 30, 2020 at 7:18 AM David Jacot 
> wrote:
> >>>
>  Hi Boyang,
> 
>  Thanks for your answers.
> 
>  > The point for using the listener name is more of a security purpose,
>  to
>  > detect any forged request to our best effort.
>  > For throttling I think we could just check the request header for
>  > *InitialClientId* existence, to distinguish whether to apply
>  > throttling strategy as forwarded request or 

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread Jason Gustafson
Hi Justine,

Thanks for picking up this work. I have a few questions/comments:

1. It sounds like the directory structure is still going to be based on
topic names. Do I have that right? One complication is that the
LeaderAndIsr request does not include the topic name any longer. This means
that a replica must wait for the UpdateMetadata request to arrive with the
topic name mapping before it can create the directory. I wonder if it would
be simpler to avoid assumptions on the ordering of UpdateMetadata and let
LeaderAndIsr include the topic name as well. Feels like we are not saving
that much by excluding it.

2. On a related note, it seems that the reason we have the partition
metadata file is because we are not changing the directory structure. We
need it so that we remember which directories map to which topic id. I
think it would be useful to add some detail about why changing the
directory layout was rejected. Basically trying to understand how much
effort we are saving by skipping this step if we have to implement this
additional bookkeeping. One concern I have for example is that the
partition metadata file gets corrupt and then all bets are off as far as
topic consistency.

3. Is it worth adding support for topic ids in the `Produce` request now?
Overhead is mentioned as one of the motivations and the best APIs to
get savings from are `Produce` and `Fetch`.

4. When it comes to bootstrapping the metadata topic with respect to
KIP-500, one suggestion would be to reserve a sentinel topic ID which can
be used initially by new brokers. When the first controller is elected, it
can assign a topicId to the metadata topic. This is a bit similar to how we
were planning to generate the clusterId.
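
For illustration, assuming a plain java.util.UUID were used for topic IDs, a
reserved sentinel could be as simple as the all-zero UUID (the constant name
and value here are my own, not from the KIP):

import java.util.UUID;

final class MetadataTopicIdSketch {
    // Reserved sentinel used by new brokers until the first elected controller
    // assigns a real topic ID to the metadata topic.
    static final UUID SENTINEL_TOPIC_ID = new UUID(0L, 0L);

    static boolean hasAssignedTopicId(UUID topicId) {
        return topicId != null && !SENTINEL_TOPIC_ID.equals(topicId);
    }
}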

Thanks,
Jason


On Thu, Sep 24, 2020 at 11:10 AM Jun Rao  wrote:

> Hi, Justine,
>
> Thanks for the updated KIP. A few more comments below.
>
> 30.  {"name": "id", "type": "string", "doc": "version id"}}: The doc should
> say UUID.
>
> 31. LeaderAndIsrResponse v5 and StopReplicaResponse v4 : It seems there is
> no need to add topic_id at partitions level.
>
> 32. Regarding partition metadata file. Perhaps the key can be a single
> word, sth like the following.
> version: 0
> topic_id: 46bdb63f-9e8d-4a38-bf7b-ee4eb2a794e4
>
> 33. Another tricky thing that I realized is how to support the metadata
> topic introduced in KIP-595. It's a bit tricky to assign a UUID to the
> metadata topic since we have a chicken-and-egg problem. The controller
> needs to persist the UUID in the metadata topic in order to assign one
> successfully, but the metadata topic is needed to elect a controller
> (KIP-631). So, this probably needs a bit more thought.
>
> Jun
>
> On Thu, Sep 24, 2020 at 3:04 AM Ismael Juma  wrote:
>
> > Also, can we provide more details on how the Partition Metadata file will
> > be used?
> >
> > Ismael
> >
> > On Thu, Sep 24, 2020 at 3:01 AM Ismael Juma  wrote:
> >
> > > Hi Justine,
> > >
> > > I think we need to update the "Rejected Alternatives" section to take
> > into
> > > account that the proposal now removes the topic name from the fetch
> > > request. Also, if we are removing it from the Fetch request, does it
> make
> > > sense not to remove it from similar requests like ListOffsetRequest?
> > >
> > > Ismael
> > >
> > > On Thu, Sep 24, 2020 at 2:46 AM David Jacot 
> wrote:
> > >
> > >> Hi Justine,
> > >>
> > >> Thanks for the KIP. I finally had time to read it :). It is a great
> > >> improvement.
> > >>
> > >> I have a few comments/questions:
> > >>
> > >> 1. It seems that the schema of the StopReplicaRequest is slightly
> > >> outdated.
> > >> We
> > >> did some changes as part of KIP-570. V3 is already organized by
> topics.
> > >>
> > >> 2. I just want to make sure that I understand the reconciliation
> > >> logic correctly. When
> > >> an "INCREMENTAL" LeaderAndIsr Request is received, the broker will
> also
> > >> reconcile
> > >> when the local uuid does not match the uuid in the request, right? In
> > this
> > >> case, the
> > >> local log is staged for deletion.
> > >>
> > >> 3. In the documentation of the `delete.stale.topic.delay.ms` config,
> it
> > >> says "When a
> > >> FULL LeaderAndIsrRequest is received..." but I suppose it applies to
> > both
> > >> types.
> > >>
> > >> Best,
> > >> David
> > >>
> > >> On Thu, Sep 24, 2020 at 1:40 AM Justine Olshan 
> > >> wrote:
> > >>
> > >> > Hi Jun,
> > >> >
> > >> > Thanks for the comments. I apologize for some of the typos and
> > >> confusion.
> > >> > I’ve updated the KIP to fix some of the issues you mentioned.
> > >> >
> > >> > 20.2 I’ve changed the type to String.
> > >> > 20.1/3 I’ve updated the TopicZNode to fix formatting and reflect the
> > >> latest
> > >> > version before this change.
> > >> >
> > >> > 21. You are correct and I’ve removed this line. I’ve also added a
> line
> > >> > mentioning an IBP bump is necessary for migration
> > >> >
> > >> > 22. I think the wording was unclear but your summary is what was
> > >> intended
> 

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread Justine Olshan
Hi all,

Thanks for the discussion. I'm glad we are able to get our best ideas out
there.

David Jacot
1. I apologize for the incorrect information. I have fixed the KIP.
2. Yes. The difference between full and incremental is that on full we
check for the two kinds of staleness: topics on the broker that are not
contained in the request, and topics in the request that do not match the
id stored on the broker. In incremental, we can only delete in the second
scenario since not all topics are in the request.
3. Yes we should use delete.stale.topic.delay.ms in both FULL and
INCREMENTAL requests.

I’ve updated the KIP to make some of these things more clear.


Ismael
Removing the topic ID has been requested by a few in this discussion so
I’ve decided to do this. I’ve updated the KIP to reflect this new protocol
and explained the reasoning.

As for the partition metadata file: on LeaderAndIsr requests, we will check
this file for all the topic partitions included in the request. If the topic
ID in the file does not match the topic ID in the request, it implies that
the local topic partition is stale, as the topic must have been deleted to
create a new topic with a different topic ID. We will mark this topic for
deletion. I’ve updated the KIP to make this clearer in the partition
metadata file section.
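
For illustration, a minimal sketch of that check (hypothetical helper types,
not the KIP's implementation): the partition is staged for deletion whenever
the locally stored topic ID disagrees with the one in the request.

import java.util.UUID;

final class PartitionReconciliationSketch {

    interface PartitionMetadataFile {
        UUID topicId(); // the topic_id value read from the partition metadata file
    }

    static boolean shouldStageForDeletion(PartitionMetadataFile localMetadata,
                                          UUID topicIdFromLeaderAndIsr) {
        // A mismatch means the topic was deleted and re-created under a new ID,
        // so the on-disk partition belongs to the old, stale topic.
        return !localMetadata.topicId().equals(topicIdFromLeaderAndIsr);
    }
}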

Jun
I’ve updated the KIP to fix some of the points you brought up in 30, 31,
and 32. 33 will require a bit more thought, so I will get back to you on
that.

Thanks,
Justine

On Thu, Sep 24, 2020 at 11:10 AM Jun Rao  wrote:

> Hi, Justine,
>
> Thanks for the updated KIP. A few more comments below.
>
> 30.  {"name": "id", "type": "string", "doc": "version id"}}: The doc should
> say UUID.
>
> 31. LeaderAndIsrResponse v5 and StopReplicaResponse v4 : It seems there is
> no need to add topic_id at partitions level.
>
> 32. Regarding partition metadata file. Perhaps the key can be a single
> word, sth like the following.
> version: 0
> topic_id: 46bdb63f-9e8d-4a38-bf7b-ee4eb2a794e4
>
> 33. Another tricky thing that I realized is how to support the metadata
> topic introduced in KIP-595. It's a bit tricky to assign a UUID to the
> metadata topic since we have a chicken-and-egg problem. The controller
> needs to persist the UUID in the metadata topic in order to assign one
> successfully, but the metadata topic is needed to elect a controller
> (KIP-631). So, this probably needs a bit more thought.
>
> Jun
>
> On Thu, Sep 24, 2020 at 3:04 AM Ismael Juma  wrote:
>
> > Also, can we provide more details on how the Partition Metadata file will
> > be used?
> >
> > Ismael
> >
> > On Thu, Sep 24, 2020 at 3:01 AM Ismael Juma  wrote:
> >
> > > Hi Justine,
> > >
> > > I think we need to update the "Rejected Alternatives" section to take
> > into
> > > account that the proposal now removes the topic name from the fetch
> > > request. Also, if we are removing it from the Fetch request, does it
> make
> > > sense not to remove it from similar requests like ListOffsetRequest?
> > >
> > > Ismael
> > >
> > > On Thu, Sep 24, 2020 at 2:46 AM David Jacot 
> wrote:
> > >
> > >> Hi Justine,
> > >>
> > >> Thanks for the KIP. I finally had time to read it :). It is a great
> > >> improvement.
> > >>
> > >> I have a few comments/questions:
> > >>
> > >> 1. It seems that the schema of the StopReplicaRequest is slightly
> > >> outdated.
> > >> We
> > >> did some changes as part of KIP-570. V3 is already organized by
> topics.
> > >>
> > >> 2. I just want to make sure that I understand the reconciliation
> > >> logic correctly. When
> > >> an "INCREMENTAL" LeaderAndIsr Request is received, the broker will
> also
> > >> reconcile
> > >> when the local uuid does not match the uuid in the request, right? In
> > this
> > >> case, the
> > >> local log is staged for deletion.
> > >>
> > >> 3. In the documentation of the `delete.stale.topic.delay.ms` config,
> it
> > >> says "When a
> > >> FULL LeaderAndIsrRequest is received..." but I suppose it applies to
> > both
> > >> types.
> > >>
> > >> Best,
> > >> David
> > >>
> > >> On Thu, Sep 24, 2020 at 1:40 AM Justine Olshan 
> > >> wrote:
> > >>
> > >> > Hi Jun,
> > >> >
> > >> > Thanks for the comments. I apologize for some of the typos and
> > >> confusion.
> > >> > I’ve updated the KIP to fix some of the issues you mentioned.
> > >> >
> > >> > 20.2 I’ve changed the type to String.
> > >> > 20.1/3 I’ve updated the TopicZNode to fix formatting and reflect the
> > >> latest
> > >> > version before this change.
> > >> >
> > >> > 21. You are correct and I’ve removed this line. I’ve also added a
> line
> > >> > mentioning an IBP bump is necessary for migration
> > >> >
> > >> > 22. I think the wording was unclear but your summary is what was
> > >> intended
> > >> > by this line. I’ve updated to make this point more clear. “Remove
> > >> deleted
> > >> > topics from replicas by sending StopReplicaRequest V3 before the IBP
> > >> bump
> > >> > using the old logic, and using V4 and the new logic with topic IDs

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread Jun Rao
Hi, Justine,

Thanks for the updated KIP. A few more comments below.

30.  {"name": "id", "type": "string", "doc": "version id"}}: The doc should
say UUID.

31. LeaderAndIsrResponse v5 and StopReplicaResponse v4 : It seems there is
no need to add topic_id at partitions level.

32. Regarding partition metadata file. Perhaps the key can be a single
word, sth like the following.
version: 0
topic_id: 46bdb63f-9e8d-4a38-bf7b-ee4eb2a794e4

33. Another tricky thing that I realized is how to support the metadata
topic introduced in KIP-595. It's a bit tricky to assign a UUID to the
metadata topic since we have a chicken-and-egg problem. The controller
needs to persist the UUID in the metadata topic in order to assign one
successfully, but the metadata topic is needed to elect a controller
(KIP-631). So, this probably needs a bit more thought.

Jun

On Thu, Sep 24, 2020 at 3:04 AM Ismael Juma  wrote:

> Also, can we provide more details on how the Partition Metadata file will
> be used?
>
> Ismael
>
> On Thu, Sep 24, 2020 at 3:01 AM Ismael Juma  wrote:
>
> > Hi Justine,
> >
> > I think we need to update the "Rejected Alternatives" section to take
> into
> > account that the proposal now removes the topic name from the fetch
> > request. Also, if we are removing it from the Fetch request, does it make
> > sense not to remove it from similar requests like ListOffsetRequest?
> >
> > Ismael
> >
> > On Thu, Sep 24, 2020 at 2:46 AM David Jacot  wrote:
> >
> >> Hi Justine,
> >>
> >> Thanks for the KIP. I finally had time to read it :). It is a great
> >> improvement.
> >>
> >> I have a few comments/questions:
> >>
> >> 1. It seems that the schema of the StopReplicaRequest is slightly
> >> outdated.
> >> We
> >> did some changes as part of KIP-570. V3 is already organized by topics.
> >>
> >> 2. I just want to make sure that I understand the reconciliation
> >> logic correctly. When
> >> an "INCREMENTAL" LeaderAndIsr Request is received, the broker will also
> >> reconcile
> >> when the local uuid does not match the uuid in the request, right? In
> this
> >> case, the
> >> local log is staged for deletion.
> >>
> >> 3. In the documentation of the `delete.stale.topic.delay.ms` config, it
> >> says "When a
> >> FULL LeaderAndIsrRequest is received..." but I suppose it applies to
> both
> >> types.
> >>
> >> Best,
> >> David
> >>
> >> On Thu, Sep 24, 2020 at 1:40 AM Justine Olshan 
> >> wrote:
> >>
> >> > Hi Jun,
> >> >
> >> > Thanks for the comments. I apologize for some of the typos and
> >> confusion.
> >> > I’ve updated the KIP to fix some of the issues you mentioned.
> >> >
> >> > 20.2 I’ve changed the type to String.
> >> > 20.1/3 I’ve updated the TopicZNode to fix formatting and reflect the
> >> latest
> >> > version before this change.
> >> >
> >> > 21. You are correct and I’ve removed this line. I’ve also added a line
> >> > mentioning an IBP bump is necessary for migration
> >> >
> >> > 22. I think the wording was unclear but your summary is what was
> >> intended
> >> > by this line. I’ve updated to make this point more clear. “Remove
> >> deleted
> >> > topics from replicas by sending StopReplicaRequest V3 before the IBP
> >> bump
> >> > using the old logic, and using V4 and the new logic with topic IDs
> after
> >> > the IBP bump.”
> >> >
> >> > 23. I’ve removed the unspecified type from the KIP and mention that
> IBP
> >> > will be used to handle this request. “IBP will be used to determine
> >> whether
> >> > this new form of the request will be used. For older requests, we will
> >> > ignore this field and default to previous behavior.”
> >> >
> >> > 24. I’ve fixed this typo.
> >> >
> >> > 25. I've added a topics at a higher level for LeaderAndIsrResponse v5,
> >> > StopReplicaResponse v4. I've also changed StopReplicaRequest v4 to
> have
> >> > topics at a higher level.
> >> >
> >> > 26. I’ve updated forgotten_topics_data--added the topic ID and removed
> >> the
> >> > topic name
> >> >
> >> > 27. I’ve decided on plain text, and I’ve added an example.
> >> >
> >> > 28. I’ve added this idea to future work.
> >> >
> >> > Thanks again for taking a look,
> >> >
> >> > Justine
> >> >
> >> > On Wed, Sep 23, 2020 at 10:28 AM Jun Rao  wrote:
> >> >
> >> > > Hi, Justine,
> >> > >
> >> > > Thanks for the response. Made another pass. A few more comments
> below.
> >> > >
> >> > > 20. znode schema:
> >> > > 20.1 It seems that {"name": "version", "type": "int", "id": "UUID",
> >> > "doc":
> >> > > "version id"} should be {"name": "version", "type": "int"}, {"name":
> >> > "id",
> >> > > "type": "UUID", "doc": "version id"}.
> >> > > 20.2 The znode format is JSON which doesn't have UUID type. So the
> >> type
> >> > > probably should be string?
> >> > > 20.3 Also, the existing format used seems outdated. It should have
> the
> >> > > following format.
> >> > > Json.encodeAsBytes(Map(
> >> > >   "version" -> 2,
> >> > >   "partitions" -> replicaAssignmentJson.asJava,
> >> > >   "adding_replicas" -> addingRep

Re: [DISCUSS] Apache Kafka 2.7.0 release

2020-09-24 Thread Bill Bejeck
Hi All,

Just a reminder that the KIP freeze is next Wednesday, September 30th.  Any
KIP aiming to go in the 2.7.0 release needs to be accepted by this date.

Thanks,
BIll

On Tue, Sep 22, 2020 at 12:11 PM Bill Bejeck  wrote:

> Boyan,
>
> Done. Thanks for the heads up.
>
> -Bill
>
> On Mon, Sep 21, 2020 at 6:36 PM Boyang Chen 
> wrote:
>
>> Hey Bill,
>>
>> unfortunately KIP-590 will not be in 2.7 release, could you move it to
>> postponed KIPs?
>>
>> Best,
>> Boyang
>>
>> On Thu, Sep 10, 2020 at 2:41 PM Bill Bejeck  wrote:
>>
>> > Hi Gary,
>> >
>> > It's been added.
>> >
>> > Regards,
>> > Bill
>> >
>> > On Thu, Sep 10, 2020 at 4:14 PM Gary Russell 
>> wrote:
>> >
>> > > Can someone add a link to the release plan page [1] to the Future
>> > Releases
>> > > page [2]?
>> > >
>> > > I have the latter bookmarked.
>> > >
>> > > Thanks.
>> > >
>> > > [1]:
>> > >
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158872629
>> > > [2]:
>> > https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan
>> > > 
>> > > From: Bill Bejeck 
>> > > Sent: Wednesday, September 9, 2020 4:35 PM
>> > > To: dev 
>> > > Subject: Re: [DISCUSS] Apache Kafka 2.7.0 release
>> > >
>> > > Hi Dongjin,
>> > >
>> > > I've moved both KIPs to the release plan.
>> > >
>> > > Keep in mind the cutoff for KIP acceptance is September 30th. If the
>> KIP
>> > > discussions are completed, I'd recommend starting a vote for them.
>> > >
>> > > Regards,
>> > > Bill
>> > >
>> > > On Wed, Sep 9, 2020 at 8:39 AM Dongjin Lee 
>> wrote:
>> > >
>> > > > Hi Bill,
>> > > >
>> > > > Could you add the following KIPs to the plan?
>> > > >
>> > > > - KIP-508: Make Suppression State Queriable
>> > > > <
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-508%3A+Make+Suppression+State+Queriable
>> > > > >
>> > > > - KIP-653: Upgrade log4j to log4j2
>> > > > <
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-653%3A+Upgrade+log4j+to+log4j2
>> > > > >
>> > > >
>> > > > Both KIPs are completely implemented with passing all tests, but not
>> > got
>> > > > reviewed by the committers. Could anyone have a look?
>> > > >
>> > > > Thanks,
>> > > > Dongjin
>> > > >
>> > > > On Wed, Sep 9, 2020 at 8:38 AM Leah Thomas 
>> > wrote:
>> > > >
>> > > > > Hi Bill,
>> > > > >
>> > > > > Could you also add KIP-450 to the release plan? It's been merged.
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-450%3A+Sliding+Window+Aggregations+in+the+DSL
>> > > > >
>> > > > > Cheers,
>> > > > > Leah
>> > > > >
>> > > > > On Tue, Sep 8, 2020 at 9:32 AM Bill Bejeck 
>> > wrote:
>> > > > >
>> > > > > > Hi Bruno,
>> > > > > >
>> > > > > > Thanks for letting me know, I've added KIP-662 to the release
>> plan.
>> > > > > >
>> > > > > > -Bill
>> > > > > >
>> > > > > > On Mon, Sep 7, 2020 at 11:33 AM Bruno Cadonna <
>> br...@confluent.io>
>> > > > > wrote:
>> > > > > >
>> > > > > > > Hi Bill,
>> > > > > > >
>> > > > > > > Could you add KIP-662 [1] to the release plan. The KIP has
>> been
>> > > > already
>> > > > > > > implemented.
>> > > > > > >
>> > > > > > > Best,
>> > > > > > > Bruno
>> > > > > > >
>> > > > > > >
>> > > > > > > [1]
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-662%3A+Throw+Exception+when+Source+Topics+of+a+Streams+App+are+Deleted
>> > > > > > >
>> > > > > > > On 26.08.20 16:54, Bill Bejeck wrote:
>> > > > > > > > Greetings All!
>> > > > > > > >
>> > > > > > > > I've published a release plan at
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158872629

Re: [VOTE] KIP-478 Strongly Typed Streams Processor API

2020-09-24 Thread Matthias J. Sax
Interesting proposal. However, I am not totally convinced, because I see
a fundamental difference between "data" and "metadata".

Topic/partition/offset are "metadata" in the strong sense and they are
immutable.

On the other hand there is "primary" data like key and value, as well as
"secondary" data like timestamp and headers. The issue seems to be that we
treat "secondary" data more like metadata atm?

Thus, promoting timestamp and headers into a first-class citizen role
makes sense to me (my original proposal about `RecordContext` would still
fall short in this regard). However, putting both (data and metadata)
into a `Record` abstraction might go too far?

I am also a little bit concerned about `Record.copy()` because it might
be a trap: Users might assume it does a full deep copy of the record,
however, it would not. It would only create a new `Record` object as
wrapper that points to the same key/value/header objects as the input
record.

With the current `context.forward(key, value)` we don't have this "deep
copy" issue -- it's pretty clear what is happening.

Instead of `To.all().withTimestamp()` we could also add
`context.forward(key, value, timestamp)` etc. (just wondering about the
explosion in overloads)?

Also, `Record.withValue` etc sounds odd? Should a record not be
immutable? So, we could have something like

`RecordFactory.withKeyValue(...).withTimestamp(...).withHeaders(...).build()`.
But it looks rather verbose?

The other question is, of course, to what extent do we want to keep the
distinction between "primary" and "secondary" data? To me, it's a
question of ease of use?

Just putting all this out to move the discussion forward. Don't have a
concrete proposal atm.


-Matthias


On 9/14/20 9:24 AM, John Roesler wrote:
> Thanks for this thought, Matthias!
> 
> To be honest, it's bugged me quite a bit that _all_ the
> record information hasn't been an argument to `process`. I
> suppose I was trying to be conservative in this proposal,
> but then again, if we're adding new Processor and
> ProcessorContext interfaces, then this is the time to make
> such a change.
> 
> To be unambiguous, I think this is what we're talking about:
> ProcessorContext:
> * applicationId
> * taskId
> * appConfigs
> * appConfigsWithPrefix
> * keySerde
> * valueSerde
> * stateDir
> * metrics
> * schedule
> * commit
> * forward
> 
> StateStoreContext:
> * applicationId
> * taskId
> * appConfigs
> * appConfigsWithPrefix
> * keySerde
> * valueSerde
> * stateDir
> * metrics
> * register
> 
> 
> RecordContext
> * topic
> * partition
> * offset
> * timestamp
> * headers
> 
> 
> Your proposal sounds good to me as-is. Just to cover the
> bases, though, I'm wondering if we should push the idea just
> a little farther. Instead of decomposing key,value,context,
> we could just keep them all in one object, like this:
> 
> Record:
> * key
> * value
> * topic
> * partition
> * offset
> * timestamp
> * headers
> 
> Then, we could have:
> Processor#process(Record)
> ProcessorContext#forward(Record, To)
> 
> Viewed from this perspective, a record has three properties
> that people may specify in their processors: key, value, and
> timestamp.
> 
> We could deprecate `To#withTimestamp` and enable people to
> specify the timestamp along with the key and value when they
> forward a record.
> 
> E.g.,
> RecordBuilder toForward = RecordBuilder.copy(record)
> toForward.withKey(newKey)
> toForward.withValue(newValue)
> toForward.withTimestamp(newTimestamp)
> Record newRecord = toForward.build()
> context.forward(newRecord, To.child("child1"))
> 
> Or, the more compact common case:
> current:
>  context.forward(key, "newValue")
> proposed:
>  context.forward(copy(record).withValue("newValue").build())
> 
> 
> It's slightly more verbose, but also more extensible. This
> would give us a clean path to add header support in PAPI as
> well, simply by adding `withHeaders` in RecordBuilder.
> 
> It's also more symmetrical, since the recipient of `forward`
> would just get the sent `Record`. Whereas today, the sender
> puts the timestamp in `To`, but the recipient gets in in its
> own `ProcessorContext`.
> 
> WDYT?
> -John
> 
> On Fri, 2020-09-11 at 12:30 -0700, Matthias J. Sax wrote:
>> I think separating the different contexts make sense.
>>
>> In fact, we could even go one step further and remove the record context
>> from the processor context completely and we add a third parameter to
>> `process(key, value, recordContext)`. This would make it clear that the
>> context is for the input record only and it's not possible to pass it to
>> a `punctuate` callback.
>>
>> For the stores and changelogging: I think there are two cases. (1) You
>> use a plain key-value store. For this case, it seems you do not care
>> about the timestamp and thus does not care what timestamp is set in the
>> changelog records. (We can set anything we want, as it's not relevant at
>> all -- the timestamp is ignored on read anyway.) (2) The other case is,
>> that one does care

Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #85

2020-09-24 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-9627: Replace ListOffset request/response with automated 
protocol (#8295)

[github] MINOR: clarify variables for skipping idempotent source updates (#9316)


--
[...truncated 3.32 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTi

Re: [VOTE] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-09-24 Thread Boyang Chen
Hey there,

we spotted a necessary case to handle the redirect request versioning, and
proposed the following changes:

1. For redirection RPCs (AlterConfig, Acl, Token etc), the corresponding
allowed versions in the ApiVersionResponse will be affected by the entire
cluster's versioning, not just the receiving broker, since we need to
ensure the chosen version gets properly handled by all parties. Thus from
now on, RPCs with redirection will be treated as inter-broker RPCs, and any
version bump for these RPCs has to go through an IBP bump as well.
ApiVersionResponse will take IBP into consideration for the redirection
RPCs' allowable versions.

2. We would make a best effort to maintain the same request version for
the entire admin client -> receiving broker -> controller broker path, but
old RPC versions may not have flexible fields introduced yet. Thus, we
would have to upgrade the RPC to the minimum version which supports
flexible fields and add another tagged field in the header called
`OriginalRequestVersion` to help the controller broker correctly
deserialize the request with the version originally sent by the admin
client. We would not downgrade the original request in any circumstance,
since flexible field support is required to be open-ended on the high side.
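
As a hedged sketch of that rule (illustrative names only, not the actual
implementation), the receiving broker could pick the forwarding version like
this: keep the client's version whenever it already supports flexible fields,
otherwise upgrade to the lowest flexible version while recording the client's
version for the tagged `OriginalRequestVersion` header field.

final class ForwardingVersionSketch {

    static final class ForwardingVersion {
        final short wireVersion;            // version used on the broker -> controller hop
        final short originalRequestVersion; // carried in the tagged OriginalRequestVersion field

        ForwardingVersion(short wireVersion, short originalRequestVersion) {
            this.wireVersion = wireVersion;
            this.originalRequestVersion = originalRequestVersion;
        }
    }

    static ForwardingVersion forClientVersion(short clientVersion, short lowestFlexibleVersion) {
        // Never downgrade: only upgrade far enough to gain flexible-field (tagged header)
        // support, and remember the version the admin client actually sent.
        short wire = clientVersion >= lowestFlexibleVersion ? clientVersion : lowestFlexibleVersion;
        return new ForwardingVersion(wire, clientVersion);
    }
}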

Let me know if you have any questions.

Best,
Boyang

On Thu, Aug 6, 2020 at 6:11 PM Boyang Chen 
wrote:

> Hey there,
>
> we are going to introduce a minor change to bump the version of several
> RPCs which are currently not supporting flexible versions. It is necessary
> because they need to be able to construct request header with initial
> principal name and client id as optional fields for redirection. There are
> only two of them:
>
> 1. AlterConfig
> 2. AlterClientQuotas
>
> Let me know if you have any questions.
>
> Boyang
>
> On Fri, Jul 31, 2020 at 11:42 AM Boyang Chen 
> wrote:
>
>> Hey David,
>>
>> After discussing with Colin offline, I would like to correct one case in
>> the described workflow, where the CLUSTER_ACTION authorization would not be
>> based on the initial principal field check, because it is not a secured
>> condition which anyone could forge. The revised workflow shall be:
>>
>> Step 1. Filter out resources that are authorized
>>  1.1 Use traditional principals to verify first. If authorized,
>> continue
>>  1.2 If not authorized, check whether the request is from the
>> control plane. Note that this is a best-effort to verify whether the
>> request is internal.
>>  1.3 If the request is not from the control plane, return
>> authorization failure
>>  1.4 If the request is from the control plane, use CLUSTER_ACTION
>> to verify and determine the result
>>
>> Step 2. Check the request context to see if this is a forwarding request,
>> by checking whether it is from control plane and uses extra header fields
>> 2.1 if the resource is authorized, and if this is the active
>> controller, process it
>> 2.2 if the resource is authorized but this is not the active
>> controller, return NOT_CONTROLLER to the sender (forwarding broker) for
>> retry
>> 2.3 if the resource is not authorized, return
>> CLUSTER_AUTHORIZATION_FAILURE to propagate back to the original client
>> through forwarding broker
>> Step 3. If the request is not a forwarding request
>> 3.1 If the resource is authorized, and this is the active
>> controller, process it
>> 3.2 If the resource is authorized, but this is not active
>> controller, put the resource into the preparation for a new AlterConfig
>> request for forwarding
>> 3.3 If the resource is not authorized, reply the original client
>> AUTHORIZATION_FAILURE when the forwarding request is returned
>>
>> On Thu, Jul 30, 2020 at 3:47 PM Boyang Chen 
>> wrote:
>>
>>>
>>>
>>> On Thu, Jul 30, 2020 at 7:18 AM David Jacot  wrote:
>>>
 Hi Boyang,

 Thanks for your answers.

 > The point for using the listener name is more of a security purpose,
 to
 > detect any forged request to our best effort.
 > For throttling I think we could just check the request header for
 > *InitialClientId* existence, to distinguish whether to apply
 > throttling strategy as forwarded request or direct request.

 Reading "security" and "best effort" in the same sentence makes me a
 little nervous :).

 The identification issue is also valid for quota as we don't want one
 to be
 able to bypass the quota by forging a request as well, isn't it?
 Otherwise,
 anyone could just set the InitialPrincipal to bypass it. I think that we
 should
 only use InitialPrincipal and/or InitialClientId when we know that they
 come
 from another broker. Based on what I read in the KIP, it looks like we
 could
 only use them when the principal has CLUSTER_ACTION privilege. Do I
 understand it correctly?

 There is no 100% safe way to distinguish between raw

[jira] [Created] (KAFKA-10524) Manpage help for kafka-configs.sh does not fully explain broker-loggers

2020-09-24 Thread Daniel Laing (Jira)
Daniel Laing created KAFKA-10524:


 Summary: Manpage help for kafka-configs.sh does not fully explain 
broker-loggers
 Key: KAFKA-10524
 URL: https://issues.apache.org/jira/browse/KAFKA-10524
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.2.0, 2.3.0, 2.2.1, 2.4.0
Reporter: Daniel Laing
Assignee: Viktor Somogyi-Vass


Please see the example app to reproduce the issue: 
https://github.com/gaborgsomogyi/kafka-topic-stress

ZKUtils is deprecated from Kafka version 2.0.0 but there is no real alternative.
* deleteTopics doesn't wait
* ZookeeperClient has "private [kafka]" visibility




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10523) Allow to provide producer ID

2020-09-24 Thread Luigi Berrettini (Jira)
Luigi Berrettini created KAFKA-10523:


 Summary: Allow to provide producer ID
 Key: KAFKA-10523
 URL: https://issues.apache.org/jira/browse/KAFKA-10523
 Project: Kafka
  Issue Type: New Feature
Reporter: Luigi Berrettini


I read about the implementation of idempotence and saw that it is only 
guaranteed within a producer session, since it depends on a PID reassigned 
every time the producer (re)starts.

The PID is probably assigned relying on ZooKeeper, but I was wondering if it 
could be possible to support providing a PID externally to gain idempotence 
across restarts, e.g. having the producing application read the PID from a 
configuration file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10522) Duplicate detection and max.in.flight.requests.per.connection details

2020-09-24 Thread Luigi Berrettini (Jira)
Luigi Berrettini created KAFKA-10522:


 Summary: Duplicate detection and 
max.in.flight.requests.per.connection details
 Key: KAFKA-10522
 URL: https://issues.apache.org/jira/browse/KAFKA-10522
 Project: Kafka
  Issue Type: Wish
Reporter: Luigi Berrettini


I was looking at [https://github.com/apache/kafka/pull/3743] and I was 
wondering if you could help me understand it better.

I saw that the 
[Sender|https://github.com/apache/kafka/blob/2.6.0/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L602]
 checks for a {{Errors.DUPLICATE_SEQUENCE_NUMBER}} but I was not able to find 
where this error is triggered on the server side.

It seems to me that duplicate detection relies on checking if the sequence 
number is more than {{lastPersistedSeq + 1}}.

If this is the case:
 * why storing the metadata for the last batches and not just relying on the 
sequence number of the last message persisted in the log?
 * why limiting {{max.in.flight.requests.per.connection}} to a maximum value of 
5 if duplicates are still detected when metadata is not found (and therefore 
with any number of max in flights)?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Why should SASL principal be unchanged upon reauth

2020-09-24 Thread Gokul Ramanan Subramanian
Hi.

I was looking through Kafka code and found that SASL KafkaPrincipals are
not supposed to change upon reauthentication, and if they do, the broker
will kill the TCP connection.

What is the reasoning behind this limitation?

Thanks.


[VOTE] KIP-661: Expose task configurations in Connect REST API

2020-09-24 Thread Mickael Maison
Hi,

I'd like to start a vote on KIP-661:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-661%3A+Expose+task+configurations+in+Connect+REST+API

Thanks


Re: [DISCUSS] KIP-671: Shutdown Streams Application when appropriate exception is thrown

2020-09-24 Thread John Roesler
Hello all,

Thanks for bringing this up, Bruno. It’s a really good point that a 
disconnected node would miss the signal and then resurrect a single-node 
“zombie cluster” when it reconnects.

Offhand, I can’t think of a simple and reliable way to distinguish this case 
from one in which an operator starts a node manually after a prior shutdown 
signal. Can you? Right now, I’m inclined to agree with Walker that we should 
leave this as a problem for the future. 

It should certainly be mentioned in the kip, and it also deserves special 
mention in our javadoc and html docs for this feature. 

Thanks!
John

On Wed, Sep 23, 2020, at 17:49, Walker Carlson wrote:
> Bruno,
> 
> I think that we can't guarantee that the message will get
> propagated perfectly in every case of, say network partitioning, though it
> will work for many cases. So I would say it's best effort and I will
> mention it in the kip.
> 
> As for when to use it I think we can discuss if this will be
> sufficient when we come to it, as long as we document its capabilities.
> 
> I hope this answers your question,
> 
> Walker
> 
> On Tue, Sep 22, 2020 at 12:33 AM Bruno Cadonna  wrote:
> 
> > Walker,
> >
> > I am sorry, but I still have a comment on the KIP although you have
> > already started voting.
> >
> > What happens when a consumer of the group skips the rebalancing that
> > propagates the shutdown request? Do you give a guarantee that all Kafka
> > Streams clients are shutdown or is it best effort? If it is best effort,
> > I guess the proposed method might not be used in critical cases where
> > stopping record consumption may prevent or limit damage. I am not saying
> > that it must be a guarantee, but this question should be answered in the
> > KIP, IMO.
> >
> > Best,
> > Bruno
> >
> > On 22.09.20 01:14, Walker Carlson wrote:
> > > The error code right now is the assignor error, 2 is coded for shutdown
> > > but it could be expanded to encode the causes or for other errors that
> > need
> > > to be communicated. For example we can add error code 3 to close the
> > thread
> > > but leave the client in an error state if we choose to do so in the
> > future.
> > >
> > > On Mon, Sep 21, 2020 at 3:43 PM Boyang Chen 
> > > wrote:
> > >
> > >> Thanks for the KIP Walker.
> > >>
> > >> In the KIP we mentioned "In order to communicate the shutdown request
> > from
> > >> one client to the others we propose to update the SubcriptionInfoData to
> > >> include a short field which will encode an error code.", is there a
> > >> dedicated error code that we should define here, or it is case-by-case?
> > >>
> > >> On Mon, Sep 21, 2020 at 1:38 PM Walker Carlson 
> > >> wrote:
> > >>
> > >>> I am changing the name to "Add method to Shutdown entire Streams
> > >>> Application" since we are no longer using an Exception, it seems more
> > >>> appropriate.
> > >>>
> > >>> Also it looks like the discussion is pretty much finished so I will be
> > >>> calling it to a vote.
> > >>>
> > >>> thanks,
> > >>> Walker
> > >>>
> > >>> On Sat, Sep 19, 2020 at 7:49 PM Guozhang Wang 
> > >> wrote:
> > >>>
> >  Sounds good to me. I also feel that this call should be non-blocking, but I
> >  guess I was confused by the discussion thread suggesting that the API is
> >  designed in a blocking fashion, which contradicts my perspective, and hence
> >  I asked for clarification :)
> > 
> >  Guozhang
> > 
> > 
> >  On Wed, Sep 16, 2020 at 12:32 PM Walker Carlson <
> > wcarl...@confluent.io
> > >>>
> >  wrote:
> > 
> > > Hello Guozhang,
> > >
> > > As for the logging, I plan on having three logs. First, the client logs
> > > that it is requesting an application shutdown; second, the leader logs the
> > > processId of the invoker; third, the StreamRebalanceListener logs that it
> > > is closing because of a `stream.appShutdown`. Hopefully this will be enough
> > > to make the cause of the close clear.
> > >
> > > I see what you mean about the name being dependent on the behavior of the
> > > method, so I will try to clarify. This is how I currently envision the call
> > > working.
> > >
> > > It is not an option to directly initiate a shutdown through a StreamThread
> > > object from a KafkaStreams object, because "KafkaConsumer is not safe for
> > > multi-threaded access". So how it works is that the method in KafkaStreams
> > > finds the first alive thread and sets a flag in the StreamThread. The
> > > StreamThread will receive the flag in its run loop, then set the error code
> > > and trigger a rebalance; afterwards it will stop processing. After
> > > KafkaStreams has set the flag, it will return true and continue running. If
> > > there are no alive threads, the shutdown will fail and return false.
>
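
To make the flow described in the thread above concrete, here is a minimal,
illustrative Java sketch of the flag-based shutdown path: KafkaStreams only sets
a flag on one alive thread, and the thread itself sets the error code and
triggers the rebalance. All class, method, and field names below are
assumptions made for illustration; they are not the actual KIP-671 API or the
Kafka Streams internals, and the only detail taken from the thread is that
error code 2 means "shut down the application".

import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownFlowSketch {

    // Hypothetical assignor error codes; the thread above mentions 2 = shutdown.
    static final int NONE = 0;
    static final int SHUTDOWN_APPLICATION = 2;

    /** KafkaStreams side: flag the first alive thread; return false if none is alive. */
    public boolean requestApplicationShutdown(List<ThreadSketch> threads) {
        for (ThreadSketch t : threads) {
            if (t.alive) {
                // Only set a flag; the thread's consumer is never touched from here,
                // because KafkaConsumer is not safe for multi-threaded access.
                t.shutdownRequested.set(true);
                return true;
            }
        }
        return false; // no alive threads: the shutdown request fails
    }

    /** StreamThread side: the run loop reacts to the flag. */
    static class ThreadSketch {
        volatile boolean alive = true;
        final AtomicBoolean shutdownRequested = new AtomicBoolean(false);
        final AtomicInteger assignorErrorCode = new AtomicInteger(NONE);

        void runOnce() {
            if (shutdownRequested.get()) {
                assignorErrorCode.set(SHUTDOWN_APPLICATION); // carried in subscription metadata
                triggerRebalance();   // the rebalance propagates the code to all members
                stopProcessing();     // this thread then stops processing records
                return;
            }
            // ... normal poll / process / commit cycle ...
        }

        void triggerRebalance() { /* enforce a rebalance so other clients see the code */ }
        void stopProcessing()   { /* leave the processing loop */ }
    }
}

Because the propagation rides on the next rebalance, a member that misses that
rebalance (the "zombie cluster" case above) would not see the code, which is why
the behavior is best effort.
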

Jenkins build is back to normal : Kafka » kafka-trunk-jdk11 #85

2020-09-24 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : Kafka » kafka-trunk-jdk15 #85

2020-09-24 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-9629) Replace Fetch request/response with automated protocol

2020-09-24 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-9629.
---
Fix Version/s: 2.7.0
   Resolution: Fixed

> Replace Fetch request/response with automated protocol
> --
>
> Key: KAFKA-9629
> URL: https://issues.apache.org/jira/browse/KAFKA-9629
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: David Arthur
>Priority: Major
> Fix For: 2.7.0
>
>






[jira] [Created] (KAFKA-10521) Remove ZK watch for completing partition reassignment

2020-09-24 Thread David Arthur (Jira)
David Arthur created KAFKA-10521:


 Summary: Remove ZK watch for completing partition reassignment
 Key: KAFKA-10521
 URL: https://issues.apache.org/jira/browse/KAFKA-10521
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 2.7.0
Reporter: David Arthur


This is a follow-on from KAFKA-8836.

Currently we have a ZK watch on the partition "/state" znode which fires a 
handler in the controller to check whether a reassignment can be completed due to 
replicas catching up with the leader. This handler is 
KafkaController#processPartitionReassignmentIsrChange.

Following the change to updating ISR with the new AlterIsr RPC, we would like 
to remove this ZK watch and replace it with a direct call when writing out a 
new ISR in the controller.
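
As a rough illustration of the intended shape of the change: today a ZK watch
fires the completion check; afterwards the controller, which itself writes the
new ISR when handling AlterIsr, would invoke the check directly. The sketch
below is Java-flavored pseudocode only (the real controller is Scala), and
every name except processPartitionReassignmentIsrChange is invented for the
illustration.

import java.util.List;

class ReassignmentCompletionSketch {
    // Before (conceptually): a ZK watch on .../partitions/<p>/state invokes
    // processPartitionReassignmentIsrChange whenever the ISR znode changes.
    //
    // After: the controller writes the new ISR itself (driven by AlterIsr),
    // so it can run the completion check directly, with no ZK watch at all.
    void onAlterIsr(String topic, int partition, List<Integer> newIsr) {
        persistNewIsr(topic, partition, newIsr);                  // write the new ISR state
        processPartitionReassignmentIsrChange(topic, partition);  // direct call replaces the watch
    }

    void persistNewIsr(String topic, int partition, List<Integer> newIsr) { /* ... */ }
    void processPartitionReassignmentIsrChange(String topic, int partition) {
        /* check whether the reassignment can now be completed */
    }
}
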





Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #84

2020-09-24 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Use JUnit 5 in raft module (#9331)

[github] MINOR: Remove `zipWithIndex` to avoid tuple allocation in hot path in 
`LogValidator` (#9206)


--
[...truncated 3.32 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualFo

[jira] [Created] (KAFKA-10520) InitProducerId may be blocked if least loaded node is not ready to send

2020-09-24 Thread Rajini Sivaram (Jira)
Rajini Sivaram created KAFKA-10520:
--

 Summary: InitProducerId may be blocked if least loaded node is not 
ready to send
 Key: KAFKA-10520
 URL: https://issues.apache.org/jira/browse/KAFKA-10520
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Reporter: Rajini Sivaram
 Fix For: 2.7.0


From the logs of a failing producer that shows InitProducerId timing out after 
request timeout, it looks like we don't poll while waiting for transactional 
producer to be initialized and FindCoordinator request cannot be sent. The 
producer configuration used one bootstrap server and 
`max.in.flight.requests.per.connection=1`. The failing sequence:

 1. Producer sends MetadataRequest to least loaded node (bootstrap server)
 2. Producer is ready to send InitProducerId, needs to find transaction 
coordinator
 3. Producer creates FindCoordinator request, but the only node known is the 
bootstrap server. Producer cannot send to this node since there is already the 
Metadata request in flight and max.inflight is 1.
 4. Producer waits without polling, so Metadata response is not processed. 
InitProducerId times out eventually.

We need to update the condition used to determine whether Sender should poll() 
to fix this issue.
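
For reference, a minimal Java producer setup matching the configuration
described above (a single bootstrap server, one in-flight request per
connection, and a transactional producer that must send InitProducerId). The
broker address and transactional id are placeholders, and this is only an
illustrative sketch of the scenario, not code taken from the report.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class InitProducerIdScenario {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Single bootstrap server: the only node known before metadata arrives.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Only one in-flight request per connection, as in the failing configuration.
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);
        // A transactional producer must send FindCoordinator and InitProducerId.
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "example-txn-id");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // initTransactions() drives FindCoordinator + InitProducerId; in the
            // sequence described above it can block until the request timeout.
            producer.initTransactions();
        }
    }
}
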





Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread Ismael Juma
Also, can we provide more details on how the Partition Metadata file will
be used?

Ismael

On Thu, Sep 24, 2020 at 3:01 AM Ismael Juma  wrote:

> Hi Justine,
>
> I think we need to update the "Rejected Alternatives" section to take into
> account that the proposal now removes the topic name from the fetch
> request. Also, if we are removing it from the Fetch request, does it make
> sense not to remove it from similar requests like ListOffsetRequest?
>
> Ismael
>
> On Thu, Sep 24, 2020 at 2:46 AM David Jacot  wrote:
>
>> Hi Justine,
>>
>> Thanks for the KIP. I finally had time to read it :). It is a great
>> improvement.
>>
>> I have a few comments/questions:
>>
>> 1. It seems that the schema of the StopReplicaRequest is slightly
>> outdated.
>> We
>> did some changes as part of KIP-570. V3 is already organized by topics.
>>
>> 2. I just want to make sure that I understand the reconciliation
>> logic correctly. When
>> an "INCREMENTAL" LeaderAndIsr Request is received, the broker will also
>> reconcile
>> when the local uuid does not match the uuid in the request, right? In this
>> case, the
>> local log is staged for deletion.
>>
>> 3. In the documentation of the `delete.stale.topic.delay.ms` config, it
>> says "When a
>> FULL LeaderAndIsrRequest is received..." but I suppose it applies to both
>> types.
>>
>> Best,
>> David
>>
>> On Thu, Sep 24, 2020 at 1:40 AM Justine Olshan 
>> wrote:
>>
>> > Hi Jun,
>> >
>> > Thanks for the comments. I apologize for some of the typos and
>> confusion.
>> > I’ve updated the KIP to fix some of the issues you mentioned.
>> >
>> > 20.2 I’ve changed the type to String.
>> > 20.1/3 I’ve updated the TopicZNode to fix formatting and reflect the
>> latest
>> > version before this change.
>> >
>> > 21. You are correct and I’ve removed this line. I’ve also added a line
>> > mentioning an IBP bump is necessary for migration
>> >
>> > 22. I think the wording was unclear but your summary is what was
>> intended
>> > by this line. I’ve updated to make this point more clear. “Remove
>> deleted
>> > topics from replicas by sending StopReplicaRequest V3 before the IBP
>> bump
>> > using the old logic, and using V4 and the new logic with topic IDs after
>> > the IBP bump.”
>> >
>> > 23. I’ve removed the unspecified type from the KIP and mention that IBP
>> > will be used to handle this request. “IBP will be used to determine
>> whether
>> > this new form of the request will be used. For older requests, we will
>> > ignore this field and default to previous behavior.”
>> >
>> > 24. I’ve fixed this typo.
>> >
>> > 25. I've added topics at a higher level for LeaderAndIsrResponse v5,
>> > StopReplicaResponse v4. I've also changed StopReplicaRequest v4 to have
>> > topics at a higher level.
>> >
>> > 26. I’ve updated forgotten_topics_data--added the topic ID and removed
>> the
>> > topic name
>> >
>> > 27. I’ve decided on plain text, and I’ve added an example.
>> >
>> > 28. I’ve added this idea to future work.
>> >
>> > Thanks again for taking a look,
>> >
>> > Justine
>> >
>> > On Wed, Sep 23, 2020 at 10:28 AM Jun Rao  wrote:
>> >
>> > > Hi, Justine,
>> > >
>> > > Thanks for the response. Made another pass. A few more comments below.
>> > >
>> > > 20. znode schema:
>> > > 20.1 It seems that {"name": "version", "type": "int", "id": "UUID",
>> > "doc":
>> > > "version id"} should be {"name": "version", "type": "int"}, {"name":
>> > "id",
>> > > "type": "UUID", "doc": "version id"}.
>> > > 20.2 The znode format is JSON which doesn't have UUID type. So the
>> type
>> > > probably should be string?
>> > > 20.3 Also, the existing format used seems outdated. It should have the
>> > > following format.
>> > > Json.encodeAsBytes(Map(
>> > >   "version" -> 2,
>> > >   "partitions" -> replicaAssignmentJson.asJava,
>> > >   "adding_replicas" -> addingReplicasAssignmentJson.asJava,
>> > >   "removing_replicas" -> removingReplicasAssignmentJson.asJava
>> > > ).asJava)
>> > >   }
>> > >
>> > > 21. Migration: The KIP says "The migration process can take place
>> without
>> > > an inter-broker protocol bump, as the format stored in
>> > > /brokers/topics/[topic] will be compatible with older broker
>> versions."
>> > > However, since we are bumping up the version of inter-broker
>> requests, it
>> > > seems that we need to use IBP for migration.
>> > >
>> > > 22. The KIP says "Remove deleted topics from replicas by sending
>> > > StopReplicaRequest V3 for any topics which do not contain a topic ID,
>> and
>> > > V4 for any topics which do contain a topic ID.". However, if we use
>> IBP,
>> > it
>> > > seems that the controller will either send StopReplicaRequest V3
>> > > or StopReplicaRequest V4, but never mixed V3 and V4 for different
>> topics.
>> > > Basically, before the IBP bump, V3 will be used. After the IBP bump,
>> > > topicId will be created and V4 will be used.
>> > >
>> > > 23. Given that we depend on IBP, do we still need "0 UNSPECIFIED"
>> > > in LeaderAndIsr?
>> > >
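
For illustration, after the amendments Jun describes in points 20.1-20.3 the
TopicZNode value would be JSON of roughly the following shape. The version
number, id value, and partition assignments below are made-up examples, not the
KIP's final format; the id is a string because JSON has no UUID type.

{
  "version": 3,
  "id": "0e3c5d7a-2f41-4bca-9f2d-8c1e6a5b7d90",
  "partitions": {"0": [1, 2, 3], "1": [2, 3, 1]},
  "adding_replicas": {},
  "removing_replicas": {}
}
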

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread Ismael Juma
Hi Justine,

I think we need to update the "Rejected Alternatives" section to take into
account that the proposal now removes the topic name from the fetch
request. Also, if we are removing it from the Fetch request, does it make
sense not to remove it from similar requests like ListOffsetRequest?

Ismael

On Thu, Sep 24, 2020 at 2:46 AM David Jacot  wrote:

> Hi Justine,
>
> Thanks for the KIP. I finally had time to read it :). It is a great
> improvement.
>
> I have a few comments/questions:
>
> 1. It seems that the schema of the StopReplicaRequest is slightly outdated.
> We
> did some changes as part of KIP-570. V3 is already organized by topics.
>
> 2. I just want to make sure that I understand the reconciliation
> logic correctly. When
> an "INCREMENTAL" LeaderAndIsr Request is received, the broker will also
> reconcile
> when the local uuid does not match the uuid in the request, right? In this
> case, the
> local log is staged for deletion.
>
> 3. In the documentation of the `delete.stale.topic.delay.ms` config, it
> says "When a
> FULL LeaderAndIsrRequest is received..." but I suppose it applies to both
> types.
>
> Best,
> David
>
> On Thu, Sep 24, 2020 at 1:40 AM Justine Olshan 
> wrote:
>
> > Hi Jun,
> >
> > Thanks for the comments. I apologize for some of the typos and confusion.
> > I’ve updated the KIP to fix some of the issues you mentioned.
> >
> > 20.2 I’ve changed the type to String.
> > 20.1/3 I’ve updated the TopicZNode to fix formatting and reflect the
> latest
> > version before this change.
> >
> > 21. You are correct and I’ve removed this line. I’ve also added a line
> > mentioning an IBP bump is necessary for migration
> >
> > 22. I think the wording was unclear but your summary is what was intended
> > by this line. I’ve updated to make this point more clear. “Remove deleted
> > topics from replicas by sending StopReplicaRequest V3 before the IBP bump
> > using the old logic, and using V4 and the new logic with topic IDs after
> > the IBP bump.”
> >
> > 23. I’ve removed the unspecified type from the KIP and mention that IBP
> > will be used to handle this request. “IBP will be used to determine
> whether
> > this new form of the request will be used. For older requests, we will
> > ignore this field and default to previous behavior.”
> >
> > 24. I’ve fixed this typo.
> >
> > 25. I've added topics at a higher level for LeaderAndIsrResponse v5,
> > StopReplicaResponse v4. I've also changed StopReplicaRequest v4 to have
> > topics at a higher level.
> >
> > 26. I’ve updated forgotten_topics_data--added the topic ID and removed
> the
> > topic name
> >
> > 27. I’ve decided on plain text, and I’ve added an example.
> >
> > 28. I’ve added this idea to future work.
> >
> > Thanks again for taking a look,
> >
> > Justine
> >
> > On Wed, Sep 23, 2020 at 10:28 AM Jun Rao  wrote:
> >
> > > Hi, Justine,
> > >
> > > Thanks for the response. Made another pass. A few more comments below.
> > >
> > > 20. znode schema:
> > > 20.1 It seems that {"name": "version", "type": "int", "id": "UUID",
> > "doc":
> > > "version id"} should be {"name": "version", "type": "int"}, {"name":
> > "id",
> > > "type": "UUID", "doc": "version id"}.
> > > 20.2 The znode format is JSON which doesn't have UUID type. So the type
> > > probably should be string?
> > > 20.3 Also, the existing format used seems outdated. It should have the
> > > following format.
> > > Json.encodeAsBytes(Map(
> > >   "version" -> 2,
> > >   "partitions" -> replicaAssignmentJson.asJava,
> > >   "adding_replicas" -> addingReplicasAssignmentJson.asJava,
> > >   "removing_replicas" -> removingReplicasAssignmentJson.asJava
> > > ).asJava)
> > >   }
> > >
> > > 21. Migration: The KIP says "The migration process can take place
> without
> > > an inter-broker protocol bump, as the format stored in
> > > /brokers/topics/[topic] will be compatible with older broker versions."
> > > However, since we are bumping up the version of inter-broker requests,
> it
> > > seems that we need to use IBP for migration.
> > >
> > > 22. The KIP says "Remove deleted topics from replicas by sending
> > > StopReplicaRequest V3 for any topics which do not contain a topic ID,
> and
> > > V4 for any topics which do contain a topic ID.". However, if we use
> IBP,
> > it
> > > seems that the controller will either send StopReplicaRequest V3
> > > or StopReplicaRequest V4, but never mixed V3 and V4 for different
> topics.
> > > Basically, before the IBP bump, V3 will be used. After the IBP bump,
> > > topicId will be created and V4 will be used.
> > >
> > > 23. Given that we depend on IBP, do we still need "0 UNSPECIFIED"
> > > in LeaderAndIsr?
> > >
> > > 24. LeaderAndIsrResponse v5 : It still has the topic field.
> > >
> > > 25. LeaderAndIsrResponse v5, StopReplicaResponse v4: Could we use this
> > > opportunity to organize the response in 2 levels, first by topic, then
> by
> > > partition, as most other requests/responses?
> > >
> > > 26. Fetch

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-24 Thread David Jacot
Hi Justine,

Thanks for the KIP. I finally had time to read it :). It is a great
improvement.

I have a few comments/questions:

1. It seems that the schema of the StopReplicaRequest is slightly outdated.
We
did some changes as part of KIP-570. V3 is already organized by topics.

2. I just want to make sure that I understand the reconciliation
logic correctly. When
an "INCREMENTAL" LeaderAndIsr Request is received, the broker will also
reconcile
when the local uuid does not match the uuid in the request, right? In this
case, the
local log is staged for deletion.

3. In the documentation of the `delete.stale.topic.delay.ms` config, it
says "When a
FULL LeaderAndIsrRequest is received..." but I suppose it applies to both
types.

Best,
David

On Thu, Sep 24, 2020 at 1:40 AM Justine Olshan  wrote:

> Hi Jun,
>
> Thanks for the comments. I apologize for some of the typos and confusion.
> I’ve updated the KIP to fix some of the issues you mentioned.
>
> 20.2 I’ve changed the type to String.
> 20.1/3 I’ve updated the TopicZNode to fix formatting and reflect the latest
> version before this change.
>
> 21. You are correct and I’ve removed this line. I’ve also added a line
> mentioning an IBP bump is necessary for migration
>
> 22. I think the wording was unclear but your summary is what was intended
> by this line. I’ve updated to make this point more clear. “Remove deleted
> topics from replicas by sending StopReplicaRequest V3 before the IBP bump
> using the old logic, and using V4 and the new logic with topic IDs after
> the IBP bump.”
>
> 23. I’ve removed the unspecified type from the KIP and mention that IBP
> will be used to handle this request. “IBP will be used to determine whether
> this new form of the request will be used. For older requests, we will
> ignore this field and default to previous behavior.”
>
> 24. I’ve fixed this typo.
>
> 25. I've added topics at a higher level for LeaderAndIsrResponse v5,
> StopReplicaResponse v4. I've also changed StopReplicaRequest v4 to have
> topics at a higher level.
>
> 26. I’ve updated forgotten_topics_data--added the topic ID and removed the
> topic name
>
> 27. I’ve decided on plain text, and I’ve added an example.
>
> 28. I’ve added this idea to future work.
>
> Thanks again for taking a look,
>
> Justine
>
> On Wed, Sep 23, 2020 at 10:28 AM Jun Rao  wrote:
>
> > Hi, Justine,
> >
> > Thanks for the response. Made another pass. A few more comments below.
> >
> > 20. znode schema:
> > 20.1 It seems that {"name": "version", "type": "int", "id": "UUID",
> "doc":
> > "version id"} should be {"name": "version", "type": "int"}, {"name":
> "id",
> > "type": "UUID", "doc": "version id"}.
> > 20.2 The znode format is JSON which doesn't have UUID type. So the type
> > probably should be string?
> > 20.3 Also, the existing format used seems outdated. It should have the
> > following format.
> > Json.encodeAsBytes(Map(
> >   "version" -> 2,
> >   "partitions" -> replicaAssignmentJson.asJava,
> >   "adding_replicas" -> addingReplicasAssignmentJson.asJava,
> >   "removing_replicas" -> removingReplicasAssignmentJson.asJava
> > ).asJava)
> >   }
> >
> > 21. Migration: The KIP says "The migration process can take place without
> > an inter-broker protocol bump, as the format stored in
> > /brokers/topics/[topic] will be compatible with older broker versions."
> > However, since we are bumping up the version of inter-broker requests, it
> > seems that we need to use IBP for migration.
> >
> > 22. The KIP says "Remove deleted topics from replicas by sending
> > StopReplicaRequest V3 for any topics which do not contain a topic ID, and
> > V4 for any topics which do contain a topic ID.". However, if we use IBP,
> it
> > seems that the controller will either send StopReplicaRequest V3
> > or StopReplicaRequest V4, but never mixed V3 and V4 for different topics.
> > Basically, before the IBP bump, V3 will be used. After the IBP bump,
> > topicId will be created and V4 will be used.
> >
> > 23. Given that we depend on IBP, do we still need "0 UNSPECIFIED"
> > in LeaderAndIsr?
> >
> > 24. LeaderAndIsrResponse v5 : It still has the topic field.
> >
> > 25. LeaderAndIsrResponse v5, StopReplicaResponse v4: Could we use this
> > opportunity to organize the response in 2 levels, first by topic, then by
> > partition, as most other requests/responses?
> >
> > 26. FetchRequest v13 : Should forgotten_topics_data use topicId too?
> >
> > 27. "This file can either be plain text (key/value pairs) or JSON." Have
> we
> > decided which one to use? Also, it would be helpful to provide an
> example.
> >
> > 28. Future improvement: Another future improvement opportunity is to use
> > topicId in GroupMetadataManager.offsetCommitKey in the offset_commit
> topic.
> > This may save some space.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Sep 23, 2020 at 8:50 AM Justine Olshan 
> > wrote:
> >
> > > Hi Tom,
> > >
> > > Thanks for the comment. I think this is a really good idea and i