Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Joel, Aditya, I believe we don't need another voting thread, since the affected items are not related to the core proposed changes. I agree people can explicitly down-vote if they have concerns about the things that changed. Thanks, Andrii Biletskyi

On Fri, Jun 12, 2015 at 8:24 PM, Aditya Auradkar aaurad...@linkedin.com.invalid wrote: Aside from the things I mentioned, I don't think there were other changes. I'll mark this as adopted since there don't appear to be any concerns. Aditya

From: Joel Koshy [jjkosh...@gmail.com], Sent: Thursday, June 11, 2015 1:28 PM: Discussion aside, was there any significant material change besides the additions below? If not, then we can avoid the overhead of another vote, unless someone wants to down-vote these changes. Joel

On Thu, Jun 11, 2015 at 06:36:36 PM, Aditya Auradkar wrote: Andrii, Do we need a new voting thread for this KIP? The last round of votes had 3 binding +1's, but there's been a fair amount of discussion since then. Aditya

From: Aditya Auradkar, Sent: Thursday, June 11, 2015 10:32 AM: I've made two changes to the document: - Removed the TMR evolution piece, since we agreed to retain the existing request. - Added two new APIs to the admin client spec (Alter and Describe config). Please review. Aditya

From: Ashish Singh [asi...@cloudera.com], Sent: Friday, May 29, 2015 8:36 AM: +1 on discussing this on the next KIP hangout. I will update KIP-24 before that.

On Fri, May 29, 2015 at 3:40 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, I won't be able to attend the next meeting. But in the latest patch for KIP-4 Phase 1 I didn't even evolve TopicMetadataRequest to v1, since we won't be able to change configs with AlterTopicRequest; hence, with this patch, TMR will still return the ISR. Taking this into account, I think yes - it would be good to fix the ISR issue, although I didn't consider it a critical one (the ISR has been part of TMR from the very beginning, and almost no code relies on this piece of the response). Thanks, Andrii Biletskyi

On Fri, May 29, 2015 at 8:50 AM, Aditya Auradkar aaurad...@linkedin.com.invalid wrote: Thanks. Perhaps we should leave TMR unchanged for now. Should we discuss this during the next hangout? Aditya

From: Jun Rao [j...@confluent.io], Sent: Thursday, May 28, 2015 5:32 PM: There is a reasonable use case for the ISR in KAFKA-2225. Basically, for economic reasons, we may want to let a consumer fetch from a replica in the ISR that's in the same zone. In order to support that, it will be convenient to have TMR return the correct ISR for the consumer to choose from. So perhaps it's worth fixing the ISR inconsistency issue in KAFKA-1367 (there is some new discussion there on what it takes to fix this). If we do that, we can leave TMR unchanged. Thanks, Jun

On Tue, May 26, 2015 at 1:13 PM, Aditya Auradkar aaurad...@linkedin.com.invalid wrote: Andrii, I made a few edits to this document, as discussed in the KIP-21 thread: https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations With these changes, the only difference between TopicMetadataResponse_V1 and V0 is the removal of the ISR field. I've altered the KIP with the assumption that this is a good enough reason by itself to evolve the request/response protocol. Any concerns there? Thanks, Aditya

From: Mayuresh Gharat [gharatmayures...@gmail.com], Sent: Thursday, May 21, 2015 8:29 PM: Hi Jun, Thanks a lot. I get it now. Point 4) will actually enable clients who don't want a topic created with the default number of partitions to instead create it manually with their own configs (#partitions) if it does not exist. Thanks, Mayuresh

On Thu, May 21, 2015 at 6:16 PM, Jun Rao j...@confluent.io wrote: Mayuresh,
The current plan is the following. 1. Add TMR v1, which still triggers auto topic creation. 2. Change the consumer client to TMR v1. Change the producer client to use TMR v1 and, on UnknownTopicException, issue a TopicCreateRequest to explicitly create the topic with the default server-side partitions and replicas. 3. At some later time, after the new clients are released and deployed, disable auto topic creation in TMR v1.
This will make sure consumers never create new topics. 4. If needed, we can add a new config in the producer to control whether a TopicCreateRequest should be issued on UnknownTopicException. If this is disabled and the topic doesn't exist, the send will fail and the user is expected to create the topic manually. Thanks, Jun
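Jun's plan above hinges on one client-side decision: once TMR v1 stops auto-creating topics, a producer that hits an unknown topic must either explicitly create it or fail. A minimal sketch of that decision follows; all class and config names here are hypothetical, since none of these types existed at the time of this thread:

```java
// Hypothetical sketch of the producer-side logic from steps 2 and 4:
// on an unknown topic, either issue an explicit create request or fail,
// depending on a client-side config. All names are illustrative.
import java.util.Set;

public class TopicCreationPolicy {
    /** Simulated outcome of a send() attempt against cached topic metadata. */
    public enum Action { SEND, CREATE_THEN_SEND, FAIL }

    private final boolean createOnUnknownTopic; // step 4: proposed producer config

    public TopicCreationPolicy(boolean createOnUnknownTopic) {
        this.createOnUnknownTopic = createOnUnknownTopic;
    }

    /** Decide what the producer should do for the given topic. */
    public Action decide(String topic, Set<String> knownTopics) {
        if (knownTopics.contains(topic)) {
            return Action.SEND;               // metadata found, just send
        }
        // TMR v1 no longer auto-creates (step 3), so the client must choose:
        return createOnUnknownTopic
                ? Action.CREATE_THEN_SEND     // issue TopicCreateRequest (step 2)
                : Action.FAIL;                // user creates the topic manually (step 4)
    }
}
```

Under this sketch, the config proposed in step 4 maps to the `createOnUnknownTopic` flag: disabling it turns an unknown topic into a hard failure rather than an implicit creation.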
On Thu, May 21, 2015 at 5:27 PM, Mayuresh Gharat gharatmayures...@gmail.com wrote: Hi, I had a question about the TopicMetadata request. Currently the way it works is: 1) Suppose a topic T1 does not exist. 2) A client wants to produce data to T1 using producer P1. 3) Since T1 does not exist, P1 issues a TopicMetadata request to Kafka, which in turn creates the topic with the default number of partitions (a cluster-wide config). 4) The same goes for a consumer: if the topic does not exist, a new topic will be created when the consumer issues a TopicMetadata request. Here are two use cases where this might not be suitable when the auto-creation flag for topics is turned ON: a) Some clients might
not want a topic created with the default number of partitions, but with a lower number; currently, in a multi-tenant environment, this is not possible without changing the cluster-wide default config. b) Some clients might want to just check whether the topic exists or not, but currently the topic gets created automatically with the default number of partitions. Here are some ideas to address this: 1) The TopicMetadata request should have a way to specify whether it should only check if the topic exists, or check and create the topic with a given number of partitions, using the default cluster-wide config if the number of partitions is not specified.
RE: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Thanks. Perhaps we should leave TMR unchanged for now. Should we discuss this during the next hangout? Aditya From: Jun Rao [j...@confluent.io] Sent: Thursday, May 28, 2015 5:32 PM To: dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2) There is a reasonable use case for ISR in KAFKA-2225. Basically, for economic reasons, we may want to let a consumer fetch from a replica in the ISR that's in the same zone. In order to support that, it will be convenient to have TMR return the correct ISR for the consumer to choose from. So, perhaps it's worth fixing the ISR inconsistency issue in KAFKA-1367 (there is some new discussion there on what it takes to fix this). If we do that, we can leave TMR unchanged. Thanks, Jun
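Jun's KAFKA-2225 use case (letting a consumer fetch from an in-sync replica in its own zone) boils down to a small client-side replica chooser. The sketch below is an illustration only: the dict shapes and "zone" labels are hypothetical stand-ins, not the actual TMR schema.

```python
# Sketch: pick a fetch replica in the consumer's zone, falling back to the
# leader. The partition/broker dict shapes and "zone" field are assumptions
# for illustration, not the real TopicMetadataResponse layout.

def choose_fetch_replica(partition, consumer_zone):
    """partition: dict with 'leader' and 'isr' (list of broker dicts)."""
    # Only replicas in the ISR are reasonable fetch targets, which is why
    # Jun wants TMR to return a *correct* ISR.
    for replica in partition["isr"]:
        if replica["zone"] == consumer_zone:
            return replica
    return partition["leader"]  # no in-zone ISR replica: fetch from the leader

partition = {
    "leader": {"id": 1, "zone": "us-east-1a"},
    "isr": [{"id": 1, "zone": "us-east-1a"}, {"id": 2, "zone": "us-east-1b"}],
}
print(choose_fetch_replica(partition, "us-east-1b")["id"])  # prints 2
```

A consumer in a zone with no ISR member simply falls back to the leader, so correctness never depends on the zone hint.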
RE: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Andrii, I made a few edits to this document as discussed in the KIP-21 thread. https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations With these changes, the only difference between TopicMetadataResponse_V1 and V0 is the removal of the ISR field. I've altered the KIP with the assumption that this is a good enough reason by itself to evolve the request/response protocol. Any concerns there? Thanks, Aditya
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Hi Jun, Thanks a lot. I get it now. Point 4) will actually enable clients who don't want a topic auto-created with the default number of partitions to have the send fail if the topic does not exist, and then manually create the topic with their own configs (#partitions). Thanks, Mayuresh On Thu, May 21, 2015 at 6:16 PM, Jun Rao j...@confluent.io wrote: Mayuresh, The current plan is the following. 1. Add TMR v1, which still triggers auto topic creation. 2. Change the consumer client to use TMR v1. Change the producer client to use TMR v1 and, on UnknownTopicException, issue a TopicCreateRequest to explicitly create the topic with the default server-side partitions and replicas. 3. At some later time, after the new clients are released and deployed, disable auto topic creation in TMR v1. This will make sure consumers never create new topics. 4. If needed, we can add a new config in the producer to control whether a TopicCreateRequest should be issued or not on UnknownTopicException. If this is disabled and the topic doesn't exist, send will fail and the user is expected to create the topic manually. Thanks, Jun On Thu, May 21, 2015 at 5:27 PM, Mayuresh Gharat gharatmayures...@gmail.com wrote: Hi, I had a question about the TopicMetadata request. Currently the way it works is: 1) Suppose a topic T1 does not exist. 2) A client wants to produce data to T1 using producer P1. 3) Since T1 does not exist, P1 issues a TopicMetadata request to Kafka. This in turn creates the default number of partitions. The number of partitions is a cluster-wide config. 4) The same goes for a consumer: if the topic does not exist, a new topic will be created when the consumer issues a TopicMetadata request. Here are 2 use cases where this might not be suitable (with the auto-creation flag for topics turned ON): a) Some clients might not want to create a topic with the default number of partitions but with a lower number. Currently, in a multi-tenant environment this is not possible without changing the cluster-wide default config. b) Some clients might want to just check whether the topic exists or not, but currently the topic gets created automatically with the default number of partitions. Here are some ideas to address this: 1) The TopicMetadata request should have a way to specify whether it should only check if the topic exists, or check and create the topic with a given number of partitions. If the number of partitions is not specified, use the default cluster-wide config. OR 2) We should only allow the TopicMetadata request to get the metadata explicitly and not allow it to create a new topic. We should have another request that takes in config parameters from the user regarding how he/she wants the topic to be created. This request can be used if we get an empty TopicMetadata response. Thanks, Mayuresh
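Steps 2 and 4 of Jun's migration plan amount to the following producer-side logic. Everything here is an illustrative sketch: the class, method, and config names are assumptions, since KIP-4 does not pin down this client API.

```python
# Sketch of the flow Jun describes: on an unknown topic, either explicitly
# create it (TopicCreateRequest with server-side defaults) and retry, or
# fail so the user creates the topic manually. All names are hypothetical.

class UnknownTopicError(Exception):
    pass

def send(topic, record, broker, auto_create=True):
    try:
        return broker.produce(topic, record)
    except UnknownTopicError:
        if not auto_create:
            # Step 4: config disabled -> surface the error; the user is
            # expected to create the topic manually.
            raise
        # Step 2: explicit create with server-side defaults, then retry once.
        broker.create_topic(topic, partitions=None, replicas=None)
        return broker.produce(topic, record)

class FakeBroker:
    """Minimal stand-in for a broker that no longer auto-creates topics."""
    def __init__(self):
        self.topics = set()
    def produce(self, topic, record):
        if topic not in self.topics:
            raise UnknownTopicError(topic)
        return "ok"
    def create_topic(self, topic, partitions, replicas):
        self.topics.add(topic)

b = FakeBroker()
print(send("t1", "hello", b))  # creates t1 explicitly, then succeeds
```

Note the key property of the plan: the creation decision lives in the producer, so consumers (which never issue a create) can no longer create topics as a side effect.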
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
For ListTopics, we decided not to add a ListTopics request for now and just rely on passing in an empty list to TMR. We can revisit this in the future if it becomes an issue. Thanks, Jun
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Joel, - DecreasePartitionsNotAllowed. Yeah, that's kind of a subcase of InvalidPartitions... But since the client can always request topic metadata and check what exactly was wrong with the Partitions argument, I think, yes, we can remove DecreasePartitionsNotAllowed and use InvalidPartitions instead. I'll update the KIP accordingly if no objections. - Questions regarding AdminClient. In one of the previous meetings I suggested we wrap up everything in terms of phase 1 (Wire Protocol and message semantics), so AdminClient is out of scope for now. I'll definitely take into account your remarks and suggestions but would rather wait until I finish phase 1, because I believe the answers may change by that time. - Yes, correct. Any broker in the cluster will be able to handle Admin requests, thus there is no need to add controller discovery info. Maybe it will be part of some separate KIP, as mentioned in KAFKA-1367. Thanks, Andrii Biletskyi
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Just had a few minor questions before I join the vote thread. Apologies if these have been discussed: - Do we need DecreasePartitionsNotAllowed? i.e., can we just return InvalidPartitions instead? - AdminClient.listTopics: should we allow listing all partitions? Or do you intend for the client to issue listTopics followed by describeTopics? - On returning Future<Void> for partition reassignments: do we need to return any future, especially since you have the verifyReassignPartitions method? E.g., what happens if the controller moves? The get should fail, right? The client will then need to connect to the new controller and reissue the request, but will then get ReassignPartitionsInProgress. So in that case the client anyway needs to rely on verifyReassignPartitions. - In past hangouts I think either you/Joe were mentioning the need to locate the controller (and possibly other cluster metadata). It appears we decided to defer this to a future KIP. Correct? Thanks, Joel
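Joel's observation that the client must fall back on verifyReassignPartitions anyway suggests a plain poll loop rather than a returned future. The sketch below assumes a hypothetical `verify_reassign_partitions` method modeled loosely on the AdminClient draft in the KIP; none of these names are a finalized API.

```python
import time

# Sketch: instead of blocking on a future that breaks when the controller
# moves, poll a (hypothetical) verifyReassignPartitions-style call until
# it reports the reassignment complete or a deadline passes.

def wait_for_reassignment(admin, partitions, timeout_s=60.0, poll_s=1.0):
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if admin.verify_reassign_partitions(partitions):
            return True  # reassignment verified complete
        time.sleep(poll_s)
    return False  # caller decides whether to keep waiting or alert

class FakeAdmin:
    """Stand-in that reports 'done' after a fixed number of checks."""
    def __init__(self, done_after):
        self.calls = 0
        self.done_after = done_after
    def verify_reassign_partitions(self, partitions):
        self.calls += 1
        return self.calls >= self.done_after

print(wait_for_reassignment(FakeAdmin(done_after=3), ["t1-0"], poll_s=0.01))
```

Because each poll is a fresh request, a controller move costs at most one failed check rather than invalidating a held future.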
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Guys, I've updated the wiki to reflect all previously discussed items (regarding the schema only - this is included in phase 1). https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations I think we can have the final discussion today (for phase 1) and, in case of no new remarks, I will start the voting thread. With regards to AlterTopicRequest semantics. I agree with Jun, and I think it's my bad I focused on multiple topics in one request. The same situation is possible in ProduceRequest, Fetch, and TopicMetadata, and we handle it naturally and in the most transparent way - we put all separate instructions into a map and thus silently ignore duplicates. This also makes the Response part simple - it's just a map Topic -> ErrorCode. I think we need to follow the same approach for Alter (and Create, Delete) requests. With this we add nothing new in terms of batch request semantics. Thanks, Andrii Biletskyi
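The map-based batch semantics Andrii describes (duplicate topic entries silently collapse, and the response is just a Topic -> ErrorCode map) can be sketched directly. The numeric error codes below are illustrative, not the real wire-protocol values.

```python
# Sketch of the batch semantics described above: putting instructions into
# a map silently keeps only the last entry per topic, and the response is
# a simple topic -> error-code map. Codes here are made up for illustration.

NONE, INVALID_PARTITIONS = 0, 37  # hypothetical numeric codes

def handle_alter_batch(entries):
    """entries: list of (topic, instruction) pairs; a later duplicate wins."""
    deduped = dict(entries)  # dict() retains only the last instruction per topic
    response = {}
    for topic, instruction in deduped.items():
        if instruction.get("partitions", 1) < 1:
            response[topic] = INVALID_PARTITIONS
        else:
            response[topic] = NONE
    return response

resp = handle_alter_batch([
    ("t1", {"partitions": 3}),
    ("t1", {"partitions": 6}),   # duplicate: silently replaces the first t1 entry
    ("t2", {"partitions": 0}),   # invalid partition count
])
print(resp)  # {'t1': 0, 't2': 37}
```

This is exactly the transparency argument in the message above: the same collapse already happens implicitly for Produce/Fetch/TopicMetadata, so Alter/Create/Delete add no new batch semantics.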
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
The following is a description of some of my concerns on allowing the same topic multiple times in AlterTopicRequest. ATR has an array of entries, each corresponding to a topic. We allow multiple changes to a topic in a single entry. Those changes may fail to apply independently (e.g., the config change may succeed, but the replica assignment change may fail). If there is an issue applying one of the changes, we will set an error code for that entry in the response. If we allow the same topic to be specified multiple times in ATR, it can happen that the first entry succeeds, but the second entry fails partially. Now, from the admin's perspective, it's a bit hard to do the verification. Ideally, you want to wait for the changes in the first entry to be applied. However, the second entry may have part of its changes applied successfully. About putting restrictions on the requests. Currently, we effectively expect a topic-partition to be specified only once in the FetchRequest. Allowing the same topic-partition to be specified multiple times in FetchRequest would be confusing and would complicate the implementation (e.g., putting the request in purgatory). A few other requests probably have similar implicit assumptions on a topic or topic-partition being unique in each request. Thanks, Jun
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Guys, A quick summary of our today's meeting. There were no additional issues/questions. The only item about which we are not 100% sure is multiple instructions for one topic in one request case. It was proposed by Jun to explain reasons behind not allowing users doing that again here in mailing list, and in case we implement it in final version document it well so API clients understand what exactly is not allowed and why. At the meantime I will update the KIP. After that I will start voting thread. Thanks, Andrii Biletskyi On Tue, Apr 28, 2015 at 10:33 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, It seems that there are no open questions left so prior to our weekly call let me summarize what I'm going to implement as part of phase one for KIP-4. 1. Add 3 new Wire Protocol requests - Create-, Alter- and DeleteTopicRequest 2. Topic requests are batch requests, errors are returned per topic as part of batch response. 3. Topic requests are asynchronous - respective commands are only started and server is not blocked until command is finished. 4. It will be not allowed to specify multiple mutations for the same topic in scope of one batch request - a special error will be returned for such topic. 5. There will be no dedicated request for reassign-partitions - it is simulated with AlterTopicRequest.ReplicaAssignment field. 6. Preferred-replica-leader-election is not supported since there is no need to have a public API to trigger such operation. 7. TopicMetadataReqeust will be evolved to version 1 - topic-level configuration per topic will be included and ISR field will be removed. Automatic topic-creation logic will be removed (we will use CreateTopicRequest for that). Thanks, Andrii Biletskyi On Tue, Apr 28, 2015 at 12:23 AM, Jun Rao j...@confluent.io wrote: Yes, to verify if a partition reassignment completes or not, we just need to make sure AR == RAR. So, we don't need ISR for this. 
It's probably still useful to know ISR for monitoring in general though. Thanks, Jun On Mon, Apr 27, 2015 at 4:15 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Okay, I had some doubts in terms of reassign-partitions case. I was not sure whether we need ISR to check post condition of partition reassignment. But I think we can rely on assigned replicas - the workflow in reassignPartitions is the following: 1. Update AR in ZK with OAR + RAR. ... 10. Update AR in ZK with RAR. 11. Update the /admin/reassign_partitions path in ZK to remove this partition. 12. After electing leader, the replicas and isr information changes. So resend the update metadata request to every broker. In other words AR becomes RAR right before removing partitions from the admin path. I think we can consider (with a little approximation) reassignment completed if AR == RAR. If it's okay, I will remove ISR and add topic config in one change as discussed earlier. Thanks, Andrii Biletskyi On Mon, Apr 27, 2015 at 1:50 AM, Jun Rao j...@confluent.io wrote: Andrii, Another thing. We decided not to add the lag info in TMR. To be consistent, we probably also want to remove ISR from TMR since only the leader knows it. We can punt on adding any new request from getting ISR. ISR is mostly useful for monitoring. Currently, one can determine if a replica is in ISR from the lag metrics (a replica is in ISR if its lag is =0). Thanks, Jun On Sun, Apr 26, 2015 at 4:31 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I like your approach to AlterTopicReques semantics! Sounds like we linearize all request fields to ReplicaAssignment - I will definitely try this out to ensure there are no other pitfalls. With regards to multiple instructions in one batch per topic. For me this sounds reasonable too. 
We discussed last time that it's pretty strange that we give users a schema that supports batching and at the same time introduce restrictions on the way batching can be used (in this case - only one instruction per topic). But now, when we give users everything they need to avoid such misleading use cases (if we implement the previous item - the user will be able to specify/change all fields in one instruction) - it might be a good justification to prohibit serving such requests. Any objections? Thanks, Andrii Biletskyi On Sun, Apr 26, 2015 at 11:00 PM, Jun Rao j...@confluent.io wrote: Andrii, Thanks for the update. For your second point, I agree that if a single AlterTopicRequest can make multiple changes, there is no need to support the same topic included more than once in the request. Now about the semantics in your first question. I was thinking that we can do the following. a. If ReplicaAssignment is
specified, we expect that this will specify the replica assignment for all partitions in the topic. For now, we can have the constraint that there could be more partitions than existing ones, but can't be less. In this case, both partitions and replicas are ignored. Then for each partition, we do one of the following. a1. If the partition doesn't exist, add the partition with the replica assignment directly to the topic path in ZK. a2. If the partition exists and the new replica assignment is not the same as the existing one, include it in the reassign partition json. If the json is not empty, write it to the reassignment path in ZK to trigger partition reassignment. b. Otherwise, if replicas is specified, generate new ReplicaAssignment for existing partitions. If partitions is specified (assuming it's larger), generate ReplicaAssignment for the new partitions as well. Then go back to step a to make a decision. c. Otherwise, if only partitions is specified, add assignments of existing partitions to ReplicaAssignment. Generate assignments for the new partitions and add them to ReplicaAssignment. Then go back to step a to make a decision. Thanks, Jun 
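The per-partition resolution Jun proposes in (a) above - create unknown partitions directly, and collect changed assignments into the reassignment json - can be sketched like this. It is a hypothetical illustration of the proposal, not controller code; all names and shapes are invented for the example:

```python
# Hypothetical sketch of steps a1/a2: given the existing assignment and a
# requested ReplicaAssignment covering all partitions, decide per partition
# whether to create it directly in the topic path (a1) or to include it in
# the /reassign_partitions json (a2). Dicts map partition id -> replica list.

def resolve_replica_assignment(existing, requested):
    if len(requested) < len(existing):
        # the proposal's constraint: more partitions are allowed, fewer are not
        raise ValueError("cannot shrink the number of partitions")
    to_create = {}
    to_reassign = {}
    for partition, replicas in requested.items():
        if partition not in existing:            # a1: new partition
            to_create[partition] = replicas
        elif existing[partition] != replicas:    # a2: changed assignment
            to_reassign[partition] = replicas
    return to_create, to_reassign

existing = {0: [1, 2], 1: [2, 3]}
requested = {0: [1, 2], 1: [3, 4], 2: [1, 3]}
create, reassign = resolve_replica_assignment(existing, requested)
assert create == {2: [1, 3]}      # partition 2 is new -> topic path in ZK
assert reassign == {1: [3, 4]}    # partition 1 changed -> reassignment json
```

Cases b and c then reduce to generating such a full `requested` map first and re-entering step a, which is the "linearize all request fields to ReplicaAssignment" idea discussed above.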
On Sat, Apr 25, 2015 at 7:21 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, Can we come to some agreement in terms of the second item from the email above? This blocks me from updating and uploading the patch. Also the new schedule for the weekly calls doesn't work very well for me - it's 1 am in my timezone :) - so I'd rather we confirm everything that is possible by email. Thanks, Andrii Biletskyi On Wed, Apr 22, 2015 at 5:50 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: As said above, I spent some time thinking about AlterTopicRequest semantics and batching. Firstly, about AlterTopicRequest. Our goal here is to see whether we can suggest some simple semantics and at the same time let users change different things in one instruction (hereinafter, instruction - one of the entries in a batch request). We can resolve arguments according to this schema: 1) If ReplicaAssignment is specified: it's a reassign partitions request 2) If either Partitions or ReplicationFactor is specified: a) If Partitions is specified - this is the increase-partitions case b) If ReplicationFactor is specified - this means we need to automatically regenerate the replica assignment and treat it as a reassign partitions request Note: this algorithm is a bit inconsistent with CreateTopicRequest - with ReplicaAssignment specified there the user can implicitly define Partitions and ReplicationFactor; in AlterTopicRequest those are completely different things, i.e. you can't include new partitions in the ReplicaAssignment to implicitly ask the controller to increase partitions - the controller will simply return InvalidReplicaAssignment, because you included unknown partitions. 
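The argument-resolution schema in the Apr 22 email above can be sketched as a small classifier. This is an illustrative fragment of that earlier proposal (which differs from Jun's later revision on unknown partitions); the function and result names are invented, and only the error name InvalidReplicaAssignment comes from the discussion:

```python
# Hypothetical sketch of the resolution schema: one AlterTopic instruction
# maps to exactly one kind of operation, and a ReplicaAssignment naming
# unknown partitions is rejected with InvalidReplicaAssignment.

def classify_instruction(existing_partitions, replica_assignment=None,
                         partitions=None, replication_factor=None):
    if replica_assignment is not None:
        # 1) explicit assignment: may not reference unknown partitions
        if any(p not in existing_partitions for p in replica_assignment):
            return "InvalidReplicaAssignment"
        return "reassign-partitions"
    if partitions is not None:
        return "increase-partitions"                  # 2a
    if replication_factor is not None:
        return "regenerate-assignment-and-reassign"   # 2b
    return "no-op"

assert classify_instruction({0, 1}, replica_assignment={0: [1], 1: [2]}) == "reassign-partitions"
assert classify_instruction({0, 1}, replica_assignment={0: [1], 2: [3]}) == "InvalidReplicaAssignment"
assert classify_instruction({0, 1}, partitions=4) == "increase-partitions"
```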
Secondly, multiple instructions for one topic in a batch request. I have a feeling it becomes a really big mess now, so suggestions are highly appreciated here! Our goal is to consider whether we can let users add multiple instructions for one topic in one batch but at the same time make it transparent enough that we can support blocking on request completion; for that we need to analyze from the request what the final expected state of the topic is. And the latter seems to me a tough issue. Consider the following AlterTopicRequest: [1) topic1: change ReplicationFactor from 2 to 3, 2) topic1: change ReplicaAssignment (taking into account RF is 3 now), 3) topic2: change ReplicaAssignment (just to include multiple topics), 4) topic1: change ReplicationFactor from 3 to 1, 5) topic1: change ReplicaAssignment again (taking into account RF is 1 now)] As we discussed earlier, the controller will handle it as an alter-topic command and reassign-partitions. First of all, it will scan all ReplicaAssignments and assemble those into one json to create the admin path /reassign_partitions once needed. Now, the user would expect we execute instructions sequentially, but we can't do that because only one reassign-partitions procedure can be in progress - when should we trigger reassign-partitions - after 1) or after 4)? And what about topic2 - we will break the order, but it was supposed that we execute instructions sequentially. Overall, the logic seems to be very sophisticated, which is a bad sign. 
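The hazard described above can be made concrete with a small sketch: when all ReplicaAssignments from the batch are folded into the single /reassign_partitions json, a later instruction for the same topic silently overwrites an earlier one, so the sequential order the user expects cannot be honored. The data shapes here are illustrative, not the actual json format:

```python
# Hypothetical sketch: assembling every ReplicaAssignment in the batch into
# one reassignment map. With two instructions for topic1, the later one
# wins and the intermediate state the user asked for in step 2) is lost.

instructions = [
    ("topic1", {0: [1, 2, 3]}),   # 2) assignment assuming RF = 3
    ("topic2", {0: [2, 3]}),      # 3) another topic in the same batch
    ("topic1", {0: [1]}),         # 5) assignment assuming RF = 1
]

reassignment_json = {}
for topic, assignment in instructions:
    reassignment_json[topic] = assignment  # later instruction overwrites earlier

assert reassignment_json["topic1"] == {0: [1]}  # instruction 2) was dropped
```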
Conceptually, I think the root problem is that we imply there is an order in sequential instructions, but the instructions themselves are asynchronous, so really you can't guarantee any order. I'm thinking about such solutions now: 1) Prohibit multiple instructions (this seems reasonable if we let users change multiple things in scope of one instruction - see the first item) 2) Break apart again AlterTopic and ReassignPartitions - if the reassignment case is the only problem here, which I'm not sure about. Thoughts? Thanks, Andrii Biletskyi 
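Solution 1) - prohibiting multiple instructions per topic and returning a per-topic error, as in the phase-one summary - could be validated up front along these lines. The error code name is invented for illustration:

```python
# Hypothetical sketch: reject any topic that appears more than once in the
# batch with a per-topic error, while other topics in the batch proceed.

from collections import Counter

def validate_batch(instructions):
    """instructions: list of (topic, change) pairs.
    Returns a dict topic -> error code for the batch response."""
    counts = Counter(topic for topic, _ in instructions)
    return {
        topic: ("MultipleInstructionsForTopic" if counts[topic] > 1 else "None")
        for topic, _ in instructions
    }

batch = [("topic1", "rf=3"), ("topic1", "reassign"), ("topic2", "reassign")]
errors = validate_batch(batch)
assert errors["topic1"] == "MultipleInstructionsForTopic"
assert errors["topic2"] == "None"
```

This keeps the batch schema intact while making the restriction explicit in the per-topic error, so clients learn exactly which topic was rejected and why.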
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Guys, Can we come to some agreement in terms of the second item from the email above? This blocks me from updating and uploading the patch. Also the new schedule for the weekly calls doesn't work very well for me - it's 1 am in my timezone :) - so I'd rather we confirm everything that is possible by email. Thanks, Andrii Biletskyi On Wed, Apr 22, 2015 at 5:50 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: As said above, I spent some time thinking about AlterTopicRequest semantics and batching. Firstly, about AlterTopicRequest. Our goal here is to see whether we can suggest some simple semantics and at the same time let users change different things in one instruction (hereinafter instruction - is one of the entries in batch request). We can resolve arguments according to this schema: 1) If ReplicaAsignment is specified: it's a reassign partitions request 2) If either Partitions or ReplicationFactor is specified: a) If Partitions specified - this is increase partitions case b) If ReplicationFactor is specified - this means we need to automatically regenerate replica assignment and treat it as reassign partitions request Note: this algorithm is a bit inconsistent with the CreateTopicRequest - with ReplicaAssignment specified there user can implicitly define Partitions and ReplicationFactor, in AlterTopicRequest those are completely different things, i.e. you can't include new partitions to the ReplicaAssignment to implicitly ask controller to increase partitions - controller will simply return InvalidReplicaAssignment, because you included unknown partitions. Secondly, multiple instructions for one topic in batch request. I have a feeling it becomes a really big mess now, so suggestions are highly appreciated here! 
Our goal is to consider whether we can let users add multiple instructions for one topic in one batch but at the same time make it transparent enough so we can support blocking on request completion, for that we need to analyze from the request what is the final expected state of the topic. And the latter one seems to me a tough issue. Consider the following AlterTopicRequest: [1) topic1: change ReplicationFactor from 2 to 3, 2) topic1: change ReplicaAssignment (taking into account RF is 3 now), 3) topic2: change ReplicaAssignment (just to include multiple topics) 4) topic1: change ReplicationFactor from 3 to 1, 5) topic1: change ReplicaAssignment again (taking into account RF is 1 now) ] As we discussed earlier, controller will handle it as alter-topic command and reassign-partitions. First of all, it will scan all ReplicaAssignment and assembly those to one json to create admin path /reassign_partitions once needed. Now, user would expect we execute instruction sequentially, but we can't do it because only one reassign-partitions procedure can be in progress - when should we trigger reassign-partition - after 1) or after 4)? And what about topic2 - we will break the order, but it was supposed we execute instructions sequentially. Overall, the logic seems to be very sophisticated, which is a bad sign. Conceptually, I think the root problem is that we imply there is an order in sequential instructions, but instructions themselves are asynchronous, so really you can't guarantee any order. I'm thinking about such solutions now: 1) Prohibit multiple instructions (this seems reasonable if we let users change multiple things in scope of now instructions - see the first item) 2) Break apart again AlterTopic and ReassignPartitions - if the reassignment case is the only problem here, which I'm not sure about. Thoughts? Thanks, Andrii Biletskyi On Wed, Apr 22, 2015 at 2:59 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, Thank you for your time. 
A short summary of our discussion. Answering previous items: 1. 2. I will double check existing error codes to align the list of errors that need to be added. 3. We agreed to think again about the batch request semantics. The main concern is that users would expect we allow executing multiple instructions for one topic in one batch. I will start implementation and check whether there are any impediments to handling it this way. The same for AlterTopicRequest - I will try to make the request semantics as easy as possible and allow users to change different things at one time - e.g. change the number of partitions and replicas in one instruction. 4. We agreed not to add lag information to TMR. 5. We discussed the preferred replica command and it was pointed out that generally users shouldn't call this command manually now since this is automatically handled by the cluster. If there are no objections (especially from devops people) I will remove the respective request. 6. As discussed, the AdminClient API is a phase 2 and will go after the Wire Protocol extensions. It will be finalized as java-doc after I complete the patch for phase 1 - Wire Protocol + server-side code handling requests.
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
1. Yes, this will be much easier. Okay, let's add it. 2. Okay. This will differ a little bit from the way kafka-topics.sh currently handles the alter-topic command, but I think it's a reasonable restriction. I'll update the KIP according to our weekly call. Thanks, Andrii Biletskyi On Mon, Apr 20, 2015 at 10:56 PM, Jun Rao j...@confluent.io wrote: 1. Yes, lag is probably only going to be useful for the admin client. However, so is isr. It seems to me that we should get lag and isr from the same request. I was thinking that we can just extend TMR by changing replicas from an array of int to an array of (int, lag) pairs. Is that too complicated? 3. I was thinking that we just don't allow the cli to change more than one thing at a time. So, you will get an error if you want to change both partitions and configs. Thanks, Jun On Sun, Apr 19, 2015 at 8:22 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, 1. Yes, it seems we can add lag info to the TMR. But before that I wonder whether there are other reasons we need this info except for the reassign-partitions command? As we discussed earlier, the problem of poor monitoring capabilities for reassign-partitions (currently we only inform users Completed/In Progress per partition) may require a separate solution. We were thinking about a separate Wire protocol request. And I actually like your idea about adding some sort of BrokerMetadataRequest for these purposes. I actually think we can cover some other items (like rack-awareness) but for me it really deserves a separate KIP. Also, adding a Replica-Lag map per partition will make TopicMetadataResponse very sophisticated: Map[TopicName, Map[PartitionId, Map[ReplicaId, Lag]]]. Maybe we need to leave it for a moment and propose a new request rather than making a new step towards one monster request. 2. Yes, error per topic. The only question is whether we should execute at least the very first alter topic command from the duplicated topic set or return error for all ... 
I think the more predictable and reasonable option for clients would be returning errors for all duplicated topics. 3. Hm, yes. Actually we also have change topic config there. But it is not related to such replication commands as increase replicas or change replica assignment. This will make the CLI implementation a bit strange: if the user specifies increase partitions and change topic config in one line - taking into account 2. we will have to create two separate alter topic requests, which were designed as batch requests :), but probably we can live with it. Okay, I will think about a separate error code to cover such cases. 4. We will need InvalidArgumentTopic (e.g. contains prohibited chars), IAPartitions, IAReplicas, IAReplicaAssignment, IATopicConfiguration. The server-side implementation will be a little bit messy (dozens of "if this, then this error code" branches) but maybe we should think about clients in the first place here. Thanks, Andrii Biletskyi On Fri, Apr 17, 2015 at 1:46 AM, Jun Rao j...@confluent.io wrote: 1. For the lags, we can add a new field, lags, per partition. It will return, for each replica that's not in isr, the replica id and the lag in messages. Also, if TMR is sent to a non-leader, the response can just include an empty array for isr and lags. 2. So, we will just return a topic level error for the duplicated topics, right? 3. Yes, it's true that today one can specify both partitions and replicaAssignment in the TopicCommand. However, partitions is actually ignored. So, it will be clearer if we don't allow users to do this. 4. How many specific error codes like InvalidPartitions and InvalidReplicas are needed? If it's not that many, giving out a more specific error will be useful for non-java clients. Thanks, Jun On Wed, Apr 15, 2015 at 10:23 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, Thanks for the discussion! Summary: 1. Q: How can KAFKA-1367 (isr is inconsistent in brokers' metadata cache) affect the implementation? 
A: We can fix this issue for the leading broker - the ReplicaManager (or Partition) component should have an accurate isr list; then, with the leading broker having correct info, to do a describe-topic we will need to determine the leading brokers for the partitions and ask those for the correct isr list. Also, we should consider adding lag information to TMR for each follower for partition reassignment, as Jun suggested above. 2. Q: What if the user adds different alter commands for the same topic in scope of one batch request? A: Because of the async nature of AlterTopicRequest it will be very hard then to assemble the expected (in terms of checking
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Hi all, I've updated KIP-4 page to include all previously discussed items such as: new error codes, merged alter-topic and reassign-partitions requests, added TMR_V1. It'd be great if we concentrate on the Errors+Wire Protocol schema and discuss any remaining issues today, since first patch will include only server-side implementation. Thanks, Andrii Biletskyi
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Hey Andrii, thanks for all the hard work on this, it has come a long way. A couple of questions and comments on this. For the errors, can we do the following: 1. Remove IllegalArgument from the name; we haven't used that convention for other errors. 2. Normalize this list with the existing errors. For example, elsewhere when you give an invalid topic name we give back an InvalidTopicException, but this is proposing a new error for that. It would be good if these kinds of errors were handled the same way across all requests in the protocol. Other comments: 3. I don't understand MultipleInstructionsForOneTopic and MultipleTopicInstructionsInOneBatch and the description is quite vague. There is some implicit assumption in this proposal about how batching will be done that doesn't seem to be explained. 4. I think adding replica lag to the metadata request is out of place and should not be in the metadata request. Two reasons: a. This is something that can only be answered by the leader for that partition. So querying N partitions fundamentally means querying N brokers (roughly). This is different from the other properties, which are shared knowledge. b. This is a monitoring property, not a configuration/metadata property. I recommend we remove this here and in the future add an API that gets all the monitoring stats from the server, including lag. Adding all these to the metadata request won't make sense, right? 5. This includes a special request for preferred replica leader election. I feel that we should not expose an API for this because the user should not be in the business of managing leaders. We have gotten this feature to the point where preferred leadership election is enabled automatically. I think we should go further in that direction and do whatever work is required to make this the only option rather than trying to institute public APIs for manually controlling it. 6. The API changes we discussed for the Java API still aren't reflected in the proposal. 
-Jay
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
1. Yes, lag is probably only going to be useful for the admin client. However, so is isr. It seems to me that we should get lag and isr from the same request. I was thinking that we can just extend TMR by changing replicas from an array of int to an array of (int, lag) pairs. Is that too complicated? 3. I was thinking that we just don't allow the cli to change more than one thing at a time. So, you will get an error if you want to change both partitions and configs. Thanks, Jun On Sun, Apr 19, 2015 at 8:22 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, 1. Yes, it seems we can add lag info to the TMR. But before that I wonder whether there are other reasons we need this info except for the reassign-partitions command? As we discussed earlier, the problem of poor monitoring capabilities for reassign-partitions (currently we only inform users Completed/In Progress per partition) may require a separate solution. We were thinking about a separate Wire protocol request. And I actually like your idea about adding some sort of BrokerMetadataRequest for these purposes. I actually think we can cover some other items (like rack-awareness) but for me it really deserves a separate KIP. Also, adding a Replica-Lag map per partition will make TopicMetadataResponse very sophisticated: Map[TopicName, Map[PartitionId, Map[ReplicaId, Lag]]]. Maybe we need to leave it for a moment and propose a new request rather than making a new step towards one monster request. 2. Yes, error per topic. The only question is whether we should execute at least the very first alter topic command from the duplicated topic set or return error for all ... I think the more predictable and reasonable option for clients would be returning errors for all duplicated topics. 3. Hm, yes. Actually we also have change topic config there. But it is not related to such replication commands as increase replicas or change replica assignment. 
This will make the CLI implementation a bit strange: if the user specifies increase partitions and change topic config in one line - taking into account 2. we will have to create two separate alter topic requests, which were designed as batch requests :), but probably we can live with it. Okay, I will think about a separate error code to cover such cases. 4. We will need InvalidArgumentTopic (e.g. contains prohibited chars), IAPartitions, IAReplicas, IAReplicaAssignment, IATopicConfiguration. The server-side implementation will be a little bit messy (dozens of "if this, then this error code" branches) but maybe we should think about clients in the first place here. Thanks, Andrii Biletskyi On Fri, Apr 17, 2015 at 1:46 AM, Jun Rao j...@confluent.io wrote: 1. For the lags, we can add a new field, lags, per partition. It will return, for each replica that's not in isr, the replica id and the lag in messages. Also, if TMR is sent to a non-leader, the response can just include an empty array for isr and lags. 2. So, we will just return a topic level error for the duplicated topics, right? 3. Yes, it's true that today one can specify both partitions and replicaAssignment in the TopicCommand. However, partitions is actually ignored. So, it will be clearer if we don't allow users to do this. 4. How many specific error codes like InvalidPartitions and InvalidReplicas are needed? If it's not that many, giving out a more specific error will be useful for non-java clients. Thanks, Jun On Wed, Apr 15, 2015 at 10:23 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, Thanks for the discussion! Summary: 1. Q: How can KAFKA-1367 (isr is inconsistent in brokers' metadata cache) affect the implementation? A: We can fix this issue for the leading broker - the ReplicaManager (or Partition) component should have an accurate isr list; then, with the leading broker having correct info, to do a describe-topic we will need to determine the leading brokers for the partitions and ask those for the correct isr list. 
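The describe-topic fix sketched in item 1 - ask the partition leader, which is the only broker with an accurate isr list, instead of trusting the local metadata cache - could look roughly like this. All names here are illustrative, not actual Kafka APIs.

```python
# Hypothetical sketch of the KAFKA-1367 workaround: per-broker metadata
# caches may hold a stale isr, so describe-topic resolves each partition's
# leader first and asks that broker for the isr.

def describe_topic(topic, partition_leaders, fetch_isr_from_broker):
    """partition_leaders: {partition_id: leader_broker_id};
    fetch_isr_from_broker(broker_id, topic, partition_id) -> [replica_id, ...].
    Returns {partition_id: isr_list} built from leader answers only."""
    result = {}
    for partition_id, leader in partition_leaders.items():
        # The leader maintains the isr accurately; any other broker's
        # cached view may be inconsistent.
        result[partition_id] = fetch_isr_from_broker(leader, topic, partition_id)
    return result
```

Note this means describing N partitions can require contacting up to N brokers, which is the same cost objection Jay raises about putting per-replica lag into TMR.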
Also, we should consider adding lag information to TMR for each follower for partition reassignment, as Jun suggested above. 2. Q: What if the user adds different alter commands for the same topic in scope of one batch request? A: Because of the async nature of AlterTopicRequest it will be very hard then to assemble the expected (in terms of checking whether the request is complete) result if we let users do this. Also it will be very confusing. It was proposed not to let users do this (probably add a new Error for such cases). 3. Q: AlterTopicRequest semantics: now that we merged AlterTopic and ReassignPartitions, in which order should AlterTopic fields be resolved? A: This item is not clear. There was a proposal to let the user change only one thing at a time,
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
1. agreed 2. agree, new error 3. having discrete operations for tasks makes sense; combining them is confusing for users I think. +1 for letting the user change only one thing at a time 4. let's be consistent both in the new code and existing code. let's not confuse the user but give them the right error information so they know what they did wrong without much fuss. ~ Joe Stein - - - - - - - - - - - - - - - - - http://www.stealth.ly - - - - - - - - - - - - - - - - - On Wed, Apr 15, 2015 at 1:23 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, Thanks for the discussion! Summary: 1. Q: How can KAFKA-1367 (isr is inconsistent in brokers' metadata cache) affect the implementation? A: We can fix this issue for the leading broker - the ReplicaManager (or Partition) component should have an accurate isr list; then, with the leading broker having correct info, to do a describe-topic we will need to determine the leading brokers for the partitions and ask those for the correct isr list. Also, we should consider adding lag information to TMR for each follower for partition reassignment, as Jun suggested above. 2. Q: What if the user adds different alter commands for the same topic in scope of one batch request? A: Because of the async nature of AlterTopicRequest it will be very hard then to assemble the expected (in terms of checking whether the request is complete) result if we let users do this. Also it will be very confusing. It was proposed not to let users do this (probably add a new Error for such cases). 3. Q: AlterTopicRequest semantics: now that we merged AlterTopic and ReassignPartitions, in which order should AlterTopic fields be resolved? A: This item is not clear. There was a proposal to let the user change only one thing at a time, e.g. specify either new Replicas, or ReplicaAssignment. This can be a simple solution, but it's a very strict rule. E.g. currently with TopicCommand the user can increase the number of partitions and define the replica assignment for the newly added partitions. Taking into account item 2. 
this will be even harder for user to achieve this. 4. Q: Do we need such accurate errors returned from the server: InvalidArgumentPartitions, InvalidArgumentReplicas etc. A: I started implementation to add proposed error codes and now I think probably InvalidArgumentError should be sufficient. We can do simple validations on the client side (e.g. AdminClient can ensure nr of partitions argument is positive), others - which can be covered only on server (probably invalid topic config, replica assignment includes dead broker etc) - will be done on server, and in case of invalid argument we will return InvalidArgumentError without specifying the concrete field. It'd be great if we could cover these remaining issues, looks like they are minor, at least related to specific messages, not the overall protocol. - I think with that I can update confluence page and update patch to reflect all discussed items. This patch will probably include Wire protocol messages and server-side code to handle new requests. AdminClient and cli-tool implementation can be the next step. Thanks, Andrii Biletskyi On Wed, Apr 15, 2015 at 7:26 PM, Jun Rao j...@confluent.io wrote: Andrii, 500. I think what you suggested also sounds reasonable. Since ISR is only maintained accurately at the leader, TMR can return ISR if the broker is the leader of a partition. Otherwise, we can return an empty ISR. For partition reassignment, it would be useful to know the lag of each follower. Again, the leader knows this info. We can probably include that info in TMR as well. 300. I think it's probably reasonable to restrict AlterTopicRequest to change only one thing at a time, i.e., either partitions, replicas, replica assignment or config. Thanks, Jun On Mon, Apr 13, 2015 at 10:56 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, 404. Great, thanks! 500. If I understand correctly KAFKA-1367 says ISR part of TMR may be inconsistent. 
If so, then I believe all admin commands but describeTopic are not affected. Let me emphasize that it's about AdminClient operations, not about Wire Protocol requests. What I mean: To verify AdminClient.createTopic we will need (consistent) 'topics' set from TMR (we don't need isr) To verify alterTopic - again, probably 'topics' and 'assigned replicas' + configs To verify deleteTopic - only 'topics' To verify preferredReplica - 'leader', 'assigned replicas' To verify reassignPartitions - 'assigned replicas' ? (I'm not sure about this one) If everything above is correct, then AdminClient.describeTopic is the only command under risk. We can actually workaround it - find out
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
1. For the lags, we can add a new field, lags, per partition. It will return, for each replica that's not in the ISR, the replica id and the lag in messages. Also, if TMR is sent to a non-leader, the response can just include an empty array for isr and lags. 2. So, we will just return a topic-level error for the duplicated topics, right? 3. Yes, it's true that today one can specify both partitions and replicaAssignment in the TopicCommand. However, partitions is actually ignored. So, it will be clearer if we don't allow users to do this. 4. How many specific error codes like InvalidPartitions and InvalidReplicas are needed? If it's not that many, giving out more specific errors will be useful for non-Java clients. Thanks, Jun On Wed, Apr 15, 2015 at 10:23 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote:
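Jun's point 1 can be made concrete with a small sketch. This is purely illustrative, not the real Kafka broker code: a hypothetical helper that populates isr and lags for a partition only when the answering broker is the leader, and returns empty collections otherwise (since a non-leader's metadata cache may be stale per KAFKA-1367).

```java
import java.util.*;

// Hypothetical sketch of the proposed TMR behavior; all names are illustrative.
public class PartitionMetadataSketch {
    public static Map<String, Object> buildPartitionMetadata(
            int localBrokerId, int leaderId,
            List<Integer> isr, Map<Integer, Long> followerLagMessages) {
        Map<String, Object> meta = new LinkedHashMap<>();
        meta.put("leader", leaderId);
        if (localBrokerId == leaderId) {
            // The leader has the authoritative ISR; report lag (in messages)
            // only for replicas that are currently out of the ISR.
            Map<Integer, Long> lags = new LinkedHashMap<>();
            for (Map.Entry<Integer, Long> e : followerLagMessages.entrySet())
                if (!isr.contains(e.getKey()))
                    lags.put(e.getKey(), e.getValue());
            meta.put("isr", new ArrayList<>(isr));
            meta.put("lags", lags);
        } else {
            // Not the leader: return empty collections rather than stale data.
            meta.put("isr", new ArrayList<Integer>());
            meta.put("lags", new LinkedHashMap<Integer, Long>());
        }
        return meta;
    }
}
```

The point of the shape is that clients can always read the isr/lags fields, and an empty array simply means "ask the leader".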
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Guys, Thanks for the discussion! Summary: 1. Q: How can KAFKA-1367 (isr is inconsistent in brokers' metadata cache) affect the implementation? A: We can fix this issue for the leading broker - the ReplicaManager (or Partition) component should have an accurate isr list. Then, with the leading broker having correct info, to do a describe-topic we will need to determine the leading brokers for the partitions and ask those for a correct isr list. Also, we should consider adding lag information to TMR for each follower, for partition reassignment, as Jun suggested above. 2. Q: What if a user adds different alter commands for the same topic in the scope of one batch request? A: Because of the async nature of AlterTopicRequest, it will be very hard to assemble the expected result (in terms of checking whether the request is complete) if we let users do this. Also it will be very confusing. It was proposed not to let users do this (probably adding a new Error for such cases). 3. Q: AlterTopicRequest semantics: now that we merged AlterTopic and ReassignPartitions, in which order should AlterTopic fields be resolved? A: This item is not clear. There was a proposal to let the user change only one thing at a time, e.g. specify either new Replicas or ReplicaAssignment. This can be a simple solution, but it's a very strict rule. E.g. currently with TopicCommand a user can increase the number of partitions and define replica assignment for the newly added partitions. Taking into account item 2, this will be even harder for the user to achieve. 4. Q: Do we need such precise errors returned from the server: InvalidArgumentPartitions, InvalidArgumentReplicas etc.? A: I started implementing the proposed error codes and now I think InvalidArgumentError should probably be sufficient. We can do simple validations on the client side (e.g. AdminClient can ensure the number-of-partitions argument is positive); others - which can be covered only on the server (probably invalid topic config, replica assignment that includes a dead broker etc.) - will be done on the server, and in case of an invalid argument we will return InvalidArgumentError without specifying the concrete field. It'd be great if we could cover these remaining issues; they look minor, at least related to specific messages, not the overall protocol. - I think with that I can update the confluence page and update the patch to reflect all discussed items. This patch will probably include wire protocol messages and server-side code to handle the new requests. AdminClient and CLI-tool implementation can be the next step. Thanks, Andrii Biletskyi On Wed, Apr 15, 2015 at 7:26 PM, Jun Rao j...@confluent.io wrote: Andrii, 500. I think what you suggested also sounds reasonable. Since ISR is only maintained accurately at the leader, TMR can return the ISR if the broker is the leader of a partition. Otherwise, we can return an empty ISR. For partition reassignment, it would be useful to know the lag of each follower. Again, the leader knows this info. We can probably include that info in TMR as well. 300. I think it's probably reasonable to restrict AlterTopicRequest to change only one thing at a time, i.e., either partitions, replicas, replica assignment or config. Thanks, Jun On Mon, Apr 13, 2015 at 10:56 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote:
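The split proposed under item 4 can be sketched in a few lines. This is a hypothetical illustration, not the real AdminClient or broker code: cheap structural checks fail fast on the client with a descriptive exception, while anything requiring cluster state is validated on the server and collapsed into one generic InvalidArgument error code without naming the offending field.

```java
import java.util.*;

// Hypothetical sketch of item 4; class, method, and enum names are illustrative.
public class AlterTopicValidationSketch {
    public enum ErrorCode { NONE, INVALID_ARGUMENT }

    // Client side: no cluster state needed, so fail fast with a clear message.
    public static void clientSideCheck(int partitions) {
        if (partitions <= 0)
            throw new IllegalArgumentException("number of partitions must be positive");
    }

    // Server side: e.g. reject a replica assignment referencing a dead broker,
    // reporting only the generic error code, not the concrete field.
    public static ErrorCode serverSideCheck(Collection<Integer> assignedBrokers,
                                            Set<Integer> liveBrokers) {
        for (int broker : assignedBrokers)
            if (!liveBrokers.contains(broker))
                return ErrorCode.INVALID_ARGUMENT;
        return ErrorCode.NONE;
    }
}
```

The trade-off discussed in the thread is visible here: non-Java clients only ever see INVALID_ARGUMENT from the wire protocol, while Java users get precise messages from the client-side checks.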
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Jun, 404. Great, thanks! 500. If I understand correctly, KAFKA-1367 says the ISR part of TMR may be inconsistent. If so, then I believe all admin commands but describeTopic are unaffected. Let me emphasize that this is about AdminClient operations, not about wire protocol requests. What I mean: To verify AdminClient.createTopic we will need a (consistent) 'topics' set from TMR (we don't need isr). To verify alterTopic - again, probably 'topics' and 'assigned replicas' + configs. To verify deleteTopic - only 'topics'. To verify preferredReplica - 'leader', 'assigned replicas'. To verify reassignPartitions - 'assigned replicas'? (I'm not sure about this one.) If everything above is correct, then AdminClient.describeTopic is the only command at risk. We can actually work around it - find out the leader broker and send TMR to that leading broker to get an up-to-date isr list. Bottom line: it looks like 1367 is a separate issue, and is not a blocker for this KIP. I'm a bit concerned about adding new requests as a must-have part of this KIP when we don't know what we want to include in those requests. Also, I'd like to write down the new AlterTopicRequest semantics (if we decide to include replicas there and merge it with ReassignPartitionsRequest). 300. AlterTopicRequest = [TopicName Partitions Replicas ReplicaAssignment [AddedConfigEntry] [DeletedConfig]] The fields are resolved in this sequence: 1. Either partitions or replicas is defined: ---1.1. ReplicaAssignment is not defined - generate an automatic replica assignment for the newly added partitions or for the increased replicas parameter. ---1.2. ReplicaAssignment is defined - increase topic partitions if 'partitions' is defined, reassign partitions according to ReplicaAssignment. 2. Neither partitions nor replicas is defined: ---2.1. ReplicaAssignment is defined - it's a reassign-replicas request. ---2.2. ReplicaAssignment is not defined - just change topic configs. 3. Config fields are always handled, independently from partitions+replicas/replicaAssignment. A bit sophisticated, but it should cover all cases. Another option - we can say you can define either partitions+replicas or replicaAssignment. 300.5. There is also a new question related to AlterTopicRequest - should we allow users multiple alter-topic instructions for one topic in one batch? I think if we go this way, users will expect us to optimize and group requests for one topic, but it will add a lot of burden, especially taking into account the async semantics of AlterTopicRequest. I'd rather return some error code, or ignore all but the first. Thoughts? Thanks, Andrii Biletskyi On Mon, Apr 13, 2015 at 6:34 AM, Jun Rao j...@confluent.io wrote:
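The resolution order proposed under 300 is essentially a four-way decision tree, which can be written down as a pure function. This is a hypothetical sketch of the proposal, not the actual server-side handler; the enum names are made up for illustration.

```java
// Hypothetical sketch of the 300 field-resolution proposal. Per step 3 of the
// proposal, config entries would be applied independently of whichever action
// is chosen here.
public class AlterTopicResolutionSketch {
    public enum Action {
        AUTO_ASSIGN_NEW_REPLICAS,   // 1.1: partitions/replicas set, no manual assignment
        APPLY_MANUAL_ASSIGNMENT,    // 1.2: partitions/replicas set plus an assignment
        REASSIGN_REPLICAS,          // 2.1: only an assignment given
        CONFIG_ONLY                 // 2.2: nothing but config entries
    }

    // Nullable Integer stands in for "field absent on the wire".
    public static Action resolve(Integer partitions, Integer replicas,
                                 boolean replicaAssignmentDefined) {
        boolean sizingDefined = partitions != null || replicas != null;
        if (sizingDefined)
            return replicaAssignmentDefined ? Action.APPLY_MANUAL_ASSIGNMENT
                                            : Action.AUTO_ASSIGN_NEW_REPLICAS;
        return replicaAssignmentDefined ? Action.REASSIGN_REPLICAS
                                        : Action.CONFIG_ONLY;
    }
}
```

Writing it this way also makes the "one thing at a time" alternative easy to see: it would simply reject any request where more than one of the four branches could apply.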
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Andrii, 404. Jay and I chatted a bit. We agreed to leave createTopicRequest as async for now. There is another thing. 500. Currently, we have this issue (KAFKA-1367) that the ISR in the metadata cache can be out of sync. The reason is that the ISR is really maintained at the leader. We can potentially add a new BrokerMetaRequest, which will return useful stats specific to a broker. Such stats can include (1) for each partition whose leader is on this broker, the ISR and the lag (in messages) for each of the followers, (2) space used per partition, (3) remaining space per log dir (not sure how easy it is to get this info). If we have this new request, we can probably remove the ISR part from TMR v1. Currently, the producer/consumer clients don't really care about the ISR. The admin client will then issue BrokerMetaRequest to find out the ISR and other stats. Thanks, Jun On Tue, Apr 7, 2015 at 12:10 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote:
[DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Hi all, I wasn't able to send an email to our thread (it says we exceeded the message size limit :)). So I'm starting a new one. Jun, Thanks again for the review. Answering your comments: 201. I'm not sure I understand how we can evolve Cluster in a backward-compatible way. In my understanding topic configs are not returned currently - in TMR_V0. Thus we need to add a new property in Cluster - smth like private final Map<String, List<ConfigEntry>> topicConfigs; which affects the Cluster constructor, which is used in MetadataResponse.java - not sure whether we can change Cluster this way so it's backward compatible, I suppose not. Let me know if I'm missing something... 300. Hm, so you propose to give up ReassignPartitions as a separate command? That's interesting, let's discuss it today in detail. Two small points here: 1) afaik the replica-assignment argument in alter-topic (from TopicCommand) currently doesn't reassign partitions, it lets users specify replica assignment for newly added partitions (AddPartitionsListener) 2) The ReassignPartitions command involves a little bit more than just changing replica assignment in zk. People are struggling with partition reassignment, so I think it's good to have an explicit request for it so we can handle it independently; also, as mentioned earlier, we'll probably add in future some better status-check procedure for this long-running request. 301. Good point. We also agreed to use clientId as an identifier for the requester - whether it's a producer client or admin. I think we can go with the -1/-1 approach. 302. Again, as said above, replica-assignment in alter-topic doesn't change the replica assignment of existing partitions. But we can think about it in general - how can we change the topic replication factor? The easy way - we don't need it, we can use reassign partitions. Not sure whether we want to add special logic to treat this case... 303.1. Okay, sure, I'll generalize topicExists(). 303.2. I think, yes, we need separate verify methods as a status-check procedure, because the respective requests are long running, and a CLI user will potentially call reassign-partitions asynchronously, do other stuff (e.g. create topics), periodically checking the status of the partition reassignment. Anyway, we'll have to implement this logic because it's the completion criterion of the Future of the reassign-partitions async call; we'll just have to make those methods public. 303.3. If preferredReplica returns Future<Map<String, Errors>>, then what is an error in terms of preferred replica leader election? As I understand it, we can only check whether it has succeeded (leader == AR.head) or not _yet_. 304.1. Sure, let's add a timeout to reassign/preferred replica. 304.2. This can be finalized after we discuss 300. 305. Misprints - thanks, fixed. Thanks, Andrii Biletskyi
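One way the 201 concern could be addressed is constructor delegation: keep the old constructor signature and have it forward to a new one with the extra field, so existing call sites keep compiling. This is a hypothetical stand-in, not the real org.apache.kafka.common.Cluster class, and it sidesteps the question of whether that class can actually be changed this way.

```java
import java.util.*;

// Illustrative sketch only; ClusterSketch and ConfigEntry mirror names from
// the email but are not the real Kafka classes.
public class ClusterSketch {
    public static final class ConfigEntry {
        public final String name;
        public final String value;
        public ConfigEntry(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    private final List<String> topics;
    private final Map<String, List<ConfigEntry>> topicConfigs;

    // Old constructor, unchanged signature: configs default to empty.
    public ClusterSketch(List<String> topics) {
        this(topics, Collections.<String, List<ConfigEntry>>emptyMap());
    }

    // New constructor carrying the extra TMR_V1 data.
    public ClusterSketch(List<String> topics, Map<String, List<ConfigEntry>> topicConfigs) {
        this.topics = topics;
        this.topicConfigs = topicConfigs;
    }

    public List<ConfigEntry> configsFor(String topic) {
        List<ConfigEntry> configs = topicConfigs.get(topic);
        return configs == null ? Collections.<ConfigEntry>emptyList() : configs;
    }
}
```

Old callers built from TMR_V0 would see empty configs; only code parsing TMR_V1 would use the two-argument constructor.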
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)
Hi all, A summary of our discussion: 201. Q: Cluster updates in a backward-compatible way. A: Add a topicConfigs map property and change the constructor; this shouldn't break Consumer/Producer since TMR is used in NetworkClient, not directly by Consumer/Producer. 300. Q: Can we merge the AlterTopic and ReassignPartitions requests? A: It looks like, in terms of the wire protocol, partition reassignment can be just an application of AlterTopicRequest. On the AdminClient side we can split this into two separate methods, if needed. Some additional items that were added today: 400. Q: Do we need ListTopicsRequest? We can use TMR for this purpose. A: The answer depends on whether we can leverage ListTopics in the consumer/producer, because the only benefit of ListTopics is performance optimization; otherwise it isn't worth it. 401. Q: AdminClient.topicExists - do we need it? A: AdminClient.listTopics should be sufficient. 402. Review the AdminClient API and use separate objects instead of collections for method arguments / return results (e.g. preferredReplica accepts Map<String, List<Int>>; it might be better to add a separate Java object). 403. Error numbers in KIP-4 (100x). Currently there are no dedicated ranges for errors; we will probably continue doing it this way. 404. There were some concerns again about the asynchronous semantics of the admin requests. Jun and Jay to agree separately on how we want to handle it. Please add / correct me if I missed something. Thanks, Andrii Biletskyi On Tue, Apr 7, 2015 at 4:11 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote:
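Item 402 (separate objects instead of raw collections in the AdminClient API) can be illustrated with a small value object. This class and its methods are an assumption for illustration only; the thread only records the idea, not any concrete design.

```java
import java.util.*;

// Hypothetical sketch of item 402: instead of having preferredReplica accept
// a raw Map<String, List<Integer>>, wrap the argument in a small value object
// so the API reads better and can grow new fields without breaking callers.
public class TopicPartitions {
    private final Map<String, List<Integer>> partitionsByTopic = new LinkedHashMap<>();

    public TopicPartitions add(String topic, int partition) {
        List<Integer> partitions = partitionsByTopic.get(topic);
        if (partitions == null) {
            partitions = new ArrayList<>();
            partitionsByTopic.put(topic, partitions);
        }
        partitions.add(partition);
        return this; // builder-style chaining
    }

    public Map<String, List<Integer>> asMap() {
        return Collections.unmodifiableMap(partitionsByTopic);
    }
}
```

A call might then read `preferredReplica(new TopicPartitions().add("t1", 0).add("t1", 1))` instead of hand-building nested collections (the method name comes from the email; the wrapper class is invented here).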
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hm, actually the ticket you linked, Guozhang, brings us back to the problem of what should be considered a post-condition for each of the admin commands. In my understanding: 1) CreateTopic - the broker created /brokers/topics/topic (not that the controller picked up the changes from zk and broadcast LeaderAndIsr and UpdateMetadata) 2) AlterTopic - same as 1) - the broker changed assignment data in zookeeper or created the admin path for a topic config change 3) DeleteTopic - the admin path /admin/delete_topics is created 4) ReassignPartitions and PreferredReplica - the corresponding admin path is created. Now what can be considered a completed operation from the client's perspective? 1) The topic is created once the corresponding data is in zk (I remember there were some thoughts that it'd be good to consider a topic created once all replicas receive information about it and thus clients can produce/consume from it, but as was discussed this seems to be a hard thing to do) 2) Probably same as 1), so right after AlterTopic is issued 3) The topic has been removed from /brokers/topics 4) ReassignPartitions and PreferredReplica were discussed earlier - in short, the former is completed once the partition state info in zk matches the reassignment request and the admin path is empty; the latter - once data in zk shows that the head of the assigned replicas of the partition and the leader are the same replica. Thoughts? Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 7:10 PM, Guozhang Wang wangg...@gmail.com wrote: I think a while loop is fine for supporting blocking, just that we need to add backoff to avoid bombarding brokers with DescribeTopic requests. Also, I have linked KAFKA-1125 https://issues.apache.org/jira/browse/KAFKA-1125 to your proposal, and when KAFKA-1694 is done this ticket can also be closed. Guozhang On Fri, Mar 20, 2015 at 9:41 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Great. I want to elaborate on this a bit more, to see we are on the same page concerning the client code. So with all topic commands being async, a client (AdminClient in our case, or any other client people would like to implement) that wants to support a blocking operation (which seems to be a natural use-case, e.g. for topic creation) would have to: 1. issue CreateTopicRequest 2. if successful, in a while loop send DescribeTopicRequest and break the loop once all topics are returned in the response (or upon timeout). 3. if unsuccessful, throw an exception. Would that be okay? Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 6:21 PM, Jun Rao j...@confluent.io wrote:
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Andrii, I think you are right. It seems that only ReassignPartitions needs a separate verification request. Thanks, Jun On Thu, Mar 19, 2015 at 9:22 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, I like this idea too. Let's stick with that. I'll update the KIP accordingly. I was also thinking we can avoid adding dedicated status check requests for topic commands - we have everything in DescribeTopic for that! E.g.: the user issued CreateTopic - to check the status, the client sends DescribeTopic and checks whether something is returned for that topic. The same goes for alteration and deletion. Btw, PreferredReplica status can also be checked with DescribeTopicRequest (head of assigned replicas list == leader). For ReassignPartitions, as discussed, we'll need a separate Verify... request. Thanks, Andrii Biletskyi On Thu, Mar 19, 2015 at 6:03 PM, Guozhang Wang wangg...@gmail.com wrote: +1 on broker writing to ZK for async handling. I was thinking that in the end state the admin requests would be eventually sent to controller either through re-routing or clients discovering them, instead of letting controller listen on ZK admin path. But thinking about it a second time, I think it is actually simpler to let controller manage incoming queued-up admin requests through ZK. Guozhang On Wed, Mar 18, 2015 at 4:16 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 as well. I think it helps to keep the rerouting approach orthogonal to this KIP. On Wed, Mar 18, 2015 at 03:40:48PM -0700, Jay Kreps wrote: I'm +1 on Jun's suggestion as long as it can work for all the requests. On Wed, Mar 18, 2015 at 3:35 PM, Jun Rao j...@confluent.io wrote: Andrii, I think we agreed on the following. (a) Admin requests can be sent to and handled by any broker. (b) Admin requests are processed asynchronously, at least for now. That is, when the client gets a response, it just means that the request is initiated, but not necessarily completed.
Then, it's up to the client to issue another request to check the status for completion. To support (a), we were thinking of doing request forwarding to the controller (utilizing KAFKA-1912). I am making an alternative proposal. Basically, the broker can just write to ZooKeeper to inform the controller about the request. For example, to handle partitionReassignment, the broker will just write the requested partitions to /admin/reassign_partitions (like what AdminUtils currently does) and then send a response to the client. This shouldn't take long and the implementation will be simpler than forwarding the requests to the controller through RPC. Thanks, Jun On Wed, Mar 18, 2015 at 3:03 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I might be wrong but didn't we agree we will let any broker from the cluster handle *long-running* admin requests (at this time preferredReplica and reassignPartitions), via zk admin path. Thus CreateTopics etc should be sent only to the controller. Thanks, Andrii Biletskyi On Wed, Mar 18, 2015 at 11:55 PM, Jun Rao j...@confluent.io wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the format of TMR right now. When we start using the create topic request in the producer, we will need a new version of TMR that doesn't trigger auto topic creation. But that can be done later. As a first cut implementation, I think the broker can just write to ZK directly for createTopic/alterTopic/reassignPartitions/preferredLeaderElection requests, instead of forwarding them to the controller. This will simplify the implementation on the broker side. Thanks, Jun On Wed, Mar 18, 2015 at 11:58 AM, Joel Koshy jjkosh...@gmail.com wrote: (Thanks Andrii for the summary) For (1) yes we will circle back on that shortly after syncing up in person.
I think it is close to getting committed although development for KAFKA-1927 can probably begin without it. There is one more item we covered at the hangout. i.e., whether we want to add the coordinator to the topic metadata response or provide a clearer ClusterMetadataRequest. There are two reasons I think we should try and avoid adding the field: - It is irrelevant to topic metadata - If we finally do request
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Great. I want to elaborate on this a bit more, to see we are on the same page concerning the client code. So with all topic commands being async, a client (AdminClient in our case, or any other client people would like to implement) would have to do the following to support a blocking operation (which seems to be a natural use case, e.g. for topic creation): 1. issue CreateTopicRequest 2. if successful, in a while loop send DescribeTopicRequest and break the loop once all topics are returned in the response (or upon timeout) 3. if unsuccessful, throw an exception. Would it be okay? Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 6:21 PM, Jun Rao j...@confluent.io wrote: Andrii, I think you are right. It seems that only ReassignPartitions needs a separate verification request. Thanks, Jun On Thu, Mar 19, 2015 at 9:22 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, I like this idea too. Let's stick with that. I'll update the KIP accordingly. I was also thinking we can avoid adding dedicated status check requests for topic commands - we have everything in DescribeTopic for that! E.g.: the user issued CreateTopic - to check the status, the client sends DescribeTopic and checks whether something is returned for that topic. The same goes for alteration and deletion. Btw, PreferredReplica status can also be checked with DescribeTopicRequest (head of assigned replicas list == leader). For ReassignPartitions, as discussed, we'll need a separate Verify... request. Thanks, Andrii Biletskyi On Thu, Mar 19, 2015 at 6:03 PM, Guozhang Wang wangg...@gmail.com wrote: +1 on broker writing to ZK for async handling. I was thinking that in the end state the admin requests would be eventually sent to controller either through re-routing or clients discovering them, instead of letting controller listen on ZK admin path. But thinking about it a second time, I think it is actually simpler to let controller manage incoming queued-up admin requests through ZK.
Guozhang On Wed, Mar 18, 2015 at 4:16 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 as well. I think it helps to keep the rerouting approach orthogonal to this KIP. On Wed, Mar 18, 2015 at 03:40:48PM -0700, Jay Kreps wrote: I'm +1 on Jun's suggestion as long as it can work for all the requests. On Wed, Mar 18, 2015 at 3:35 PM, Jun Rao j...@confluent.io wrote: Andrii, I think we agreed on the following. (a) Admin requests can be sent to and handled by any broker. (b) Admin requests are processed asynchronously, at least for now. That is, when the client gets a response, it just means that the request is initiated, but not necessarily completed. Then, it's up to the client to issue another request to check the status for completion. To support (a), we were thinking of doing request forwarding to the controller (utilizing KAFKA-1912). I am making an alternative proposal. Basically, the broker can just write to ZooKeeper to inform the controller about the request. For example, to handle partitionReassignment, the broker will just write the requested partitions to /admin/reassign_partitions (like what AdminUtils currently does) and then send a response to the client. This shouldn't take long and the implementation will be simpler than forwarding the requests to the controller through RPC. Thanks, Jun On Wed, Mar 18, 2015 at 3:03 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I might be wrong but didn't we agree we will let any broker from the cluster handle *long-running* admin requests (at this time preferredReplica and reassignPartitions), via zk admin path. Thus CreateTopics etc should be sent only to the controller. Thanks, Andrii Biletskyi On Wed, Mar 18, 2015 at 11:55 PM, Jun Rao j...@confluent.io wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the format of TMR right now.
When we start using the create topic request in the producer, we will need a new version of TMR that doesn't trigger auto topic creation. But that can be done later. As a first cut implementation, I think the broker can just write to ZK directly for createTopic/alterTopic/reassignPartitions/preferredLeaderElection requests, instead of forwarding them to the controller. This will simplify the implementation on the
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
I think the while loop is fine for supporting blocking, just that we need to add backoff to avoid bombarding brokers with DescribeTopic requests. Also, I have linked KAFKA-1125 https://issues.apache.org/jira/browse/KAFKA-1125 to your proposal, and when KAFKA-1694 is done this ticket can also be closed. Guozhang On Fri, Mar 20, 2015 at 9:41 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Great. I want to elaborate on this a bit more, to see we are on the same page concerning the client code. So with all topic commands being async, a client (AdminClient in our case, or any other client people would like to implement) would have to do the following to support a blocking operation (which seems to be a natural use case, e.g. for topic creation): 1. issue CreateTopicRequest 2. if successful, in a while loop send DescribeTopicRequest and break the loop once all topics are returned in the response (or upon timeout) 3. if unsuccessful, throw an exception. Would it be okay? Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 6:21 PM, Jun Rao j...@confluent.io wrote: Andrii, I think you are right. It seems that only ReassignPartitions needs a separate verification request. Thanks, Jun On Thu, Mar 19, 2015 at 9:22 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, I like this idea too. Let's stick with that. I'll update the KIP accordingly. I was also thinking we can avoid adding dedicated status check requests for topic commands - we have everything in DescribeTopic for that! E.g.: the user issued CreateTopic - to check the status, the client sends DescribeTopic and checks whether something is returned for that topic. The same goes for alteration and deletion. Btw, PreferredReplica status can also be checked with DescribeTopicRequest (head of assigned replicas list == leader). For ReassignPartitions, as discussed, we'll need a separate Verify... request. Thanks, Andrii Biletskyi On Thu, Mar 19, 2015 at 6:03 PM, Guozhang Wang wangg...@gmail.com wrote: +1 on broker writing to ZK for async handling.
I was thinking that in the end state the admin requests would be eventually sent to controller either through re-routing or clients discovering them, instead of letting controller listen on ZK admin path. But thinking about it a second time, I think it is actually simpler to let controller manage incoming queued-up admin requests through ZK. Guozhang On Wed, Mar 18, 2015 at 4:16 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 as well. I think it helps to keep the rerouting approach orthogonal to this KIP. On Wed, Mar 18, 2015 at 03:40:48PM -0700, Jay Kreps wrote: I'm +1 on Jun's suggestion as long as it can work for all the requests. On Wed, Mar 18, 2015 at 3:35 PM, Jun Rao j...@confluent.io wrote: Andrii, I think we agreed on the following. (a) Admin requests can be sent to and handled by any broker. (b) Admin requests are processed asynchronously, at least for now. That is, when the client gets a response, it just means that the request is initiated, but not necessarily completed. Then, it's up to the client to issue another request to check the status for completion. To support (a), we were thinking of doing request forwarding to the controller (utilizing KAFKA-1912). I am making an alternative proposal. Basically, the broker can just write to ZooKeeper to inform the controller about the request. For example, to handle partitionReassignment, the broker will just write the requested partitions to /admin/reassign_partitions (like what AdminUtils currently does) and then send a response to the client. This shouldn't take long and the implementation will be simpler than forwarding the requests to the controller through RPC. Thanks, Jun On Wed, Mar 18, 2015 at 3:03 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I might be wrong but didn't we agree we will let any broker from the cluster handle *long-running* admin requests (at this time preferredReplica and reassignPartitions), via zk admin path. Thus CreateTopics etc should be sent only to the controller. 
Thanks, Andrii Biletskyi On Wed, Mar 18, 2015 at 11:55 PM, Jun Rao j...@confluent.io wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the
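Jun's alternative quoted in the message above is for the receiving broker to persist the operation under the ZooKeeper admin path (as AdminUtils does today) and answer the client immediately, leaving completion to the controller. A rough sketch of such a handler follows; a plain dict stands in for the ZooKeeper client, and the JSON layout mirrors the /admin/reassign_partitions format used by the existing tooling. The function and error names are invented for illustration.

```python
import json

ZK_REASSIGN_PATH = "/admin/reassign_partitions"

def handle_reassign_request(zk, partitions):
    """Async handling: persist the requested assignment for the controller
    to pick up, then respond to the client without waiting for completion."""
    if ZK_REASSIGN_PATH in zk:
        # A reassignment is already in flight; reject rather than overwrite.
        return {"error": "REASSIGNMENT_IN_PROGRESS"}
    payload = {"version": 1,
               "partitions": [{"topic": t, "partition": p, "replicas": r}
                              for (t, p, r) in partitions]}
    zk[ZK_REASSIGN_PATH] = json.dumps(payload)  # stand-in for zk.create(path, data)
    return {"error": None}  # request initiated, not necessarily completed

zk = {}
resp = handle_reassign_request(zk, [("my-topic", 0, [1, 2, 3])])
```

The response only acknowledges initiation, which is exactly the async semantics (b) agreed on in the quoted thread; verifying completion is left to a later request.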
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
For 1), 2) and 3), blocking would probably mean that the new metadata is propagated to every broker. To achieve that, the client can keep issuing the describe topic request to every broker until it sees the new metadata in the response. Thanks, Jun On Fri, Mar 20, 2015 at 12:16 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hm, actually the ticket you linked, Guozhang, brings us back to the problem of what should be considered a post-condition for each of the admin commands. In my understanding: 1) CreateTopic - the broker created /brokers/topics/topic (not that the controller picked up the changes from ZK and broadcast LeaderAndIsr and UpdateMetadata) 2) AlterTopic - same as 1) - the broker changed assignment data in ZooKeeper or created the admin path for a topic config change 3) DeleteTopic - the admin path /admin/delete_topics is created 4) ReassignPartitions and PreferredReplica - the corresponding admin path is created. Now what can be considered a completed operation from the client's perspective? 1) The topic is created once the corresponding data is in ZK (I remember there were some thoughts that it'd be good to consider a topic created once all replicas receive information about it and thus clients can produce/consume from it, but as was discussed this seems to be a hard thing to do) 2) Probably the same as 1), so right after AlterTopic is issued 3) The topic has been removed from /brokers/topics 4) ReassignPartitions and PreferredReplica were discussed earlier - in short, the former is completed once the partition state info in ZK matches the reassignment request and the admin path is empty; the latter, once the data in ZK shows that the head of the partition's assigned replica list and the leader are the same replica. Thoughts? Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 7:10 PM, Guozhang Wang wangg...@gmail.com wrote: I think the while loop is fine for supporting blocking, just that we need to add backoff to avoid bombarding brokers with DescribeTopic requests.
Also, I have linked KAFKA-1125 https://issues.apache.org/jira/browse/KAFKA-1125 to your proposal, and when KAFKA-1694 is done this ticket can also be closed. Guozhang On Fri, Mar 20, 2015 at 9:41 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Great. I want to elaborate on this a bit more, to see we are on the same page concerning the client code. So with all topic commands being async, a client (AdminClient in our case, or any other client people would like to implement) would have to do the following to support a blocking operation (which seems to be a natural use case, e.g. for topic creation): 1. issue CreateTopicRequest 2. if successful, in a while loop send DescribeTopicRequest and break the loop once all topics are returned in the response (or upon timeout) 3. if unsuccessful, throw an exception. Would it be okay? Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 6:21 PM, Jun Rao j...@confluent.io wrote: Andrii, I think you are right. It seems that only ReassignPartitions needs a separate verification request. Thanks, Jun On Thu, Mar 19, 2015 at 9:22 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, I like this idea too. Let's stick with that. I'll update the KIP accordingly. I was also thinking we can avoid adding dedicated status check requests for topic commands - we have everything in DescribeTopic for that! E.g.: the user issued CreateTopic - to check the status, the client sends DescribeTopic and checks whether something is returned for that topic. The same goes for alteration and deletion. Btw, PreferredReplica status can also be checked with DescribeTopicRequest (head of assigned replicas list == leader). For ReassignPartitions, as discussed, we'll need a separate Verify... request. Thanks, Andrii Biletskyi On Thu, Mar 19, 2015 at 6:03 PM, Guozhang Wang wangg...@gmail.com wrote: +1 on broker writing to ZK for async handling.
I was thinking that in the end state the admin requests would be eventually sent to controller either through re-routing or clients discovering them, instead of letting controller listen on ZK admin path. But thinking about it a second time, I think it is actually simpler to let controller manage incoming queued-up admin requests through ZK. Guozhang On Wed, Mar 18, 2015 at 4:16 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 as well. I think it helps to keep the rerouting approach orthogonal to this KIP. On Wed, Mar 18, 2015 at 03:40:48PM -0700, Jay Kreps wrote: I'm +1 on Jun's suggestion as long as it can work for all the requests. On Wed, Mar 18, 2015 at 3:35 PM, Jun Rao j...@confluent.io wrote:
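Jun's suggestion at the top of this message — that "completed" for create/alter/delete means the new metadata has reached every broker, so the client keeps issuing describe requests to every broker — can be sketched as below. `describe_on_broker` is a hypothetical per-broker describe call, and the cluster is simulated; none of this is the real protocol.

```python
import time

def wait_until_propagated(broker_ids, describe_on_broker, topic,
                          timeout_s=30.0, backoff_s=0.5):
    """Block until every broker reports the topic in its metadata,
    or give up after timeout_s and return False."""
    pending = set(broker_ids)
    deadline = time.time() + timeout_s
    while pending and time.time() < deadline:
        # Re-check only the brokers that have not seen the topic yet.
        pending = {b for b in pending if topic not in describe_on_broker(b)}
        if pending:
            time.sleep(backoff_s)
    return not pending

# Simulated cluster: broker 3 lags behind the others for a few polls.
_polls = {"n": 0}

def fake_describe(broker_id):
    _polls["n"] += 1
    if broker_id == 3 and _polls["n"] < 5:
        return set()
    return {"my-topic"}
```

This per-broker polling is precisely the "fat client" cost Andrii objects to in the next message: the client now has to know how metadata propagates from the controller to the brokers.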
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Jun, I see your point. But wouldn't that lead to fat client implementations? Suppose someone would like to implement a client for the Admin wire protocol. Not only will people have to code quite complicated logic, like sending a describe request to each broker (again, a state machine?), but it will also mean people must understand internal Kafka logic related to topic storage and how information is propagated from the controller to the brokers. I see this as a dilemma between having a concise wire protocol and a self-sufficient API to make client implementations simple. I don't have a win-win solution though. Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 11:19 PM, Jun Rao j...@confluent.io wrote: For 1), 2) and 3), blocking would probably mean that the new metadata is propagated to every broker. To achieve that, the client can keep issuing the describe topic request to every broker until it sees the new metadata in the response. Thanks, Jun On Fri, Mar 20, 2015 at 12:16 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hm, actually the ticket you linked, Guozhang, brings us back to the problem of what should be considered a post-condition for each of the admin commands. In my understanding: 1) CreateTopic - the broker created /brokers/topics/topic (not that the controller picked up the changes from ZK and broadcast LeaderAndIsr and UpdateMetadata) 2) AlterTopic - same as 1) - the broker changed assignment data in ZooKeeper or created the admin path for a topic config change 3) DeleteTopic - the admin path /admin/delete_topics is created 4) ReassignPartitions and PreferredReplica - the corresponding admin path is created. Now what can be considered a completed operation from the client's perspective?
1) The topic is created once the corresponding data is in ZK (I remember there were some thoughts that it'd be good to consider a topic created once all replicas receive information about it and thus clients can produce/consume from it, but as was discussed this seems to be a hard thing to do) 2) Probably the same as 1), so right after AlterTopic is issued 3) The topic has been removed from /brokers/topics 4) ReassignPartitions and PreferredReplica were discussed earlier - in short, the former is completed once the partition state info in ZK matches the reassignment request and the admin path is empty; the latter, once the data in ZK shows that the head of the partition's assigned replica list and the leader are the same replica. Thoughts? Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 7:10 PM, Guozhang Wang wangg...@gmail.com wrote: I think the while loop is fine for supporting blocking, just that we need to add backoff to avoid bombarding brokers with DescribeTopic requests. Also, I have linked KAFKA-1125 https://issues.apache.org/jira/browse/KAFKA-1125 to your proposal, and when KAFKA-1694 is done this ticket can also be closed. Guozhang On Fri, Mar 20, 2015 at 9:41 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Great. I want to elaborate on this a bit more, to see we are on the same page concerning the client code. So with all topic commands being async, a client (AdminClient in our case, or any other client people would like to implement) would have to do the following to support a blocking operation (which seems to be a natural use case, e.g. for topic creation): 1. issue CreateTopicRequest 2. if successful, in a while loop send DescribeTopicRequest and break the loop once all topics are returned in the response (or upon timeout) 3. if unsuccessful, throw an exception. Would it be okay? Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 6:21 PM, Jun Rao j...@confluent.io wrote: Andrii, I think you are right. It seems that only ReassignPartitions needs a separate verification request.
Thanks, Jun On Thu, Mar 19, 2015 at 9:22 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, I like this idea too. Let's stick with that. I'll update the KIP accordingly. I was also thinking we can avoid adding dedicated status check requests for topic commands - we have everything in DescribeTopic for that! E.g.: the user issued CreateTopic - to check the status, the client sends DescribeTopic and checks whether something is returned for that topic. The same goes for alteration and deletion. Btw, PreferredReplica status can also be checked with DescribeTopicRequest (head of assigned replicas list == leader). For ReassignPartitions, as discussed, we'll need a separate Verify... request. Thanks, Andrii Biletskyi On Thu, Mar 19, 2015 at 6:03 PM, Guozhang Wang wangg...@gmail.com wrote: +1 on broker writing to ZK for async handling. I was thinking that in the end state the
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Andrii, A few points. 1. Create/Alter can typically complete quickly. So, it's possible to make the request block until it's completed. However, currently, doing this at the broker is a bit involved. To make Create block, we will need to add some callbacks in KafkaController. This is possible. However, the controller logic currently is pretty complicated. It would probably be better if we clean it up first before adding more complexity to it. Alter is even trickier. Adding a partition is currently handled through KafkaController, so it can be dealt with in a similar way. However, altering a config is done completely differently. It doesn't go through the controller. Instead, each broker listens to ZooKeeper directly. So, it's not clear if there is an easy way on the broker to figure out whether a config is applied on every broker. 2. Delete can potentially take long if a replica to be deleted is offline. PreferredLeader/PartitionReassign can also take long. So, we can't really make those requests block on the broker. As you can see, at this moment it's not easy to make all admin requests block on the broker. So, if we want the blocking feature in the admin utility in the short term, doing the completion check at the admin client is probably an easier route, even though it may not be ideal. Thanks, Jun On Fri, Mar 20, 2015 at 2:38 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I see your point. But wouldn't that lead to fat client implementations? Suppose someone would like to implement a client for the Admin wire protocol. Not only will people have to code quite complicated logic, like sending a describe request to each broker (again, a state machine?), but it will also mean people must understand internal Kafka logic related to topic storage and how information is propagated from the controller to the brokers. I see this as a dilemma between having a concise wire protocol and a self-sufficient API to make client implementations simple. I don't have a win-win solution though.
Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 11:19 PM, Jun Rao j...@confluent.io wrote: For 1), 2) and 3), blocking would probably mean that the new metadata is propagated to every broker. To achieve that, the client can keep issuing the describe topic request to every broker until it sees the new metadata in the response. Thanks, Jun On Fri, Mar 20, 2015 at 12:16 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hm, actually the ticket you linked, Guozhang, brings us back to the problem of what should be considered a post-condition for each of the admin commands. In my understanding: 1) CreateTopic - the broker created /brokers/topics/topic (not that the controller picked up the changes from ZK and broadcast LeaderAndIsr and UpdateMetadata) 2) AlterTopic - same as 1) - the broker changed assignment data in ZooKeeper or created the admin path for a topic config change 3) DeleteTopic - the admin path /admin/delete_topics is created 4) ReassignPartitions and PreferredReplica - the corresponding admin path is created. Now what can be considered a completed operation from the client's perspective? 1) The topic is created once the corresponding data is in ZK (I remember there were some thoughts that it'd be good to consider a topic created once all replicas receive information about it and thus clients can produce/consume from it, but as was discussed this seems to be a hard thing to do) 2) Probably the same as 1), so right after AlterTopic is issued 3) The topic has been removed from /brokers/topics 4) ReassignPartitions and PreferredReplica were discussed earlier - in short, the former is completed once the partition state info in ZK matches the reassignment request and the admin path is empty; the latter, once the data in ZK shows that the head of the partition's assigned replica list and the leader are the same replica. Thoughts?
Thanks, Andrii Biletskyi On Fri, Mar 20, 2015 at 7:10 PM, Guozhang Wang wangg...@gmail.com wrote: I think the while loop is fine for supporting blocking, just that we need to add backoff to avoid bombarding brokers with DescribeTopic requests. Also, I have linked KAFKA-1125 https://issues.apache.org/jira/browse/KAFKA-1125 to your proposal, and when KAFKA-1694 is done this ticket can also be closed. Guozhang On Fri, Mar 20, 2015 at 9:41 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Great. I want to elaborate on this a bit more, to see we are on the same page concerning the client code. So with all topic commands being async, a client (AdminClient in our case, or any other client people would like to implement) would have to do the following to support a blocking operation (which seems to be a natural use case, e.g. for topic creation): 1. issue CreateTopicRequest 2. if successful, in a while loop send
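Tying together the per-command post-conditions enumerated earlier in the thread with the idea of reusing DescribeTopic as the status check, a client-side completion test could dispatch per command roughly as follows. The `describe` callable and the metadata dict shape are hypothetical stand-ins for illustration, not the real protocol.

```python
def is_completed(command, topic, describe):
    """Client-side completion checks built on DescribeTopic, following the
    per-command post-conditions discussed in this thread."""
    meta = describe(topic)  # assumed convention: None if the topic does not exist
    if command in ("create", "alter"):
        return meta is not None          # topic data visible in metadata
    if command == "delete":
        return meta is None              # topic gone from metadata
    if command == "preferred_replica":
        # Done once the head of the assigned replica list is the leader.
        return meta is not None and meta["replicas"][0] == meta["leader"]
    raise ValueError("ReassignPartitions needs a dedicated Verify request")

meta = {"leader": 2, "replicas": [2, 1, 3]}
```

Note how ReassignPartitions is the one command that does not fit: as Jun and Andrii agree above, it needs its own verification request, since DescribeTopic alone cannot show whether the assignment in ZK matches the requested one.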
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
+1 on broker writing to ZK for async handling. I was thinking that in the end state the admin requests would be eventually sent to controller either through re-routing or clients discovering them, instead of letting controller listen on ZK admin path. But thinking about it a second time, I think it is actually simpler to let controller manage incoming queued-up admin requests through ZK. Guozhang On Wed, Mar 18, 2015 at 4:16 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 as well. I think it helps to keep the rerouting approach orthogonal to this KIP. On Wed, Mar 18, 2015 at 03:40:48PM -0700, Jay Kreps wrote: I'm +1 on Jun's suggestion as long as it can work for all the requests. On Wed, Mar 18, 2015 at 3:35 PM, Jun Rao j...@confluent.io wrote: Andrii, I think we agreed on the following. (a) Admin requests can be sent to and handled by any broker. (b) Admin requests are processed asynchronously, at least for now. That is, when the client gets a response, it just means that the request is initiated, but not necessarily completed. Then, it's up to the client to issue another request to check the status for completion. To support (a), we were thinking of doing request forwarding to the controller (utilizing KAFKA-1912). I am making an alternative proposal. Basically, the broker can just write to ZooKeeper to inform the controller about the request. For example, to handle partitionReassignment, the broker will just write the requested partitions to /admin/reassign_partitions (like what AdminUtils currently does) and then send a response to the client. This shouldn't take long and the implementation will be simpler than forwarding the requests to the controller through RPC. Thanks, Jun On Wed, Mar 18, 2015 at 3:03 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I might be wrong but didn't we agree we will let any broker from the cluster handle *long-running* admin requests (at this time preferredReplica and reassignPartitions), via zk admin path. 
Thus CreateTopics etc should be sent only to the controller. Thanks, Andrii Biletskyi On Wed, Mar 18, 2015 at 11:55 PM, Jun Rao j...@confluent.io wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the format of TMR right now. When we start using the create topic request in the producer, we will need a new version of TMR that doesn't trigger auto topic creation. But that can be done later. As a first cut implementation, I think the broker can just write to ZK directly for createTopic/alterTopic/reassignPartitions/preferredLeaderElection requests, instead of forwarding them to the controller. This will simplify the implementation on the broker side. Thanks, Jun On Wed, Mar 18, 2015 at 11:58 AM, Joel Koshy jjkosh...@gmail.com wrote: (Thanks Andrii for the summary) For (1) yes we will circle back on that shortly after syncing up in person. I think it is close to getting committed although development for KAFKA-1927 can probably begin without it. There is one more item we covered at the hangout, i.e., whether we want to add the coordinator to the topic metadata response or provide a clearer ClusterMetadataRequest. There are two reasons I think we should try and avoid adding the field: - It is irrelevant to topic metadata - If we finally do request rerouting in Kafka then the field would add little to no value. (It still helps to have a separate ClusterMetadataRequest to query for cluster-wide information such as 'which broker is the controller?' as Joe mentioned.) I think it would be cleaner to have an explicit ClusterMetadataRequest that you can send to any broker in order to obtain the controller (and in the future possibly other cluster-wide information).
I think the main argument against doing this, and instead adding it to the topic metadata response, was convenience - i.e., you don't have to discover the controller in advance. However, I don't see much actual benefit/convenience in this and in fact think it is a non-issue. Let me know if I'm overlooking something here. As an example, say we need to initiate partition reassignment by issuing the new ReassignPartitionsRequest to the controller (assume we already have the desired manual partition assignment). If we are to augment the topic metadata response, then the flow would be something like this: - Issue topic metadata
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Andrii, Thanks for the summary. 1. I just realized that in order to start working on KAFKA-1927, we will need to merge the changes to OffsetCommitRequest (from 0.8.2) to trunk. This is planned to be done as part of KAFKA-1634. So, we will need Guozhang and Joel's help to wrap this up. 2. Thinking about this a bit more, if the semantics of those write requests are async (i.e., after the client gets a response, it just means that the operation is initiated, but not necessarily completed), we don't really need to forward the requests to the controller. Instead, the receiving broker can just write the operation to ZK as the admin command line tool previously does. This will simplify the implementation. 8. There is another implementation detail for describe topic. Ideally, we want to read the topic config from the broker cache, instead of ZooKeeper. Currently, every broker reads the topic-level config for all topics. However, it ignores those for topics not hosted on itself. So, we may need to change TopicConfigManager a bit so that it caches the configs for all topics. Thanks, Jun On Tue, Mar 17, 2015 at 1:13 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, Thanks for a great discussion! Here are the action points: 1. Q: Get rid of all scala request objects, use java protocol definitions. A: Gwen kindly took that (KAFKA-1927). It's important to speed up the review procedure there since this ticket blocks other important changes. 2. Q: Generic re-route facility vs client maintaining cluster state. A: Jay has added pseudo code to KAFKA-1912 - need to consider whether this will be easy to implement as a server-side feature (comments are welcome!). 3. Q: Controller field in wire protocol. A: This might be useful for clients, add this to TopicMetadataResponse (already in KIP). 4. Q: Decoupling topic creation from TMR. A: I will add the solution proposed by Jun (using clientId for that) to the KIP. 5. 
Q: Bumping new versions of TMR vs grabbing all protocol changes in one version. A: It was decided to try to gather all changes to the protocol (before release). In the case of TMR it's worth checking: KAFKA-2020 and KIP-13 (quotas) 6. Q: JSON lib is needed to deserialize user's input in CLI tool. A: Use jackson for that, /tools project is a separate jar so shouldn't be a big deal. 7. Q: VerifyReassignPartitions vs generic status check command. A: For long-running requests like reassign partitions a *progress* check request is useful, it makes sense to introduce it. Please add, correct me if I missed something. Thanks, Andrii Biletskyi On Tue, Mar 17, 2015 at 6:20 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Joel, You are right, I removed ClusterMetadata because we have partially what we need in TopicMetadata. Also, as Jay pointed out earlier, we would like to have an orthogonal API, but at the same time we need to be backward compatible. But I like your idea and even have some other arguments for this option: There is also DescribeTopicRequest which was proposed in this KIP, it returns topic configs, partitions, replication factor plus partition ISR, ASR, leader replica. The latter part is really already there in TopicMetadataRequest. So again we'll have to add stuff to TMR, not to duplicate some info in newly added requests. However, this way we'll end up with a monster request which returns cluster metadata, topic replication and config info plus partition replication data. Seems logical to split TMR to - ClusterMetadata (brokers + controller, maybe something else) - TopicMetadata (topic info + partition details) But since current TMR is involved in lots of places (including network client, as I understand) this might be a very serious change and it probably makes sense to stick with the current approach. 
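On point 6 above, the Java tool would use Jackson; as a language-neutral illustration, here is a minimal Python sketch of parsing and validating the reassignment JSON format that the existing kafka-reassign-partitions tooling accepts (the `parse_reassignment` helper name is hypothetical):

```python
import json

def parse_reassignment(raw):
    """Parse a partition-reassignment document of the form
    {"version": 1, "partitions": [{"topic": ..., "partition": ..., "replicas": [...]}]}
    into a {(topic, partition): [replica ids]} map, rejecting malformed input."""
    doc = json.loads(raw)
    if doc.get("version") != 1:
        raise ValueError("unsupported reassignment version: %r" % doc.get("version"))
    result = {}
    for entry in doc["partitions"]:
        topic, partition = entry["topic"], entry["partition"]
        replicas = entry["replicas"]
        if not replicas:
            raise ValueError("empty replica list for %s-%d" % (topic, partition))
        result[(topic, partition)] = replicas
    return result

raw = '{"version": 1, "partitions": [{"topic": "foo", "partition": 0, "replicas": [1, 2, 3]}]}'
assert parse_reassignment(raw) == {("foo", 0): [1, 2, 3]}
```

Validating up front like this lets the CLI reject a bad document before any request reaches a broker.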
Thanks, Andrii Biletskyi On Tue, Mar 17, 2015 at 5:29 PM, Joel Koshy jjkosh...@gmail.com wrote: I may be missing some context but hopefully this will also be covered today: I thought the earlier proposal where there was an explicit ClusterMetadata request was clearer and more explicit. During the course of this thread I think the conclusion was that the main need was for controller information and that can be rolled into the topic metadata response but that seems a bit irrelevant to topic metadata. FWIW I think the full broker-list is also irrelevant to topic metadata, but it is already there and in use. I think there is still room for an explicit ClusterMetadata request since there may be other cluster-level information that we may want to add over time (and that has nothing to do with topic metadata). On Tue, Mar 17, 2015 at 02:45:30PM +0200, Andrii Biletskyi wrote: Jun, 101. Okay, if you say that such a use case is important. I also think using clientId for these purposes is fine - if we already have this
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
On Wed, Mar 18, 2015 at 9:34 AM, Jun Rao j...@confluent.io wrote: Andrii, Thanks for the summary. 1. I just realized that in order to start working on KAFKA-1927, we will need to merge the changes to OffsetCommitRequest (from 0.8.2) to trunk. This is planned to be done as part of KAFKA-1634. So, we will need Guozhang and Joel's help to wrap this up. I mentioned this in a separate thread, but it may be more relevant here: It looks like the SimpleConsumer API exposes TopicMetadataRequest and TopicMetadataResponse. This means that KAFKA-1927 doesn't remove this duplication. So I'm not sure we actually need KAFKA-1927 before implementing this KIP. This doesn't mean I'm stopping work on KAFKA-1927, but perhaps it means we can proceed in parallel? 2. Thinking about this a bit more, if the semantics of those write requests are async (i.e., after the client gets a response, it just means that the operation is initiated, but not necessarily completed), we don't really need to forward the requests to the controller. Instead, the receiving broker can just write the operation to ZK as the admin command line tool previously does. This will simplify the implementation. 8. There is another implementation detail for describe topic. Ideally, we want to read the topic config from the broker cache, instead of ZooKeeper. Currently, every broker reads the topic-level config for all topics. However, it ignores those for topics not hosted on itself. So, we may need to change TopicConfigManager a bit so that it caches the configs for all topics. Thanks, Jun On Tue, Mar 17, 2015 at 1:13 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, Thanks for a great discussion! Here are the action points: 1. Q: Get rid of all scala request objects, use java protocol definitions. A: Gwen kindly took that (KAFKA-1927). It's important to speed up the review procedure there since this ticket blocks other important changes. 2. Q: Generic re-route facility vs client maintaining cluster state. 
A: Jay has added pseudo code to KAFKA-1912 - need to consider whether this will be easy to implement as a server-side feature (comments are welcome!). 3. Q: Controller field in wire protocol. A: This might be useful for clients, add this to TopicMetadataResponse (already in KIP). 4. Q: Decoupling topic creation from TMR. A: I will add the solution proposed by Jun (using clientId for that) to the KIP. 5. Q: Bumping new versions of TMR vs grabbing all protocol changes in one version. A: It was decided to try to gather all changes to the protocol (before release). In the case of TMR it's worth checking: KAFKA-2020 and KIP-13 (quotas) 6. Q: JSON lib is needed to deserialize user's input in CLI tool. A: Use jackson for that, /tools project is a separate jar so shouldn't be a big deal. 7. Q: VerifyReassignPartitions vs generic status check command. A: For long-running requests like reassign partitions a *progress* check request is useful, it makes sense to introduce it. Please add, correct me if I missed something. Thanks, Andrii Biletskyi On Tue, Mar 17, 2015 at 6:20 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Joel, You are right, I removed ClusterMetadata because we have partially what we need in TopicMetadata. Also, as Jay pointed out earlier, we would like to have an orthogonal API, but at the same time we need to be backward compatible. But I like your idea and even have some other arguments for this option: There is also DescribeTopicRequest which was proposed in this KIP, it returns topic configs, partitions, replication factor plus partition ISR, ASR, leader replica. The latter part is really already there in TopicMetadataRequest. So again we'll have to add stuff to TMR, not to duplicate some info in newly added requests. However, this way we'll end up with a monster request which returns cluster metadata, topic replication and config info plus partition replication data. 
Seems logical to split TMR to - ClusterMetadata (brokers + controller, maybe something else) - TopicMetadata (topic info + partition details) But since current TMR is involved in lots of places (including network client, as I understand) this might be a very serious change and it probably makes sense to stick with the current approach. Thanks, Andrii Biletskyi On Tue, Mar 17, 2015 at 5:29 PM, Joel Koshy jjkosh...@gmail.com wrote: I may be missing some context but hopefully this will also be covered today: I thought the earlier proposal where there was an explicit ClusterMetadata request was clearer and more explicit. During the course of this thread I think the conclusion was that the main need was for controller information and that can be rolled into the topic metadata response but that seems a bit irrelevant to topic metadata. FWIW I think the full broker-list is also irrelevant to
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Joel, I'm totally behind your arguments concerning adding irrelevant stuff to TopicMetadataRequest. And also about having a bloated request. Personally I'd go with a separate ClusterMetadataRequest (CMR), actually this was our initial proposal. But since the second part of the request - brokers - is already present in TopicMetadataResponse (TMR) I agreed to augment TMR instead of introducing a separate request. The only thing which should be considered though is the kafka producer / consumer. If we split TMR into topic metadata and cluster metadata (brokers + controller) we need to think whether it's okay if clients would have to issue two separate requests to maintain Metadata.java (in terms of a potential concurrency issue). Can someone please clarify this question? Thanks, Andrii Biletskyi On Wed, Mar 18, 2015 at 8:58 PM, Joel Koshy jjkosh...@gmail.com wrote: (Thanks Andrii for the summary) For (1) yes we will circle back on that shortly after syncing up in person. I think it is close to getting committed although development for KAFKA-1927 can probably begin without it. There is one more item we covered at the hangout, i.e., whether we want to add the coordinator to the topic metadata response or provide a clearer ClusterMetadataRequest. There are two reasons I think we should try and avoid adding the field: - It is irrelevant to topic metadata - If we finally do request rerouting in Kafka then the field would add little to no value. (It still helps to have a separate ClusterMetadataRequest to query for cluster-wide information such as 'which broker is the controller?' as Joe mentioned.) I think it would be cleaner to have an explicit ClusterMetadataRequest that you can send to any broker in order to obtain the controller (and in the future possibly other cluster-wide information). I think the main argument against doing this and instead adding it to the topic metadata response was convenience - i.e., you don't have to discover the controller in advance. 
However, I don't see much actual benefit/convenience in this and in fact think it is a non-issue. Let me know if I'm overlooking something here. As an example, say we need to initiate partition reassignment by issuing the new ReassignPartitionsRequest to the controller (assume we already have the desired manual partition assignment). If we are to augment the topic metadata response then the flow would be something like this: - Issue topic metadata request to any broker (and discover the controller) - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request to the controller. With an explicit cluster metadata request it would be: - Issue cluster metadata request to any broker - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request So it seems to add little practical value and bloats the topic metadata response with an irrelevant detail. The other angle to this is the following - is it a matter of naming? Should we just rename topic metadata request/response to just MetadataRequest/Response and add cluster metadata to it? By that same token should we also allow querying for the consumer coordinator (and in future transaction coordinator) as well? This leads to a bloated request which isn't very appealing and altogether confusing. Thanks, Joel On Wed, Mar 18, 2015 at 09:34:12AM -0700, Jun Rao wrote: Andrii, Thanks for the summary. 1. I just realized that in order to start working on KAFKA-1927, we will need to merge the changes to OffsetCommitRequest (from 0.8.2) to trunk. This is planned to be done as part of KAFKA-1634. So, we will need Guozhang and Joel's help to wrap this up. 2. Thinking about this a bit more, if the semantics of those write requests are async (i.e., after the client gets a response, it just means that the operation is initiated, but not necessarily completed), we don't really need to forward the requests to the controller. 
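Joel's point that the two flows have identical step counts can be made concrete. The sketch below uses entirely hypothetical names (no such Python client API exists): either way the client makes one metadata round trip to any broker, connects to the controller if needed, and sends the reassignment request.

```python
class Response:
    def __init__(self, controller_id):
        self.controller_id = controller_id

class Broker:
    def __init__(self, broker_id, cluster):
        self.id, self.cluster = broker_id, cluster

    # Flow 1: controller id piggybacked on an (augmented) topic metadata response.
    def topic_metadata_request(self):
        return Response(self.cluster.controller_id)

    # Flow 2: a dedicated ClusterMetadataRequest carrying cluster-wide info only.
    def cluster_metadata_request(self):
        return Response(self.cluster.controller_id)

    def reassign_partitions_request(self, assignment):
        assert self.id == self.cluster.controller_id, "must be sent to the controller"
        return "initiated"

class Cluster:
    def __init__(self, broker_ids, controller_id):
        self.controller_id = controller_id
        self.brokers = {b: Broker(b, self) for b in broker_ids}

def reassign(any_broker, cluster, assignment, use_cluster_metadata):
    # Both flows: one metadata round trip, then talk to the controller.
    resp = (any_broker.cluster_metadata_request() if use_cluster_metadata
            else any_broker.topic_metadata_request())
    controller = cluster.brokers[resp.controller_id]  # "connect if required"
    return controller.reassign_partitions_request(assignment)
```

Since `reassign` is identical apart from which metadata request it issues, putting the controller id in the topic metadata response buys no extra convenience; it only bloats that response.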
Instead, the receiving broker can just write the operation to ZK as the admin command line tool previously does. This will simplify the implementation. 8. There is another implementation detail for describe topic. Ideally, we want to read the topic config from the broker cache, instead of ZooKeeper. Currently, every broker reads the topic-level config for all topics. However, it ignores those for topics not hosted on itself. So, we may need to change TopicConfigManager a bit so that it caches the configs for all topics. Thanks, Jun On Tue, Mar 17, 2015 at 1:13 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, Thanks for a great discussion! Here are the actions points: 1. Q: Get rid of all scala requests objects, use java protocol definitions. A: Gwen kindly took that (KAFKA-1927). It's important to speed up review procedure
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
I'm +1 on Jun's suggestion as long as it can work for all the requests. On Wed, Mar 18, 2015 at 3:35 PM, Jun Rao j...@confluent.io wrote: Andrii, I think we agreed on the following. (a) Admin requests can be sent to and handled by any broker. (b) Admin requests are processed asynchronously, at least for now. That is, when the client gets a response, it just means that the request is initiated, but not necessarily completed. Then, it's up to the client to issue another request to check the status for completion. To support (a), we were thinking of doing request forwarding to the controller (utilizing KAFKA-1912). I am making an alternative proposal. Basically, the broker can just write to ZooKeeper to inform the controller about the request. For example, to handle partitionReassignment, the broker will just write the requested partitions to /admin/reassign_partitions (like what AdminUtils currently does) and then send a response to the client. This shouldn't take long and the implementation will be simpler than forwarding the requests to the controller through RPC. Thanks, Jun On Wed, Mar 18, 2015 at 3:03 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I might be wrong but didn't we agree we will let any broker from the cluster handle *long-running* admin requests (at this time preferredReplica and reassignPartitions), via zk admin path. Thus CreateTopics etc. should be sent only to the controller. Thanks, Andrii Biletskyi On Wed, Mar 18, 2015 at 11:55 PM, Jun Rao j...@confluent.io wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the format of TMR right now. When we start using the create topic request in the producer, we will need a new version of TMR that doesn't trigger auto topic creation. But that can be done later. 
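Jun's alternative proposal above can be sketched as follows. This is a hypothetical illustration only (a real broker would use its ZooKeeper client, not this in-memory stand-in): the receiving broker writes the admin path the controller already watches and answers immediately, so the response means "initiated", not "completed".

```python
import json

ADMIN_REASSIGN_PATH = "/admin/reassign_partitions"

class FakeZooKeeper:
    """In-memory stand-in for a ZK client, for illustration only."""
    def __init__(self):
        self.nodes = {}

    def create(self, path, data):
        # ZK znode creation fails if the node already exists; for this path
        # that means a reassignment is already in flight.
        if path in self.nodes:
            raise KeyError("node exists: " + path)
        self.nodes[path] = data

def handle_reassign_partitions(zk, requested):
    """requested: {(topic, partition): [replica ids]}. Any broker can serve
    this: it writes the admin path and replies without waiting for the
    controller to finish the reassignment."""
    payload = {"version": 1,
               "partitions": [{"topic": t, "partition": p, "replicas": r}
                              for (t, p), r in sorted(requested.items())]}
    try:
        zk.create(ADMIN_REASSIGN_PATH, json.dumps(payload))
    except KeyError:
        return {"error": "REASSIGNMENT_IN_PROGRESS"}
    return {"error": None}  # initiated; the client polls separately for completion
```

The error name here is invented for the sketch; the point is only that the broker-side handler is a single ZK write plus an immediate response, which is why it is simpler than RPC forwarding to the controller.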
As a first-cut implementation, I think the broker can just write to ZK directly for createTopic/alterTopic/reassignPartitions/preferredLeaderElection requests, instead of forwarding them to the controller. This will simplify the implementation on the broker side. Thanks, Jun On Wed, Mar 18, 2015 at 11:58 AM, Joel Koshy jjkosh...@gmail.com wrote: (Thanks Andrii for the summary) For (1) yes we will circle back on that shortly after syncing up in person. I think it is close to getting committed although development for KAFKA-1927 can probably begin without it. There is one more item we covered at the hangout, i.e., whether we want to add the coordinator to the topic metadata response or provide a clearer ClusterMetadataRequest. There are two reasons I think we should try and avoid adding the field: - It is irrelevant to topic metadata - If we finally do request rerouting in Kafka then the field would add little to no value. (It still helps to have a separate ClusterMetadataRequest to query for cluster-wide information such as 'which broker is the controller?' as Joe mentioned.) I think it would be cleaner to have an explicit ClusterMetadataRequest that you can send to any broker in order to obtain the controller (and in the future possibly other cluster-wide information). I think the main argument against doing this and instead adding it to the topic metadata response was convenience - i.e., you don't have to discover the controller in advance. However, I don't see much actual benefit/convenience in this and in fact think it is a non-issue. Let me know if I'm overlooking something here. As an example, say we need to initiate partition reassignment by issuing the new ReassignPartitionsRequest to the controller (assume we already have the desired manual partition assignment). 
If we are to augment the topic metadata response then the flow would be something like this: - Issue topic metadata request to any broker (and discover the controller) - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request to the controller. With an explicit cluster metadata request it would be: - Issue cluster metadata request to any broker - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request So it seems to add little practical value and bloats the topic metadata response with an irrelevant detail. The other angle to this is the following - is it a matter of naming? Should we just rename topic metadata request/response to just MetadataRequest/Response and add cluster metadata to it? By that same token should we also allow querying for the consumer coordinator (and
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Andrii, I think we agreed on the following. (a) Admin requests can be sent to and handled by any broker. (b) Admin requests are processed asynchronously, at least for now. That is, when the client gets a response, it just means that the request is initiated, but not necessarily completed. Then, it's up to the client to issue another request to check the status for completion. To support (a), we were thinking of doing request forwarding to the controller (utilizing KAFKA-1912). I am making an alternative proposal. Basically, the broker can just write to ZooKeeper to inform the controller about the request. For example, to handle partitionReassignment, the broker will just write the requested partitions to /admin/reassign_partitions (like what AdminUtils currently does) and then send a response to the client. This shouldn't take long and the implementation will be simpler than forwarding the requests to the controller through RPC. Thanks, Jun On Wed, Mar 18, 2015 at 3:03 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I might be wrong but didn't we agree we will let any broker from the cluster handle *long-running* admin requests (at this time preferredReplica and reassignPartitions), via zk admin path. Thus CreateTopics etc. should be sent only to the controller. Thanks, Andrii Biletskyi On Wed, Mar 18, 2015 at 11:55 PM, Jun Rao j...@confluent.io wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the format of TMR right now. When we start using the create topic request in the producer, we will need a new version of TMR that doesn't trigger auto topic creation. But that can be done later. 
As a first-cut implementation, I think the broker can just write to ZK directly for createTopic/alterTopic/reassignPartitions/preferredLeaderElection requests, instead of forwarding them to the controller. This will simplify the implementation on the broker side. Thanks, Jun On Wed, Mar 18, 2015 at 11:58 AM, Joel Koshy jjkosh...@gmail.com wrote: (Thanks Andrii for the summary) For (1) yes we will circle back on that shortly after syncing up in person. I think it is close to getting committed although development for KAFKA-1927 can probably begin without it. There is one more item we covered at the hangout, i.e., whether we want to add the coordinator to the topic metadata response or provide a clearer ClusterMetadataRequest. There are two reasons I think we should try and avoid adding the field: - It is irrelevant to topic metadata - If we finally do request rerouting in Kafka then the field would add little to no value. (It still helps to have a separate ClusterMetadataRequest to query for cluster-wide information such as 'which broker is the controller?' as Joe mentioned.) I think it would be cleaner to have an explicit ClusterMetadataRequest that you can send to any broker in order to obtain the controller (and in the future possibly other cluster-wide information). I think the main argument against doing this and instead adding it to the topic metadata response was convenience - i.e., you don't have to discover the controller in advance. However, I don't see much actual benefit/convenience in this and in fact think it is a non-issue. Let me know if I'm overlooking something here. As an example, say we need to initiate partition reassignment by issuing the new ReassignPartitionsRequest to the controller (assume we already have the desired manual partition assignment). 
If we are to augment the topic metadata response then the flow would be something like this: - Issue topic metadata request to any broker (and discover the controller) - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request to the controller. With an explicit cluster metadata request it would be: - Issue cluster metadata request to any broker - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request So it seems to add little practical value and bloats the topic metadata response with an irrelevant detail. The other angle to this is the following - is it a matter of naming? Should we just rename topic metadata request/response to just MetadataRequest/Response and add cluster metadata to it? By that same token should we also allow querying for the consumer coordinator (and in future transaction coordinator) as well? This leads to a bloated request which isn't very appealing and altogether confusing. Thanks, Joel On Wed, Mar 18, 2015 at 09:34:12AM -0700, Jun Rao wrote: Andrii, Thanks for the
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
+1 as well. I think it helps to keep the rerouting approach orthogonal to this KIP. On Wed, Mar 18, 2015 at 03:40:48PM -0700, Jay Kreps wrote: I'm +1 on Jun's suggestion as long as it can work for all the requests. On Wed, Mar 18, 2015 at 3:35 PM, Jun Rao j...@confluent.io wrote: Andrii, I think we agreed on the following. (a) Admin requests can be sent to and handled by any broker. (b) Admin requests are processed asynchronously, at least for now. That is, when the client gets a response, it just means that the request is initiated, but not necessarily completed. Then, it's up to the client to issue another request to check the status for completion. To support (a), we were thinking of doing request forwarding to the controller (utilizing KAFKA-1912). I am making an alternative proposal. Basically, the broker can just write to ZooKeeper to inform the controller about the request. For example, to handle partitionReassignment, the broker will just write the requested partitions to /admin/reassign_partitions (like what AdminUtils currently does) and then send a response to the client. This shouldn't take long and the implementation will be simpler than forwarding the requests to the controller through RPC. Thanks, Jun On Wed, Mar 18, 2015 at 3:03 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, I might be wrong but didn't we agree we will let any broker from the cluster handle *long-running* admin requests (at this time preferredReplica and reassignPartitions), via zk admin path. Thus CreateTopics etc. should be sent only to the controller. Thanks, Andrii Biletskyi On Wed, Mar 18, 2015 at 11:55 PM, Jun Rao j...@confluent.io wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the format of TMR right now. 
When we start using the create topic request in the producer, we will need a new version of TMR that doesn't trigger auto topic creation. But that can be done later. As a first-cut implementation, I think the broker can just write to ZK directly for createTopic/alterTopic/reassignPartitions/preferredLeaderElection requests, instead of forwarding them to the controller. This will simplify the implementation on the broker side. Thanks, Jun On Wed, Mar 18, 2015 at 11:58 AM, Joel Koshy jjkosh...@gmail.com wrote: (Thanks Andrii for the summary) For (1) yes we will circle back on that shortly after syncing up in person. I think it is close to getting committed although development for KAFKA-1927 can probably begin without it. There is one more item we covered at the hangout, i.e., whether we want to add the coordinator to the topic metadata response or provide a clearer ClusterMetadataRequest. There are two reasons I think we should try and avoid adding the field: - It is irrelevant to topic metadata - If we finally do request rerouting in Kafka then the field would add little to no value. (It still helps to have a separate ClusterMetadataRequest to query for cluster-wide information such as 'which broker is the controller?' as Joe mentioned.) I think it would be cleaner to have an explicit ClusterMetadataRequest that you can send to any broker in order to obtain the controller (and in the future possibly other cluster-wide information). I think the main argument against doing this and instead adding it to the topic metadata response was convenience - i.e., you don't have to discover the controller in advance. However, I don't see much actual benefit/convenience in this and in fact think it is a non-issue. Let me know if I'm overlooking something here. As an example, say we need to initiate partition reassignment by issuing the new ReassignPartitionsRequest to the controller (assume we already have the desired manual partition assignment). 
If we are to augment the topic metadata response then the flow would be something like this: - Issue topic metadata request to any broker (and discover the controller) - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request to the controller. With an explicit cluster metadata request it would be: - Issue cluster metadata request to any broker - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request So it seems to add little practical value and bloats the topic metadata response with an irrelevant detail. The other angle
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Yes that is what I was alluding to when I said that if we finally do request rerouting in Kafka then the field would add little to no value. I wasn't sure if we agreed that we _will_ do rerouting or whether we agreed to evaluate it (KAFKA-1912). Andrii can you update the KIP with this? Thanks, Joel On Wed, Mar 18, 2015 at 02:55:00PM -0700, Jun Rao wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the format of TMR right now. When we start using the create topic request in the producer, we will need a new version of TMR that doesn't trigger auto topic creation. But that can be done later. As a first-cut implementation, I think the broker can just write to ZK directly for createTopic/alterTopic/reassignPartitions/preferredLeaderElection requests, instead of forwarding them to the controller. This will simplify the implementation on the broker side. Thanks, Jun On Wed, Mar 18, 2015 at 11:58 AM, Joel Koshy jjkosh...@gmail.com wrote: (Thanks Andrii for the summary) For (1) yes we will circle back on that shortly after syncing up in person. I think it is close to getting committed although development for KAFKA-1927 can probably begin without it. There is one more item we covered at the hangout, i.e., whether we want to add the coordinator to the topic metadata response or provide a clearer ClusterMetadataRequest. There are two reasons I think we should try and avoid adding the field: - It is irrelevant to topic metadata - If we finally do request rerouting in Kafka then the field would add little to no value. (It still helps to have a separate ClusterMetadataRequest to query for cluster-wide information such as 'which broker is the controller?' as Joe mentioned.) 
I think it would be cleaner to have an explicit ClusterMetadataRequest that you can send to any broker in order to obtain the controller (and in the future possibly other cluster-wide information). I think the main argument against doing this and instead adding it to the topic metadata response was convenience - i.e., you don't have to discover the controller in advance. However, I don't see much actual benefit/convenience in this and in fact think it is a non-issue. Let me know if I'm overlooking something here. As an example, say we need to initiate partition reassignment by issuing the new ReassignPartitionsRequest to the controller (assume we already have the desired manual partition assignment). If we are to augment the topic metadata response then the flow would be something like this: - Issue topic metadata request to any broker (and discover the controller) - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request to the controller. With an explicit cluster metadata request it would be: - Issue cluster metadata request to any broker - Connect to controller if required (i.e., if the broker above != controller) - Issue the partition reassignment request So it seems to add little practical value and bloats the topic metadata response with an irrelevant detail. The other angle to this is the following - is it a matter of naming? Should we just rename topic metadata request/response to just MetadataRequest/Response and add cluster metadata to it? By that same token should we also allow querying for the consumer coordinator (and in future transaction coordinator) as well? This leads to a bloated request which isn't very appealing and altogether confusing. Thanks, Joel On Wed, Mar 18, 2015 at 09:34:12AM -0700, Jun Rao wrote: Andrii, Thanks for the summary. 1. I just realized that in order to start working on KAFKA-1927, we will need to merge the changes to OffsetCommitRequest (from 0.8.2) to trunk. 
This is planned to be done as part of KAFKA-1634, so we will need Guozhang and Joel's help to wrap this up. 2. Thinking about this a bit more: if the semantics of those write requests are async (i.e., after the client gets a response, it just means that the operation is initiated, but not necessarily completed), we don't really need to forward the requests to the controller. Instead, the receiving broker can just write the operation to ZK, as the admin command line tool currently does. This will simplify the implementation. 8. There is another implementation detail for describe topic. Ideally, we want to read the topic config from the broker cache instead of ZooKeeper. Currently, every broker reads the topic-level config for all topics, but it ignores those for topics not hosted on itself. So, we may need to change TopicConfigManager a bit so that it caches the configs for all topics. Thanks, Jun
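The two flows Joel compares differ only in which response carries the controller id; the client-side step is the same lookup either way. A minimal sketch of that lookup, assuming a hypothetical response type carrying a broker list plus a controller id (these names are illustrative, not the actual Kafka protocol classes):

```java
import java.util.List;

class ClusterMetadataSketch {
    // Illustrative stand-ins for a metadata response's fields.
    record Broker(int id, String host, int port) {}
    record ClusterMetadata(List<Broker> brokers, int controllerId) {}

    /** Pick the controller's address out of any broker's metadata response. */
    static Broker findController(ClusterMetadata md) {
        return md.brokers().stream()
                .filter(b -> b.id() == md.controllerId())
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("controller not in broker list"));
    }
}
```

Whether the controller id rides in TMR or in a separate ClusterMetadataRequest, this is all the "discovery" the client needs before connecting, which is why Joel argues the convenience difference is a non-issue.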
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Gwen, Yes, it looks like KAFKA-1927 will leave TopicMetadataRequest/Response alone. But I believe the KIP is still tightly related to KAFKA-1927, since we are not only going to update TopicMetadataRequest there but will introduce a bunch of new requests too. And it probably makes sense to do those correctly from scratch - without introducing scala request objects. As I understand it, you'll have this common infrastructure code done in KAFKA-1927. Thanks, Andrii Biletskyi

On Wed, Mar 18, 2015 at 8:38 PM, Gwen Shapira gshap...@cloudera.com wrote: On Wed, Mar 18, 2015 at 9:34 AM, Jun Rao j...@confluent.io wrote: Andrii, Thanks for the summary. 1. I just realized that in order to start working on KAFKA-1927, we will need to merge the changes to OffsetCommitRequest (from 0.8.2) to trunk. This is planned to be done as part of KAFKA-1634. So, we will need Guozhang and Joel's help to wrap this up.

I mentioned this in a separate thread, but it may be more relevant here: it looks like the SimpleConsumer API exposes TopicMetadataRequest and TopicMetadataResponse. This means that KAFKA-1927 doesn't remove this duplication, so I'm not sure we actually need KAFKA-1927 before implementing this KIP. This doesn't mean I'm stopping work on KAFKA-1927, but perhaps it means we can proceed in parallel?

2. Thinking about this a bit more: if the semantics of those write requests are async (i.e., after the client gets a response, it just means that the operation is initiated, but not necessarily completed), we don't really need to forward the requests to the controller. Instead, the receiving broker can just write the operation to ZK, as the admin command line tool currently does. This will simplify the implementation. 8. There is another implementation detail for describe topic. Ideally, we want to read the topic config from the broker cache, instead of ZooKeeper. Currently, every broker reads the topic-level config for all topics. However, it ignores those for topics not hosted on itself.
So, we may need to change TopicConfigManager a bit so that it caches the configs for all topics. Thanks, Jun

On Tue, Mar 17, 2015 at 1:13 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Guys, thanks for a great discussion! Here are the action points:
1. Q: Get rid of all scala request objects; use the java protocol definitions. A: Gwen kindly took that (KAFKA-1927). It's important to speed up the review procedure there since this ticket blocks other important changes.
2. Q: Generic re-route facility vs. the client maintaining cluster state. A: Jay has added pseudo code to KAFKA-1912 - we need to consider whether this will be easy to implement as a server-side feature (comments are welcome!).
3. Q: Controller field in the wire protocol. A: This might be useful for clients; add it to TopicMetadataResponse (already in the KIP).
4. Q: Decoupling topic creation from TMR. A: I will add the solution proposed by Jun (using clientId for that) to the KIP.
5. Q: Bumping new versions of TMR vs. grabbing all protocol changes in one version. A: It was decided to try to gather all changes to the protocol (before release). In the case of TMR it's worth checking KAFKA-2020 and KIP-13 (quotas).
6. Q: A JSON lib is needed to deserialize user input in the CLI tool. A: Use jackson for that; /tools is a separate jar so it shouldn't be a big deal.
7. Q: VerifyReassignPartitions vs. a generic status check command. A: For long-running requests like reassign partitions a *progress* check request is useful; it makes sense to introduce it.
Please add to / correct me if I missed something. Thanks, Andrii Biletskyi

On Tue, Mar 17, 2015 at 6:20 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Joel, You are right, I removed ClusterMetadata because we have partially what we need in TopicMetadata. Also, as Jay pointed out earlier, we would like to have an orthogonal API, but at the same time we need to be backward compatible.
But I like your idea, and I even have some other arguments for this option: there is also the DescribeTopicRequest proposed in this KIP, which returns topic configs, partitions, and replication factor, plus partition ISR, ASR, and leader replica. The latter part is really already there in TopicMetadataRequest. So again we'd have to add stuff to TMR rather than duplicate some info in newly added requests. However, this way we'll end up with a monster request which returns cluster metadata, topic replication and config info, plus partition replication data. It seems logical to split TMR into:
- ClusterMetadata (brokers + controller, maybe something else)
- TopicMetadata (topic info + partition details)
But since the current TMR is involved in lots of places (including the network client, as I understand), this might be a very serious change, and it probably makes sense to stick with the current approach. Thanks, Andrii Biletskyi
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Got it. Thanks for clarifying!

On Wed, Mar 18, 2015 at 11:54 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Gwen, Yes, it looks like KAFKA-1927 will leave TopicMetadataRequest/Response alone.
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Jun, I might be wrong, but didn't we agree that we will let any broker in the cluster handle *long-running* admin requests (at this time preferredReplica and reassignPartitions) via the zk admin path, and thus that CreateTopics etc. should be sent only to the controller? Thanks, Andrii Biletskyi

On Wed, Mar 18, 2015 at 11:55 PM, Jun Rao j...@confluent.io wrote: Joel, Andrii, I think we agreed that those admin requests can be issued to any broker. Because of that, there doesn't seem to be a strong need to know the controller. So, perhaps we can proceed by not making any change to the format of TMR right now.
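Andrii's reading of the agreement splits admin requests into two routing classes. A hedged sketch of that rule as a predicate (the request-type names are illustrative; note Jun's reply above disputes whether the controller is needed at all):

```java
import java.util.EnumSet;
import java.util.Set;

class AdminRequestRouting {
    enum RequestType { CREATE_TOPIC, ALTER_TOPIC, PREFERRED_REPLICA_ELECTION, REASSIGN_PARTITIONS }

    // Long-running operations any broker can accept by writing to the ZK
    // admin path, per Andrii's reading of the agreement.
    static final Set<RequestType> LONG_RUNNING =
            EnumSet.of(RequestType.PREFERRED_REPLICA_ELECTION, RequestType.REASSIGN_PARTITIONS);

    /** True if the request must go to the controller rather than any broker. */
    static boolean requiresController(RequestType type) {
        return !LONG_RUNNING.contains(type);
    }
}
```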
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
I may be missing some context, but hopefully this will also be covered today: I thought the earlier proposal with an explicit ClusterMetadata request was clearer and more explicit. During the course of this thread I think the conclusion was that the main need was for controller information, and that it can be rolled into the topic metadata response - but that seems a bit irrelevant to topic metadata. FWIW I think the full broker list is also irrelevant to topic metadata, but it is already there and in use. I think there is still room for an explicit ClusterMetadata request since there may be other cluster-level information that we may want to add over time (and that has nothing to do with topic metadata).

On Tue, Mar 17, 2015 at 02:45:30PM +0200, Andrii Biletskyi wrote: Jun, 101. Okay, if you say that such a use case is important. I also think using clientId for these purposes is fine - if we already have this field as part of all wire protocol messages, why not use it. I will update the KIP-4 page if nobody has other ideas (which may come up during the call today). 102.1 Agree, I'll update the KIP accordingly. I think we can add new, fine-grained error codes if some error code received in a specific case won't give enough context to return a descriptive error message for the user. Looking forward to discussing all outstanding issues in detail today during the call. Thanks, Andrii Biletskyi

On Mon, Mar 16, 2015 at 10:59 PM, Jun Rao j...@confluent.io wrote: 101. There may be a use case where you only want topics to be created manually by admins. Currently, you can do that by disabling auto topic creation and issuing topic creation from the TopicCommand. If we disable auto topic creation completely on the broker and don't have a way to distinguish between topic creation requests from regular clients and the admin, we can't support manual topic creation any more.
I was thinking that another way of distinguishing the clients making topic creation requests is via clientId. For example, the admin tool can set it to something like admin and the broker can treat that clientId specially.

Also, there is a related discussion in KAFKA-2020. Currently, we do the following in TopicMetadataResponse:
1. If the leader is not available, we set the partition-level error code to LeaderNotAvailable.
2. If a non-leader replica is not available, we take that replica out of the assigned replica list and isr in the response. As an indication of doing that, we set the partition-level error code to ReplicaNotAvailable.
This has a few problems. First, ReplicaNotAvailable probably shouldn't be an error, at least for the normal producer/consumer clients that just want to find out the leader. Second, it can happen that both the leader and another replica are unavailable at the same time; there is no error code to indicate both. Third, even if a replica is not available, it's still useful to return its replica id, since some clients (e.g. the admin tool) may still make use of it. One way to address this issue is to always return the replica ids for the leader, assigned replicas, and isr, regardless of whether the corresponding broker is live or not. Since we also return the list of live brokers, the client can figure out whether a leader or a replica is live and act accordingly. This way, we don't need to set the partition-level error code when the leader or a replica is not available. This doesn't change the wire protocol, but it does change the semantics. Since we are evolving the protocol of TopicMetadataRequest here, we can potentially piggyback the change. 102.1 For those types of errors due to invalid input, shouldn't we just guard against them at parameter-validation time and throw InvalidArgumentException without even sending the request to the broker?
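Under Jun's proposal the response always lists replica ids, and the client derives liveness by cross-referencing them against the live-broker list it also receives. A minimal sketch of that client-side check (plain Java, names illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PartitionLiveness {
    /**
     * Split a replica id list into "live" and "dead" buckets by checking each
     * id against the live-broker ids returned in the same metadata response.
     */
    static Map<String, List<Integer>> partitionByLiveness(List<Integer> replicas,
                                                          Set<Integer> liveBrokerIds) {
        Map<String, List<Integer>> out = new HashMap<>();
        out.put("live", new ArrayList<>());
        out.put("dead", new ArrayList<>());
        for (int id : replicas) {
            out.get(liveBrokerIds.contains(id) ? "live" : "dead").add(id);
        }
        return out;
    }
}
```

This is why no partition-level error code is needed for an unavailable replica: a plain producer/consumer only cares whether the leader id is in the live set, while an admin tool can still see the full assignment.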
Thanks, Jun

On Mon, Mar 16, 2015 at 10:37 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, answering your questions: 101. If I understand you correctly, you are saying future producer versions (which will be ported to TMR_V1) won't be able to automatically create a topic (if we unconditionally remove topic creation from there). But we need to preserve this logic. Ok, about your proposal: I'm not a big fan either, when it comes to differentiating clients directly in the protocol schema. And also I'm not sure I understand at all why auto.create.topics.enable is a server-side configuration. Can we deprecate this setting in future versions, add this setting to the producer, and based on that, upon receiving UnknownTopic, create the topic explicitly by a separate producer call via adminClient? 102.1. Hm, yes. It's because we want to support batching and at the same time we want to give descriptive error messages to clients. Since AdminClient holds the context to construct such messages (e.g. AdminClient
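The clientId-based gating Jun floated above (and that Andrii pushes back on) amounts to one broker-side predicate. A hedged sketch, where the reserved id "admin" is purely hypothetical - the thread only says "something like admin":

```java
class TopicCreationPolicy {
    // Hypothetical reserved clientId; the thread only suggests "something like admin".
    static final String ADMIN_CLIENT_ID = "admin";

    /**
     * Whether a topic-creation request should be honored: either auto topic
     * creation is enabled, or the request comes from the admin tool's clientId.
     */
    static boolean mayCreateTopic(String clientId, boolean autoCreateEnabled) {
        return autoCreateEnabled || ADMIN_CLIENT_ID.equals(clientId);
    }
}
```

The objection in the thread is visible even in this sketch: the policy hinges on a free-form string field, which is why differentiating clients directly in the protocol schema feels fragile.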
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Guys, Thanks for a great discussion! Here are the actions points: 1. Q: Get rid of all scala requests objects, use java protocol definitions. A: Gwen kindly took that (KAFKA-1927). It's important to speed up review procedure there since this ticket blocks other important changes. 2. Q: Generic re-reroute facility vs client maintaining cluster state. A: Jay has added pseudo code to KAFKA-1912 - need to consider whether this will be easy to implement as a server-side feature (comments are welcomed!). 3. Q: Controller field in wire protocol. A: This might be useful for clients, add this to TopicMetadataResponse (already in KIP). 4. Q: Decoupling topic creation from TMR. A: I will add proposed by Jun solution (using clientId for that) to the KIP. 5. Q: Bumping new versions of TMR vs grabbing all protocol changes in one version. A: It was decided to try to gather all changes to protocol (before release). In case of TMR it worth checking: KAFKA-2020 and KIP-13 (quotas) 6. Q: JSON lib is needed to deserialize user's input in CLI tool. A: Use jackson for that, /tools project is a separate jar so shouldn't be a big deal. 7. Q: VerifyReassingPartitions vs generic status check command. A: For long-running requests like reassign partitions *progress* check request is useful, it makes sense to introduce it. Please add, correct me if I missed something. Thanks, Andrii Biletskyi On Tue, Mar 17, 2015 at 6:20 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Joel, You are right, I removed ClusterMetadata because we have partially what we need in TopicMetadata. Also, as Jay pointed out earlier, we would like to have orthogonal API, but at the same time we need to be backward compatible. But I like your idea and even have some other arguments for this option: There is also DescribeTopicRequest which was proposed in this KIP, it returns topic configs, partitions, replication factor plus partition ISR, ASR, leader replica. 
The later part is really already there in TopicMetadataRequest. So again we'll have to add stuff to TMR, not to duplicate some info in newly added requests. However, this way we'll end up with monster request which returns cluster metadata, topic replication and config info plus partition replication data. Seems logical to split TMR to - ClusterMetadata (brokers + controller, maybe smth else) - TopicMetadata (topic info + partition details) But since current TMR is involved in lots of places (including network client, as I understand) this might be very serious change and it probably makes sense to stick with current approach. Thanks, Andrii Biletskyi On Tue, Mar 17, 2015 at 5:29 PM, Joel Koshy jjkosh...@gmail.com wrote: I may be missing some context but hopefully this will also be covered today: I thought the earlier proposal where there was an explicit ClusterMetadata request was clearer and explicit. During the course of this thread I think the conclusion was that the main need was for controller information and that can be rolled into the topic metadata response but that seems a bit irrelevant to topic metadata. FWIW I think the full broker-list is also irrelevant to topic metadata, but it is already there and in use. I think there is still room for an explicit ClusterMetadata request since there may be other cluster-level information that we may want to add over time (and that have nothing to do with topic metadata). On Tue, Mar 17, 2015 at 02:45:30PM +0200, Andrii Biletskyi wrote: Jun, 101. Okay, if you say that such use case is important. I also think using clientId for these purposes is fine - if we already have this field as part of all Wire protocol messages, why not use that. I will update KIP-4 page if nobody has other ideas (which may come up during the call today). 102.1 Agree, I'll update the KIP accordingly. 
I think we can add new, fine-grained error codes if some error code received in a specific case doesn't give enough context to return a descriptive error message to the user. Looking forward to discussing all outstanding issues in detail today during the call. Thanks, Andrii Biletskyi On Mon, Mar 16, 2015 at 10:59 PM, Jun Rao j...@confluent.io wrote: 101. There may be a use case where you only want the topics to be created manually by admins. Currently, you can do that by disabling auto topic creation and issuing topic creation from the TopicCommand. If we disable auto topic creation completely on the broker and don't have a way to distinguish between topic creation requests from the regular clients and the admin, we can't support manual topic creation any more. I was thinking that another way of distinguishing the clients making the topic creation requests is using clientId. For example, the admin tool can set it to something like admin and the broker can treat that clientId specially.
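The clientId-based distinction Jun sketches above can be illustrated with a small broker-side policy check. This is only a sketch of the proposal under discussion, not actual Kafka code; the class name, method name, and the "admin" clientId convention are all illustrative assumptions:

```java
import java.util.Set;

// Sketch of the proposal above: treat a special clientId as an admin client
// when deciding whether an explicit topic creation request may proceed,
// even with auto topic creation disabled. All names here are invented.
class TopicCreationPolicy {
    static final Set<String> ADMIN_CLIENT_IDS = Set.of("admin");

    // autoCreateEnabled models the broker's auto.create.topics.enable setting.
    static boolean mayCreateTopic(String clientId, boolean autoCreateEnabled) {
        return autoCreateEnabled || ADMIN_CLIENT_IDS.contains(clientId);
    }
}
```

One design caveat the thread itself hints at: clientId is a free-form string that any client can set, so this distinguishes well-behaved clients rather than enforcing authorization.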
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Jun, 101. Okay, if you say that such a use case is important. I also think using clientId for these purposes is fine - if we already have this field as part of all wire protocol messages, why not use it. I will update the KIP-4 page if nobody has other ideas (which may come up during the call today). 102.1 Agree, I'll update the KIP accordingly. I think we can add new, fine-grained error codes if some error code received in a specific case doesn't give enough context to return a descriptive error message to the user. Looking forward to discussing all outstanding issues in detail today during the call. Thanks, Andrii Biletskyi On Mon, Mar 16, 2015 at 10:59 PM, Jun Rao j...@confluent.io wrote: 101. There may be a use case where you only want the topics to be created manually by admins. Currently, you can do that by disabling auto topic creation and issuing topic creation from the TopicCommand. If we disable auto topic creation completely on the broker and don't have a way to distinguish between topic creation requests from the regular clients and the admin, we can't support manual topic creation any more. I was thinking that another way of distinguishing the clients making the topic creation requests is using clientId. For example, the admin tool can set it to something like admin and the broker can treat that clientId specially. Also, there is a related discussion in KAFKA-2020. Currently, we do the following in TopicMetadataResponse: 1. If the leader is not available, we set the partition level error code to LeaderNotAvailable. 2. If a non-leader replica is not available, we take that replica out of the assigned replica list and isr in the response. As an indication for doing that, we set the partition level error code to ReplicaNotAvailable. This has a few problems. First, ReplicaNotAvailable probably shouldn't be an error, at least for the normal producer/consumer clients that just want to find out the leader.
Second, it can happen that both the leader and another replica are not available at the same time. There is no error code to indicate both. Third, even if a replica is not available, it's still useful to return its replica id since some clients (e.g. the admin tool) may still make use of it. One way to address this issue is to always return the replica id for the leader, assigned replicas, and isr regardless of whether the corresponding broker is live or not. Since we also return the list of live brokers, the client can figure out whether a leader or a replica is live or not and act accordingly. This way, we don't need to set the partition level error code when the leader or a replica is not available. This doesn't change the wire protocol, but does change the semantics. Since we are evolving the protocol of TopicMetadataRequest here, we can potentially piggyback the change. 102.1 For those types of errors due to invalid input, shouldn't we just guard it at parameter validation time and throw InvalidArgumentException without even sending the request to the broker? Thanks, Jun On Mon, Mar 16, 2015 at 10:37 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, Answering your questions: 101. If I understand you correctly, you are saying future producer versions (which will be ported to TMR_V1) won't be able to automatically create topics (if we unconditionally remove topic creation from there). But we need to preserve this logic. Ok, about your proposal: I'm not a big fan either when it comes to differentiating clients directly in the protocol schema. And also I'm not sure I understand at all why auto.create.topics.enable is a server-side configuration. Can we deprecate this setting in future versions, add this setting to the producer and, based on it, upon receiving UnknownTopic create the topic explicitly via a separate producer call through adminClient? 102.1. Hm, yes. It's because we want to support batching and at the same time we want to give descriptive error messages to clients.
Since AdminClient holds the context to construct such messages (e.g. the AdminClient layer can know that InvalidArgumentsCode means two cases: either an invalid number - e.g. -1 - or replication-factor was provided while the partitions argument wasn't) - I wrapped responses in Exceptions. But I'm open to any other ideas, this was just an initial version. 102.2. Yes, I agree. I'll change that to probably some other DTO. Thanks, Andrii Biletskyi On Fri, Mar 13, 2015 at 7:16 PM, Jun Rao j...@confluent.io wrote: Andrii, 101. That's what I was thinking too, but it may not be that simple. In TopicMetadataRequest_V1, we can let it not trigger auto topic creation. Then, on the producer side, if it gets an UnknownTopicException, it can explicitly issue a createTopicRequest for auto topic creation. On the consumer side, it will never issue createTopicRequest. This works when auto topic creation is enabled on the broker side.
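Jun's proposed TMR semantics (always return replica ids, and let the client derive liveness by cross-checking the separately returned live-broker list, rather than relying on ReplicaNotAvailable codes) could look roughly like this on the client side. A sketch only, with invented names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Client-side sketch of the semantics proposed for KAFKA-2020: the metadata
// response always carries all replica ids, and liveness is derived from the
// live-broker list instead of a partition-level ReplicaNotAvailable error.
class ReplicaLiveness {
    static List<Integer> partitionLiveReplicas(List<Integer> assignedReplicas,
                                               Set<Integer> liveBrokerIds) {
        List<Integer> live = new ArrayList<>();
        for (int id : assignedReplicas) {
            if (liveBrokerIds.contains(id)) live.add(id);
        }
        return live;
    }

    static boolean leaderIsLive(int leaderId, Set<Integer> liveBrokerIds) {
        return liveBrokerIds.contains(leaderId);
    }
}
```

Note how this resolves all three problems Jun lists: no replica is dropped from the response, leader and replica unavailability are independent facts, and admin tools still see dead replicas' ids.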
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
101. There may be a use case where you only want the topics to be created manually by admins. Currently, you can do that by disabling auto topic creation and issuing topic creation from the TopicCommand. If we disable auto topic creation completely on the broker and don't have a way to distinguish between topic creation requests from the regular clients and the admin, we can't support manual topic creation any more. I was thinking that another way of distinguishing the clients making the topic creation requests is using clientId. For example, the admin tool can set it to something like admin and the broker can treat that clientId specially. Also, there is a related discussion in KAFKA-2020. Currently, we do the following in TopicMetadataResponse: 1. If the leader is not available, we set the partition level error code to LeaderNotAvailable. 2. If a non-leader replica is not available, we take that replica out of the assigned replica list and isr in the response. As an indication for doing that, we set the partition level error code to ReplicaNotAvailable. This has a few problems. First, ReplicaNotAvailable probably shouldn't be an error, at least for the normal producer/consumer clients that just want to find out the leader. Second, it can happen that both the leader and another replica are not available at the same time. There is no error code to indicate both. Third, even if a replica is not available, it's still useful to return its replica id since some clients (e.g. the admin tool) may still make use of it. One way to address this issue is to always return the replica id for the leader, assigned replicas, and isr regardless of whether the corresponding broker is live or not. Since we also return the list of live brokers, the client can figure out whether a leader or a replica is live or not and act accordingly. This way, we don't need to set the partition level error code when the leader or a replica is not available.
This doesn't change the wire protocol, but does change the semantics. Since we are evolving the protocol of TopicMetadataRequest here, we can potentially piggyback the change. 102.1 For those types of errors due to invalid input, shouldn't we just guard it at parameter validation time and throw InvalidArgumentException without even sending the request to the broker? Thanks, Jun On Mon, Mar 16, 2015 at 10:37 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, Answering your questions: 101. If I understand you correctly, you are saying future producer versions (which will be ported to TMR_V1) won't be able to automatically create topics (if we unconditionally remove topic creation from there). But we need to preserve this logic. Ok, about your proposal: I'm not a big fan either when it comes to differentiating clients directly in the protocol schema. And also I'm not sure I understand at all why auto.create.topics.enable is a server-side configuration. Can we deprecate this setting in future versions, add this setting to the producer and, based on it, upon receiving UnknownTopic create the topic explicitly via a separate producer call through adminClient? 102.1. Hm, yes. It's because we want to support batching and at the same time we want to give descriptive error messages to clients. Since AdminClient holds the context to construct such messages (e.g. the AdminClient layer can know that InvalidArgumentsCode means two cases: either an invalid number - e.g. -1 - or replication-factor was provided while the partitions argument wasn't) - I wrapped responses in Exceptions. But I'm open to any other ideas, this was just an initial version. 102.2. Yes, I agree. I'll change that to probably some other DTO. Thanks, Andrii Biletskyi On Fri, Mar 13, 2015 at 7:16 PM, Jun Rao j...@confluent.io wrote: Andrii, 101. That's what I was thinking too, but it may not be that simple. In TopicMetadataRequest_V1, we can let it not trigger auto topic creation.
Then, on the producer side, if it gets an UnknownTopicException, it can explicitly issue a createTopicRequest for auto topic creation. On the consumer side, it will never issue createTopicRequest. This works when auto topic creation is enabled on the broker side. However, I am not sure how things will work when auto topic creation is disabled on the broker side. In this case, we want to have a way to manually create a topic, potentially through admin commands. However, then we need a way to distinguish createTopicRequest issued from the producer clients and the admin tools. Maybe we can add a new field in createTopicRequest and set it differently in the producer client and the admin client. However, I am not sure if that's the best approach. 2. Yes, refactoring existing requests is a non-trivial amount of work. I posted some comments in KAFKA-1927. We will probably have to fix KAFKA-1927 first, before adding the new logic in KAFKA-1694. Otherwise, the changes will be too big. 102. About the
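The split Jun describes — metadata lookup never auto-creates, the producer reacts to an unknown topic with an explicit create, and the consumer never creates — can be sketched as client-side logic. The toy in-memory "cluster" below stands in for the broker, and every name is invented for illustration:

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the proposed behavior: TMR v1 is a pure lookup; only an
// explicit create call (the proposed createTopicRequest) creates topics.
class AutoCreateFlow {
    private final Set<String> clusterTopics = new HashSet<>(); // stand-in for broker state

    boolean metadataExists(String topic) { return clusterTopics.contains(topic); } // TMR v1: lookup only
    void createTopic(String topic) { clusterTopics.add(topic); }                   // explicit create

    // Producer path: on an unknown topic, create explicitly and retry the lookup.
    boolean producerLookup(String topic) {
        if (!metadataExists(topic)) createTopic(topic);
        return metadataExists(topic);
    }

    // Consumer path: never triggers creation.
    boolean consumerLookup(String topic) { return metadataExists(topic); }
}
```

This keeps today's producer behavior while guaranteeing that a consumer's metadata fetch can never create a topic as a side effect.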
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Jun, Answering your questions: 101. If I understand you correctly, you are saying future producer versions (which will be ported to TMR_V1) won't be able to automatically create topics (if we unconditionally remove topic creation from there). But we need to preserve this logic. Ok, about your proposal: I'm not a big fan either when it comes to differentiating clients directly in the protocol schema. And also I'm not sure I understand at all why auto.create.topics.enable is a server-side configuration. Can we deprecate this setting in future versions, add this setting to the producer and, based on it, upon receiving UnknownTopic create the topic explicitly via a separate producer call through adminClient? 102.1. Hm, yes. It's because we want to support batching and at the same time we want to give descriptive error messages to clients. Since AdminClient holds the context to construct such messages (e.g. the AdminClient layer can know that InvalidArgumentsCode means two cases: either an invalid number - e.g. -1 - or replication-factor was provided while the partitions argument wasn't) - I wrapped responses in Exceptions. But I'm open to any other ideas, this was just an initial version. 102.2. Yes, I agree. I'll change that to probably some other DTO. Thanks, Andrii Biletskyi On Fri, Mar 13, 2015 at 7:16 PM, Jun Rao j...@confluent.io wrote: Andrii, 101. That's what I was thinking too, but it may not be that simple. In TopicMetadataRequest_V1, we can let it not trigger auto topic creation. Then, on the producer side, if it gets an UnknownTopicException, it can explicitly issue a createTopicRequest for auto topic creation. On the consumer side, it will never issue createTopicRequest. This works when auto topic creation is enabled on the broker side. However, I am not sure how things will work when auto topic creation is disabled on the broker side. In this case, we want to have a way to manually create a topic, potentially through admin commands.
However, then we need a way to distinguish createTopicRequest issued from the producer clients and the admin tools. Maybe we can add a new field in createTopicRequest and set it differently in the producer client and the admin client. However, I am not sure if that's the best approach. 2. Yes, refactoring existing requests is a non-trivial amount of work. I posted some comments in KAFKA-1927. We will probably have to fix KAFKA-1927 first, before adding the new logic in KAFKA-1694. Otherwise, the changes will be too big. 102. About the AdminClient: 102.1. It's a bit weird that we return an exception in the api. It seems that we should either return an error code or throw an exception when getting the response state. 102.2. We probably shouldn't explicitly use the request object in the api. Not every request evolution requires an api change. Thanks, Jun On Fri, Mar 13, 2015 at 4:08 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, Thanks for your comments. Answers inline: 100. There are a few fields such as ReplicaAssignment, ReassignPartitionRequest, and PartitionsSerialized that are represented as a string, but contain composite structures in json. Could we flatten them out directly in the protocol definition as arrays/records? Yes, now with Admin Client this looks a bit weird. My initial motivation was: ReassignPartitionCommand accepts input in json, and we want to keep the tools' interfaces unchanged, where possible. If we port it to a deserialized format, in the CLI (/tools project) we will have to add some json library since /tools is written in java and we'll need to deserialize the json file provided by a user. Can we quickly agree on what this library should be (Jackson, GSON, whatever)? 101. Does TopicMetadataRequest v1 still trigger auto topic creation? This will be a bit weird now that we have a separate topic creation api.
Have you thought about how the new createTopicRequest and TopicMetadataRequest v1 will be used in the producer/consumer client, in addition to admin tools? For example, ideally, we don't want TopicMetadataRequest from the consumer to trigger auto topic creation. I agree, this strange logic should be fixed. I'm not confident in this Kafka part so correct me if I'm wrong, but it doesn't look like a hard thing to do, I think we can leverage AdminClient for that in Producer and unconditionally remove topic creation from the TopicMetadataRequest_V1. 2. I think Jay meant getting rid of scala classes like HeartbeatRequestAndHeader and HeartbeatResponseAndHeader. We did that as a stop-gap thing when adding the new requests for the consumers. However, the long term plan is to get rid of all those and just reuse the java request/response in the client. Since this KIP proposes to add a significant number of new requests, perhaps we should bite the bullet to clean up the existing scala requests first before adding new ones? Yes,
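The AdminClient-side mapping Andrii describes for 102.1 — a coarse server-side error code disambiguated by the request context the client already holds — can be sketched as follows. The code value and all names here are hypothetical, purely to illustrate the idea:

```java
// Sketch of the described approach: the server returns a coarse
// InvalidArguments-style code, and the client, which knows what it sent,
// turns it into a descriptive message. Names and codes are illustrative.
class AdminErrorMessages {
    static final short INVALID_ARGUMENTS = 42; // hypothetical error code

    static String describe(short errorCode, Integer partitions, Integer replicationFactor) {
        if (errorCode == INVALID_ARGUMENTS) {
            if (replicationFactor != null && partitions == null) {
                return "replication-factor was provided while partitions was not";
            }
            return "invalid numeric argument (e.g. a negative partition count)";
        }
        return "unexpected error code " + errorCode;
    }
}
```

Jun's counter-suggestion (102.1) would make much of this unnecessary for input errors: validate parameters client-side and fail before any request is sent, reserving server error codes for conditions only the broker can detect.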
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Andrii, 101. That's what I was thinking too, but it may not be that simple. In TopicMetadataRequest_V1, we can let it not trigger auto topic creation. Then, on the producer side, if it gets an UnknownTopicException, it can explicitly issue a createTopicRequest for auto topic creation. On the consumer side, it will never issue createTopicRequest. This works when auto topic creation is enabled on the broker side. However, I am not sure how things will work when auto topic creation is disabled on the broker side. In this case, we want to have a way to manually create a topic, potentially through admin commands. However, then we need a way to distinguish createTopicRequest issued from the producer clients and the admin tools. Maybe we can add a new field in createTopicRequest and set it differently in the producer client and the admin client. However, I am not sure if that's the best approach. 2. Yes, refactoring existing requests is a non-trivial amount of work. I posted some comments in KAFKA-1927. We will probably have to fix KAFKA-1927 first, before adding the new logic in KAFKA-1694. Otherwise, the changes will be too big. 102. About the AdminClient: 102.1. It's a bit weird that we return an exception in the api. It seems that we should either return an error code or throw an exception when getting the response state. 102.2. We probably shouldn't explicitly use the request object in the api. Not every request evolution requires an api change. Thanks, Jun On Fri, Mar 13, 2015 at 4:08 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jun, Thanks for your comments. Answers inline: 100. There are a few fields such as ReplicaAssignment, ReassignPartitionRequest, and PartitionsSerialized that are represented as a string, but contain composite structures in json. Could we flatten them out directly in the protocol definition as arrays/records? Yes, now with Admin Client this looks a bit weird.
My initial motivation was: ReassignPartitionCommand accepts input in json, and we want to keep the tools' interfaces unchanged, where possible. If we port it to a deserialized format, in the CLI (/tools project) we will have to add some json library since /tools is written in java and we'll need to deserialize the json file provided by a user. Can we quickly agree on what this library should be (Jackson, GSON, whatever)? 101. Does TopicMetadataRequest v1 still trigger auto topic creation? This will be a bit weird now that we have a separate topic creation api. Have you thought about how the new createTopicRequest and TopicMetadataRequest v1 will be used in the producer/consumer client, in addition to admin tools? For example, ideally, we don't want TopicMetadataRequest from the consumer to trigger auto topic creation. I agree, this strange logic should be fixed. I'm not confident in this Kafka part so correct me if I'm wrong, but it doesn't look like a hard thing to do; I think we can leverage AdminClient for that in the Producer and unconditionally remove topic creation from the TopicMetadataRequest_V1. 2. I think Jay meant getting rid of scala classes like HeartbeatRequestAndHeader and HeartbeatResponseAndHeader. We did that as a stop-gap thing when adding the new requests for the consumers. However, the long term plan is to get rid of all those and just reuse the java request/response in the client. Since this KIP proposes to add a significant number of new requests, perhaps we should bite the bullet to clean up the existing scala requests first before adding new ones? Yes, looks like I misunderstood the point of ...RequestAndHeader. Okay, I will rework that. The only thing is that I don't see any example of how it was done for at least one existing protocol message. Thus, as I understand, I have to think about how we are going to do it. Re porting all existing RQ/RP in this patch.
Sounds reasonable, but if it's an *obligatory* requirement to have the Admin KIP done, I'm afraid this can be a serious blocker for us. There are 13 protocol messages and all that would require not only unit tests but quite intensive manual testing, no? I'm afraid I'm not the right guy to cover pretty much all Kafka core internals :). Let me know your thoughts on this item. Btw there is a ticket to follow up on this issue (https://issues.apache.org/jira/browse/KAFKA-2006). Thanks, Andrii Biletskyi On Fri, Mar 13, 2015 at 6:40 AM, Jun Rao j...@confluent.io wrote: Andrii, A few more comments. 100. There are a few fields such as ReplicaAssignment, ReassignPartitionRequest, and PartitionsSerialized that are represented as a string, but contain composite structures in json. Could we flatten them out directly in the protocol definition as arrays/records? 101. Does TopicMetadataRequest v1 still trigger auto topic creation? This will be a bit weird now that we have a separate topic creation api. Have you thought about how the new createTopicRequest and
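Jun's point 100 — flattening the embedded-JSON string fields into arrays/records in the protocol definition — would amount to replacing a string payload like {"partitions":[{"topic":"t1","partition":0,"replicas":[1,2]}]} with a structured representation along these lines. A sketch only; the class and field names are invented, not the KIP's actual schema:

```java
import java.util.List;

// Structured stand-in for a per-partition reassignment entry that is
// currently carried inside a JSON string field. Names are illustrative.
class PartitionAssignment {
    final String topic;
    final int partition;
    final List<Integer> replicas;

    PartitionAssignment(String topic, int partition, List<Integer> replicas) {
        this.topic = topic;
        this.partition = partition;
        this.replicas = replicas;
    }
}
```

Under this approach the CLI could keep accepting the existing JSON file format for compatibility and use a JSON library such as Jackson (as agreed in the action points) to deserialize it into entries like this before building the wire request, so the string-typed field disappears from the protocol itself.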
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Jun, Thanks for your comments. Answers inline: 100. There are a few fields such as ReplicaAssignment, ReassignPartitionRequest, and PartitionsSerialized that are represented as a string, but contain composite structures in json. Could we flatten them out directly in the protocol definition as arrays/records? Yes, now with Admin Client this looks a bit weird. My initial motivation was: ReassignPartitionCommand accepts input in json, and we want to keep the tools' interfaces unchanged, where possible. If we port it to a deserialized format, in the CLI (/tools project) we will have to add some json library since /tools is written in java and we'll need to deserialize the json file provided by a user. Can we quickly agree on what this library should be (Jackson, GSON, whatever)? 101. Does TopicMetadataRequest v1 still trigger auto topic creation? This will be a bit weird now that we have a separate topic creation api. Have you thought about how the new createTopicRequest and TopicMetadataRequest v1 will be used in the producer/consumer client, in addition to admin tools? For example, ideally, we don't want TopicMetadataRequest from the consumer to trigger auto topic creation. I agree, this strange logic should be fixed. I'm not confident in this Kafka part so correct me if I'm wrong, but it doesn't look like a hard thing to do; I think we can leverage AdminClient for that in the Producer and unconditionally remove topic creation from the TopicMetadataRequest_V1. 2. I think Jay meant getting rid of scala classes like HeartbeatRequestAndHeader and HeartbeatResponseAndHeader. We did that as a stop-gap thing when adding the new requests for the consumers. However, the long term plan is to get rid of all those and just reuse the java request/response in the client. Since this KIP proposes to add a significant number of new requests, perhaps we should bite the bullet to clean up the existing scala requests first before adding new ones?
Yes, looks like I misunderstood the point of ...RequestAndHeader. Okay, I will rework that. The only thing is that I don't see any example of how it was done for at least one existing protocol message. Thus, as I understand, I have to think about how we are going to do it. Re porting all existing RQ/RP in this patch. Sounds reasonable, but if it's an *obligatory* requirement to have the Admin KIP done, I'm afraid this can be a serious blocker for us. There are 13 protocol messages and all that would require not only unit tests but quite intensive manual testing, no? I'm afraid I'm not the right guy to cover pretty much all Kafka core internals :). Let me know your thoughts on this item. Btw there is a ticket to follow up on this issue (https://issues.apache.org/jira/browse/KAFKA-2006). Thanks, Andrii Biletskyi On Fri, Mar 13, 2015 at 6:40 AM, Jun Rao j...@confluent.io wrote: Andrii, A few more comments. 100. There are a few fields such as ReplicaAssignment, ReassignPartitionRequest, and PartitionsSerialized that are represented as a string, but contain composite structures in json. Could we flatten them out directly in the protocol definition as arrays/records? 101. Does TopicMetadataRequest v1 still trigger auto topic creation? This will be a bit weird now that we have a separate topic creation api. Have you thought about how the new createTopicRequest and TopicMetadataRequest v1 will be used in the producer/consumer client, in addition to admin tools? For example, ideally, we don't want TopicMetadataRequest from the consumer to trigger auto topic creation. 2. I think Jay meant getting rid of scala classes like HeartbeatRequestAndHeader and HeartbeatResponseAndHeader. We did that as a stop-gap thing when adding the new requests for the consumers. However, the long term plan is to get rid of all those and just reuse the java request/response in the client.
Since this KIP proposes to add a significant number of new requests, perhaps we should bite the bullet to clean up the existing scala requests first before adding new ones? Thanks, Jun On Thu, Mar 12, 2015 at 3:37 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi, As said above - I list again all comments from this thread so we can see what's left and finalize all pending issues. Comments from Jay: 1. This is much needed functionality, but there are a lot of these, so let's really think these protocols through. We really want to end up with a set of well thought-out, orthogonal APIs. For this reason I think it is really important to think through the end state even if that includes APIs we won't implement in the first phase. A: Definitely behind this. Would appreciate concrete comments on how this can be improved. 2. Let's please please please wait until we have switched the server over to the new java protocol definitions. If we add umpteen more ad hoc scala objects that is just generating more work for the conversion we know we have to do. A:
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Folks, Just want to elaborate a bit more on the create-topic metadata and batching describe-topic based on config / metadata in my previous email as we work on KAFKA-1694. The main motivation is to have some sort of topic management mechanism, which I think is quite important in a multi-tenant / cloud architecture: today anyone can create topics in a shared Kafka cluster, but there is no concept of ownership of topics that are created by different users. For example, at LinkedIn we basically distinguish topic owners via some casual topic name prefix, which is a bit awkward and does not fly as our customer base scales. It would be great to use describe-topics such as: Describe all topics that are created by me. Describe all topics whose retention time is overridden to X. Describe all topics whose writable group includes user Y (this is related to authorization), etc. One possible way to achieve this is to add a metadata field in the create-topic request, whose value will also be written to ZK as we create the topic; then describe-topics can choose to batch topics based on 1) name regex, 2) config K-V matching, 3) metadata regex, etc. Thoughts? Guozhang On Thu, Mar 5, 2015 at 4:37 PM, Guozhang Wang wangg...@gmail.com wrote: Thanks for the updated wiki. A few comments below: 1. Error description in response: I think if some errorCode could indicate several different error cases then we should really change it to multiple codes. In general the errorCode itself should be precise and sufficient for describing the server-side errors. 2. Describe topic request: it would be great to go beyond just batching on topic name regex for this request. For example, a very common use case of the topic command is to list all topics whose config A's value is B. With topic name regex we have to first retrieve __all__ topics' description info and then filter at the client end, which will be a huge burden on ZK. 3.
Config K-Vs in create topic: this is related to the previous point; maybe we can add another metadata K-V, or just a metadata string, alongside the config K-Vs in create topic like we did for the offset commit request. This field can be quite useful for storing information like the owner who issued the create command, etc., which is quite important in a multi-tenant setting. Then in the describe topic request we can also batch on a regex of the metadata field. 4. Today all the admin operations are async in the sense that the command returns once it is written to ZK, and that is why we need extra verification like testUtil.waitForTopicCreated() / verify partition reassignment request, etc. With admin requests we could add a flag to enable / disable synchronous requests; when it is turned on, the response will not return until the request has been completed. And for async requests we can add a token field in the response, and then we only need a general admin verification request with the given token to check if the async request has been completed. 5. +1 for extending the Metadata request to include controller / coordinator information, so we can remove the ConsumerMetadata / ClusterMetadata requests. Guozhang On Tue, Mar 3, 2015 at 10:23 AM, Joel Koshy jjkosh...@gmail.com wrote: Thanks for sending that out Joe - I don't think I will be able to make it today, so if notes can be sent out afterward that would be great. On Mon, Mar 02, 2015 at 09:16:13AM -0800, Gwen Shapira wrote: Thanks for sending this out Joe. Looking forward to chatting with everyone :) On Mon, Mar 2, 2015 at 6:46 AM, Joe Stein joe.st...@stealth.ly wrote: Hey, I just sent out a google hangout invite to all pmc, committers and everyone I found working on a KIP. If I missed anyone in the invite please let me know and I can update it, np. We should do this every Tuesday @ 2pm Eastern Time. Maybe we can get INFRA help to make a google account so we can manage better?
To discuss https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals in progress and related JIRA that are interdependent and common work. ~ Joe Stein On Tue, Feb 24, 2015 at 2:59 PM, Jay Kreps jay.kr...@gmail.com wrote: Let's stay on Google hangouts that will also record and make the sessions available on youtube. -Jay On Tue, Feb 24, 2015 at 11:49 AM, Jeff Holoman jholo...@cloudera.com wrote: Jay / Joe We're happy to send out a Webex for this purpose. We could record the sessions if there is interest and publish them out. Thanks Jeff On Tue, Feb 24, 2015 at 11:28 AM, Jay Kreps jay.kr...@gmail.com wrote: Let's try to get the technical hang-ups sorted out, though. I really think there is some benefit to live discussion vs writing. I am hopeful that if we post instructions and give ourselves a few attempts we can get it working. Tuesday at that time would work
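Guozhang's three batching criteria above (name regex, config K-V matching, metadata regex) can be sketched as a broker-side filter. The function name and the topic-entry shape below are purely illustrative, not part of any proposed Kafka API; the dict stands in for what a broker could read from ZK:

```python
import re

def filter_topics(topics, name_regex=None, config_match=None, metadata_regex=None):
    """Return (sorted) topic names whose entry matches every supplied criterion.

    `topics` maps topic name -> {"config": {...}, "metadata": "..."} --
    a hypothetical stand-in for per-topic state stored in ZooKeeper.
    """
    result = []
    for name, entry in topics.items():
        if name_regex and not re.search(name_regex, name):
            continue  # criterion 1: topic name regex
        if config_match and any(entry["config"].get(k) != v
                                for k, v in config_match.items()):
            continue  # criterion 2: every requested config K-V must match
        if metadata_regex and not re.search(metadata_regex, entry.get("metadata", "")):
            continue  # criterion 3: regex over the free-form metadata field
        result.append(name)
    return sorted(result)
```

Doing this server-side is exactly what avoids the "retrieve __all__ topics and filter at the client" burden Guozhang describes in point 2.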
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Guozhang and Tong, I really do like this idea and where your discussion will lead as it will be very useful for folks to have. I am really concerned though that we are scope creeping this KIP. Andrii is already working on following up on ~14 different items of feedback in regards to the core motivations/scope of the KIP. He has uploaded a new patch and the KIP based on those items and will be responding to this thread about that, and about what else still requires discussion, hopefully in the next few hours. I want to make sure we are focusing on the open items still requiring discussion and stabilizing what we have before trying to introduce more new features. Perhaps a new KIP can get added for the new features you are talking about, which can reference this one; once this is committed, that work can begin for folks that are able to contribute to it. ~ Joe Stein - - - - - - - - - - - - - - - - - http://www.stealth.ly - - - - - - - - - - - - - - - - - On Thu, Mar 12, 2015 at 9:51 AM, Tong Li liton...@us.ibm.com wrote: Guozhang, augmenting topic is fine, but as soon as we start doing that, other issues follow. For example, access control: who can access the topic, who can grant permissions, how the information (metadata) itself gets secured. Should the information be saved in ZK or a datastore? Will using a metadata file cause long-term problems such as file updates/synchronization? Once we have this metadata file, more people will want to put more stuff in it; how can we control the format? K-V pairs are not good for large data sets. Clearly there is a need for it; I wonder if we can make this thing pluggable and provide a default implementation, which allows us to try different solutions and also allows people to completely ignore it if they do not want to deal with any of these. Thanks.
Tong Li OpenStack Kafka Community Development Building 501/B205 liton...@us.ibm.com From: Guozhang Wang wangg...@gmail.com To: dev@kafka.apache.org Date: 03/12/2015 09:39 AM Subject: Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
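Tong's suggestion of a pluggable topic-metadata mechanism with a default implementation could look roughly like the sketch below. The interface name and methods are invented for illustration; a real default could write to ZooKeeper, while deployments that do not care about topic metadata could plug in a no-op store:

```python
from abc import ABC, abstractmethod

class TopicMetadataStore(ABC):
    """Hypothetical pluggable store for per-topic metadata (Tong's idea).

    Implementations decide where the metadata lives (ZK, a datastore, nowhere)
    and what format constraints to enforce.
    """

    @abstractmethod
    def put(self, topic, metadata):
        """Persist the metadata dict for a topic."""

    @abstractmethod
    def get(self, topic):
        """Return the metadata dict for a topic (empty if none)."""

class InMemoryStore(TopicMetadataStore):
    """Trivial default used here purely for illustration."""

    def __init__(self):
        self._data = {}

    def put(self, topic, metadata):
        self._data[topic] = dict(metadata)

    def get(self, topic):
        return self._data.get(topic, {})
```

The point of the interface is exactly what Tong asks for: people who want ownership/ACL-adjacent metadata can try different backends, and everyone else can ignore the feature entirely.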
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Guozhang, augmenting topic is fine, but as soon as we start doing that, other issues follow. For example, access control: who can access the topic, who can grant permissions, how the information (metadata) itself gets secured. Should the information be saved in ZK or a datastore? Will using a metadata file cause long-term problems such as file updates/synchronization? Once we have this metadata file, more people will want to put more stuff in it; how can we control the format? K-V pairs are not good for large data sets. Clearly there is a need for it; I wonder if we can make this thing pluggable and provide a default implementation, which allows us to try different solutions and also allows people to completely ignore it if they do not want to deal with any of these. Thanks. Tong Li OpenStack Kafka Community Development Building 501/B205 liton...@us.ibm.com From: Guozhang Wang wangg...@gmail.com To: dev@kafka.apache.org Date: 03/12/2015 09:39 AM Subject: Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
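The async-token scheme from point 4 of Guozhang's comments (an async admin response carries a token; a single generic verification request checks completion by token) can be sketched as follows. All names here are hypothetical, not proposed protocol fields:

```python
import itertools

class AdminRequestTracker:
    """Sketch of token-based verification for async admin requests.

    submit() models an async admin request returning immediately with a token;
    complete() models the controller finishing the work; is_complete() models
    the generic admin verification request.
    """

    def __init__(self):
        self._next_token = itertools.count(1)
        self._done = {}

    def submit(self):
        token = next(self._next_token)
        self._done[token] = False  # request accepted, not yet completed
        return token

    def complete(self, token):
        self._done[token] = True

    def is_complete(self, token):
        # Unknown tokens report not-complete; a real protocol would
        # presumably return a distinct error code instead.
        return self._done.get(token, False)
```

This is what would let clients replace polling helpers like testUtil.waitForTopicCreated() with a single verification round trip.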
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Andrii, A few more comments. 100. There are a few fields such as ReplicaAssignment, ReassignPartitionRequest, and PartitionsSerialized that are represented as a string, but contain composite structures in json. Could we flatten them out directly in the protocol definition as arrays/records? 101. Does TopicMetadataRequest v1 still trigger auto topic creation? This will be a bit weird now that we have a separate topic creation api. Have you thought about how the new createTopicRequest and TopicMetadataRequest v1 will be used in the producer/consumer client, in addition to admin tools? For example, ideally, we don't want TopicMetadataRequest from the consumer to trigger auto topic creation. 2. I think Jay meant getting rid of scala classes like HeartbeatRequestAndHeader and HeartbeatResponseAndHeader. We did that as a stop-gap thing when adding the new requests for the consumers. However, the long term plan is to get rid of all those and just reuse the java request/response in the client. Since this KIP proposes to add a significant number of new requests, perhaps we should bite the bullet and clean up the existing scala requests first before adding new ones? Thanks, Jun On Thu, Mar 12, 2015 at 3:37 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi, As said above - I list again all comments from this thread so we can see what's left and finalize all pending issues. Comments from Jay: 1. This is much needed functionality, but there is a lot here, so let's really think these protocols through. We really want to end up with a set of well thought-out, orthogonal apis. For this reason I think it is really important to think through the end state even if that includes APIs we won't implement in the first phase. A: Definitely behind this. Would appreciate concrete comments on how this can be improved. 2. Let's please please please wait until we have switched the server over to the new java protocol definitions. If we add umpteen more ad hoc scala objects that is just generating more work for the conversion we know we have to do. A: Fixed in the latest patch - removed scala protocol classes. 3. This proposal introduces a new type of optional parameter. This is inconsistent with everything else in the protocol, where we use -1 or some other marker value. You could argue either way but let's stick with that for consistency. For clients that implemented the protocol in a better way than our scala code, these basic primitives are hard to change. A: Fixed in the latest patch - removed the MaybeOf type and changed the protocol accordingly. 4. ClusterMetadata: This seems to duplicate TopicMetadataRequest, which has brokers, topics, and partitions. I think we should rename that request ClusterMetadataRequest (or just MetadataRequest) and include the id of the controller. Or are there other things we could add here? A: I agree. Updated the KIP. Let's extend TopicMetadata to version 2 and include the controller. 5. We have a tendency to try to make a lot of requests that can only go to particular nodes. This adds a lot of burden for client implementations (it sounds easy but each discovery can fail in many parts so it ends up being a full state machine to do right). I think we should consider making admin commands, and ideally as many of the other apis as possible, available on all brokers and just redirect to the controller on the broker side. Perhaps there would be a general way to encapsulate this re-routing behavior. A: It's a very interesting idea, but there seem to be some concerns about this feature (like performance considerations, how it would complicate the server, etc). I believe this shouldn't be a blocker. If this feature is implemented at some point it won't affect the Admin changes - at least no changes to the public API will be required. 6. We should probably normalize the key value pairs used for configs rather than embedding a new formatting. So two strings rather than one with an internal equals sign. A: Fixed in the latest patch - normalized configs and changed the protocol accordingly. 7. Is the postcondition of these APIs that the command has begun or that the command has been completed? It is a lot more usable if the command has been completed, so you know that if you create a topic and then publish to it you won't get an exception about there being no such topic. A: For long running requests (like reassign partitions) the postcondition is that the command has begun, so we don't block the client. In the case of your example - topic commands - this will be refactored and topic commands will be executed immediately, since the Controller will serve Admin requests (follow-up ticket KAFKA-1777). 8. Describe topic and list topics duplicate a lot of stuff in the metadata request. Is there a reason to give back topics marked for deletion? I feel like if we just make the post-condition of the delete command be that the topic is deleted that will
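Jun's point 100 (flattening fields like ReplicaAssignment that today embed json inside a protocol string) amounts to the transformation below. The wire format and record layout shown are illustrative only, not the actual KIP-4 schema:

```python
import json

def flatten_replica_assignment(wire_string):
    """Turn the embedded-json string form of a replica assignment into a flat
    array of records, as could be exposed directly in the protocol definition.

    Input (today's style):  '{"0": [1, 2], "1": [2, 3]}'
    Output (flattened):     one record per partition with an explicit replica list.
    """
    parsed = json.loads(wire_string)
    return [{"partition": int(p), "replicas": replicas}
            for p, replicas in sorted(parsed.items(), key=lambda kv: int(kv[0]))]
```

The flattened form lets clients decode the field with the ordinary protocol machinery instead of carrying a second, json parser alongside it.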
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Since we are for the first time defining a bunch of new request formats, I feel it is better to think through their possible common use cases and try to incorporate them Agreed, provided we are only talking about the fields and not the implementation of the functionality. I worry (only a little) about incorporating fields that are not used initially, but wholeheartedly believe doing so will outweigh the pre-optimization criticism because of the requirement to version the protocol (as you brought up). We can then use those fields later without actually implementing the functionality now. ~ Joe Stein - - - - - - - - - - - - - - - - - http://www.stealth.ly - - - - - - - - - - - - - - - - - On Thu, Mar 12, 2015 at 11:08 AM, Guozhang Wang wangg...@gmail.com wrote: The reason I want to bring this up sooner rather than later is that changing a defined request protocol in the future takes quite some effort: we need to bump up the version of the request, bump up the ZK path data version, and make sure the server can handle old versions as well as new ones, both from clients and from ZK, etc. Since we are for the first time defining a bunch of new request formats, I feel it is better to think through their possible common use cases and try to incorporate them, but I am also fine with creating another KIP if most people feel it drags this one out too long. Guozhang On Thu, Mar 12, 2015 at 7:34 AM, Joe Stein joe.st...@stealth.ly wrote: Guozhang and Tong, I really do like this idea and where your discussion will lead as it will be very useful for folks to have. I am really concerned though that we are scope creeping this KIP. Andrii is already working on following up on ~14 different items of feedback in regards to the core motivations/scope of the KIP. He has uploaded a new patch and the KIP based on those items and will be responding to this thread about that, and about what else still requires discussion, hopefully in the next few hours.
I want to make sure we are focusing on the open items still requiring discussion and stabilizing what we have before trying to introduce more new features. Perhaps a new KIP can get added for the new features you are talking about, which can reference this one; once this is committed, that work can begin for folks that are able to contribute to it. ~ Joe Stein - - - - - - - - - - - - - - - - - http://www.stealth.ly - - - - - - - - - - - - - - - - -
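Guozhang's point about the cost of version bumps comes down to the server having to keep decoding every old version indefinitely. A minimal sketch, with an invented field layout (v1 dropping the ISR field, in the spirit of the TopicMetadata evolution discussed elsewhere in the thread):

```python
def decode_topic_metadata(version, fields):
    """Illustrative multi-version decoder: once a version ships, the server
    must keep this branch forever, for requests from old clients and for old
    data read back from ZK. Field layout here is invented for the sketch.
    """
    if version == 0:
        name, partitions, isr = fields
        return {"name": name, "partitions": partitions, "isr": isr}
    if version == 1:
        name, partitions = fields  # v1 drops the ISR field
        return {"name": name, "partitions": partitions}
    raise ValueError("unsupported TopicMetadata version %d" % version)
```

Every new version adds a branch like this (and a mirror-image encoder), which is why thinking through common use cases before freezing the format is cheaper than evolving it later.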
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Yeah I want to second this. For the protocol we should really start by writing out the end state we want. Then we can figure out how to get there in small, reasonable steps to avoid boiling the ocean in implementation. -Jay
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
The reason I want to bring this up sooner rather than later is that changing a defined request protocol in the future takes quite some effort: we need to bump up the version of the request, bump up the ZK path data version, and make sure the server can handle old versions as well as new ones, both from clients and from ZK, etc. Since we are for the first time defining a bunch of new request formats, I feel it is better to think through their possible common use cases and try to incorporate them, but I am also fine with creating another KIP if most people feel it drags this one out too long. Guozhang
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Found KIP-11 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface). It actually specifies changes to the Metadata protocol, so making sure both KIPs are consistent in this regard will be good.
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Specifically for ownership, I think the plan is to add ACL (it sounds like you are describing ACL) via an external system (Argus, Sentry). I remember KIP-11 described this, but I can't find the KIP any longer. Regardless, I think KIP-4 focuses on getting information that already exists from Kafka brokers, not on adding information that perhaps should exist but doesn't yet? Gwen
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hi, As said above - I list again all comments from this thread so we can see what's left and finalize all pending issues.

Comments from Jay:

1. This is much needed functionality, but there are a lot of these, so let's really think these protocols through. We really want to end up with a set of well thought-out, orthogonal apis. For this reason I think it is really important to think through the end state even if that includes APIs we won't implement in the first phase.
A: Definitely behind this. Would appreciate concrete comments on how this can be improved.

2. Let's please please please wait until we have switched the server over to the new java protocol definitions. If we add umpteen more ad hoc scala objects that is just generating more work for the conversion we know we have to do.
A: Fixed in the latest patch - removed scala protocol classes.

3. This proposal introduces a new type of optional parameter. This is inconsistent with everything else in the protocol, where we use -1 or some other marker value. You could argue either way but let's stick with that for consistency. For clients that implemented the protocol in a better way than our scala code these basic primitives are hard to change.
A: Fixed in the latest patch - removed the MaybeOf type and changed the protocol accordingly.

4. ClusterMetadata: This seems to duplicate TopicMetadataRequest which has brokers, topics, and partitions. I think we should rename that request ClusterMetadataRequest (or just MetadataRequest) and include the id of the controller. Or are there other things we could add here?
A: I agree. Updated the KIP. Let's extend TopicMetadata to version 2 and include the controller.

5. We have a tendency to try to make a lot of requests that can only go to particular nodes. This adds a lot of burden for client implementations (it sounds easy but each discovery can fail in many parts so it ends up being a full state machine to do right). I think we should consider making admin commands, and ideally as many of the other apis as possible, available on all brokers and just redirect to the controller on the broker side. Perhaps there would be a general way to encapsulate this re-routing behavior.
A: It's a very interesting idea, but it seems there are some concerns about this feature (performance considerations, how it will complicate the server, etc.). I believe this shouldn't be a blocker. If this feature is implemented at some point it won't affect the Admin changes - at least no changes to the public API will be required.

6. We should probably normalize the key-value pairs used for configs rather than embedding a new formatting. So two strings rather than one with an internal equals sign.
A: Fixed in the latest patch - normalized configs and changed the protocol accordingly.

7. Is the postcondition of these APIs that the command has begun or that the command has been completed? It is a lot more usable if the command has been completed, so you know that if you create a topic and then publish to it you won't get an exception about there being no such topic.
A: For long running requests (like reassign partitions) the postcondition is that the command has begun - so we don't block the client. In the case of your example - topic commands - this will be refactored and topic commands will be executed immediately, since the Controller will serve Admin requests (follow-up ticket KAFKA-1777).

8. Describe topic and list topics duplicate a lot of stuff in the metadata request. Is there a reason to give back topics marked for deletion? I feel like if we just make the post-condition of the delete command be that the topic is deleted, that will get rid of the need for this, right? And it will be much more intuitive.
A: Fixed in the latest patch - removed topics marked for deletion in ListTopicsRequest.

9. Should we consider batching these requests? We have generally tried to allow multiple operations to be batched. My suspicion is that without this we will get a lot of code that does something like for(topic: adminClient.listTopics()) adminClient.describeTopic(topic) - this code will work great when you test on 5 topics but not do as well if you have 50k.
A: Updated the KIP - please check the Topic Admin Schema section.

10. I think we should also discuss how we want to expose a programmatic JVM client api for these operations. Currently people rely on AdminUtils, which is totally sketchy. I think we probably need another client under clients/ that exposes administrative functionality. We will need this just to properly test the new apis, I suspect. We should figure out that API.
A: Updated the KIP - please check the Admin Client section with an initial API proposal.

11. The other information that would be really useful to get would be information about partitions - how much data is in the partition, what are the segment offsets, what is the log-end offset (i.e. last offset), what is the compaction point, etc. I think that done right this would be the successor to
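Jay's point 9 above can be made concrete with a toy model. AdminClientSketch and its methods are illustrative stand-ins (not the KIP-4 API), but they show why a batched describe matters once topic counts grow: the unbatched loop costs one round trip per topic, the batched call costs one total.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy admin client that counts round trips to the broker.
class AdminClientSketch {
    private final Map<String, Integer> partitionCounts = new HashMap<>();
    int requestsSent = 0;  // each list/describe call is one round trip

    void addTopic(String name, int partitions) { partitionCounts.put(name, partitions); }

    List<String> listTopics() { requestsSent++; return List.copyOf(partitionCounts.keySet()); }

    // Unbatched: fine for 5 topics in a test, painful for 50k in production.
    int describeTopic(String name) { requestsSent++; return partitionCounts.get(name); }

    // Batched: one request no matter how many topics are asked about.
    Map<String, Integer> describeTopics(List<String> names) {
        requestsSent++;
        Map<String, Integer> out = new HashMap<>();
        for (String n : names) out.put(n, partitionCounts.get(n));
        return out;
    }
}
```

With N topics the loop in point 9 issues N+1 requests, while listTopics() followed by one describeTopics() always issues 2.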
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Gwen, My main motivation is not to authenticate via ownership, but rather to query topics via ownership, and more generally to query topics via patterns, where a pattern could be a config value, a metadata K-V pair, etc. Does it make sense? Guozhang
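Guozhang's goal of querying topics by pattern rather than only by name could be served by a server-side K-V filter along these lines. This is purely illustrative - the class and method names are not real Kafka APIs - but it shows the shape of "describe all topics whose config A's value is B" evaluated on the broker instead of the client.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical server-side filter over per-topic config/metadata K-V pairs.
class TopicFilterSketch {
    // Return (sorted) names of topics whose K-V map contains the given pair.
    static List<String> topicsMatching(Map<String, Map<String, String>> topicKVs,
                                       String key, String value) {
        return topicKVs.entrySet().stream()
                .filter(e -> value.equals(e.getValue().get(key)))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

The same predicate could apply equally to configs or to the proposed metadata field, which is what makes ownership queries possible without a name-prefix convention.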
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hi all, Today I uploaded the patch that covers some of the discussed and agreed items:
- removed the MaybeOf optional type
- switched to java protocol definitions
- simplified messages (normalized configs, removed "topic marked for deletion")
I also updated KIP-4 with the respective changes and wrote down my proposal for the pending items:
- Batch Admin Operations - updated the Wire Protocol schema proposal
- Remove ClusterMetadata - changed to extend TopicMetadataRequest
- Admin Client - updated my initial proposal to reflect batching
- Error codes - proposed fine-grained error codes instead of AdminRequestFailed
I will also send a separate email to cover all comments from this thread. Thanks, Andrii Biletskyi
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Thanks for the updated wiki. A few comments below:

1. Error description in response: I think if some errorCode could indicate several different error cases then we should really change it to multiple codes. In general the errorCode itself should be precise and sufficient for describing the server-side errors.

2. Describe topic request: it would be great to go beyond just batching on topic name regex for this request. For example, a very common use case of the topic command is to list all topics whose config A's value is B. With topic name regex we would have to first retrieve __all__ topics' description info and then filter at the client end, which would be a huge burden on ZK.

3. Config K-Vs in create topic: this is related to the previous point; maybe we can add another metadata K-V, or just a metadata string alongside the config K-Vs in create topic, like we did for the offset commit request. This field can be quite useful for storing information like the owner of the topic who issued the create command, etc., which is quite important in a multi-tenant setting. Then in the describe topic request we can also batch on a regex of the metadata field.

4. Today all the admin operations are async in the sense that the command returns once it is written to ZK, and that is why we need extra verification like testUtil.waitForTopicCreated() / verify partition reassignment request, etc. With admin requests we could add a flag to enable / disable synchronous requests; when it is turned on, the response will not return until the request has been completed. And for async requests we can add a token field in the response, and then only need a general admin verification request with the given token to check if the async request has been completed.

5. +1 for extending the Metadata request to include controller / coordinator information, so that we can remove the ConsumerMetadata / ClusterMetadata requests.
Guozhang
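Point 4 of Guozhang's comments - async admin requests that return a token, later checked via a general verification request - can be modeled with a toy class like the following. The API is hypothetical; it only illustrates the submit/verify flow being proposed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of token-based verification for async admin requests.
class AsyncAdminSketch {
    private final Map<Long, Boolean> pending = new HashMap<>();
    private long nextToken = 0;

    // Async path: begin the operation (e.g. write the intent to ZK) and
    // immediately hand back a token instead of blocking the client.
    long submit() {
        long token = nextToken++;
        pending.put(token, false);
        return token;
    }

    // Invoked on the broker side once the controller has applied the change.
    void markComplete(long token) { pending.put(token, true); }

    // The general admin verification request: is this async request done yet?
    boolean isComplete(long token) { return pending.getOrDefault(token, false); }
}
```

A synchronous request in this model is just submit() followed by the server holding the response until markComplete() fires, which is why a single flag can switch between the two modes.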
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
It would also be good to think through how we can use these new admin requests in the producer and consumer clients as well. Currently, both the producer and the consumer use TopicMetadataRequest to obtain metadata, which triggers topic creation if auto topic creation is enabled. This is a bit odd for the consumer, since a reader shouldn't be creating new topics. With the new admin requests, we can potentially decouple topic creation from obtaining metadata: the consumer can issue metadata requests without triggering topic creation, and the producer can fetch the metadata first and then issue a create-topic request if the topic doesn't exist. We will have to think through how this interacts with the auto topic creation logic, though. Thanks, Jun
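Jun's decoupling idea above could look roughly like the following on the producer side. This is only an illustrative sketch against a mocked cluster; the class and method names (`MockCluster`, `fetch_metadata`, `create_topic`) are hypothetical placeholders, not an actual API from the KIP.

```python
# Sketch: decouple topic creation from metadata fetches, as suggested above.
# All names here are hypothetical illustrations, not real Kafka client APIs.

class UnknownTopicError(Exception):
    """Raised when a metadata request finds no such topic."""

class MockCluster:
    """Stand-in for a broker: holds a set of existing topics."""
    def __init__(self, topics):
        self.topics = set(topics)

    def fetch_metadata(self, topic):
        # A pure read: never auto-creates the topic.
        if topic not in self.topics:
            raise UnknownTopicError(topic)
        return {"topic": topic, "partitions": 1}

    def create_topic(self, topic):
        # Explicit admin-style create request.
        self.topics.add(topic)

def producer_get_or_create(cluster, topic):
    """Producer path: fetch metadata first, create only if missing."""
    try:
        return cluster.fetch_metadata(topic)
    except UnknownTopicError:
        cluster.create_topic(topic)
        return cluster.fetch_metadata(topic)

def consumer_get(cluster, topic):
    """Consumer path: metadata only; a reader never creates topics."""
    return cluster.fetch_metadata(topic)
```

With this split, a missing topic surfaces to the consumer as an error rather than an implicit creation, while the producer retains create-on-demand behavior under its own control.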
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Thanks for sending that out Joe - I don't think I will be able to make it today, so if notes can be sent out afterward that would be great.
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hey, I just sent out a google hangout invite to all pmc, committers and everyone I found working on a KIP. If I missed anyone in the invite please let me know and I can update it, np. We should do this every Tuesday @ 2pm Eastern Time. Maybe we can get INFRA help to make a google account so we can manage better? To discuss https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals in progress and related JIRA that are interdependent and common work. ~ Joe Stein
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Thanks for sending this out Joe. Looking forward to chatting with everyone :)
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Jay / Joe, We're happy to send out a Webex for this purpose. We could record the sessions if there is interest and publish them out. Thanks, Jeff
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Let's stay on Google hangouts that will also record and make the sessions available on youtube. -Jay
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hi all, I've updated the KIP page and fixed / aligned the document structure. I also added a very initial proposal for AdminClient so we have something to start from while discussing the KIP. https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations Thanks, Andrii Biletskyi

On Wed, Feb 18, 2015 at 9:01 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jay, Re error messages: you are right, in most cases the client will have enough context to show a descriptive error message. My concern is that we will have to add lots of new error codes for each possible error. Of course, we could reuse some existing ones like UnknownTopicOrPartitionCode, but we will also need to add something like TopicAlreadyExistsCode, TopicConfigInvalid (both for topic name and config, and the user would probably like to know what exactly is wrong in the config), InvalidReplicaAssignment, InternalError (e.g. zookeeper failure), etc. And this is only for TopicCommand; we will also need similar codes for ReassignPartitions and PreferredReplica. So we'd end up with a large list of error codes used only in the Admin protocol. Having said that, I agree my proposal is not consistent with other cases. Maybe we can find a better solution or something in between. Re Hangout chat: I think it is a great idea. This way we can move faster. Let's agree on a date/time so people can join. This and next week works for me almost anytime, if agreed in advance. Thanks, Andrii

On Wed, Feb 18, 2015 at 7:09 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Andrii, Generally we can do good error handling without needing custom server-side messages. I.e. the client usually has the context to know, if it got an error that the topic doesn't exist, to say "Topic X doesn't exist" rather than "error code 14" (or whatever). Maybe there are specific cases where this is hard? If we want to add server-side error messages we really do need to do this in a consistent way across the protocol. I still have a bunch of open questions here from my previous list, but I will be out for the next few days for Strata. Maybe we could do a Google Hangout chat on any open issues some time towards the end of next week for anyone interested in this ticket? I have a feeling that might progress things a little faster than email--I think we could talk through those issues I brought up fairly quickly... -Jay

On Wed, Feb 18, 2015 at 7:27 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi all, I'm trying to address some of the issues which were mentioned earlier about the Admin RQ/RP format. One of those was about batching operations. What if we follow the TopicCommand approach and let people specify the topic name by regexp - would that cover most of the use cases? The second is what information we should generally provide in Admin responses. I realize Admin commands don't imply they will be used only in a CLI, but it seems to me the CLI is a very important client of this feature. In that case it seems logical to provide users with a rich experience in terms of getting results / errors of the executed commands. Usually we supply only an errorCode in responses, which is very limiting; for a CLI we may want to print a human-readable error description.

So, taking into account the previous item about batching, what do you think about something like the following ('create' doesn't support regexp):

  CreateTopicRequest  => TopicName Partitions Replicas ReplicaAssignment [Config]
  CreateTopicResponse => ErrorCode ErrorDescription
    ErrorCode        => int16
    ErrorDescription => string (empty if successful)

  AlterTopicRequest  => TopicNameRegexp Partitions ReplicaAssignment [AddedConfig] [DeletedConfig]
  AlterTopicResponse => [TopicName ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription
    CommandErrorCode        => int16
    CommandErrorDescription => string (non-empty in case of a fatal error, e.g. we couldn't get topics by regexp)

  DescribeTopicRequest  => TopicNameRegexp
  DescribeTopicResponse => [TopicName TopicDescription ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription

Also, any thoughts about our discussion regarding the re-routing facility? In my understanding, it is a choice between augmenting TopicMetadataRequest (to include at least controllerId) and implementing a new generic re-routing facility so that sending messages to the controller will be handled by it. Thanks, Andrii Biletskyi

On Mon, Feb 16, 2015 at 5:26 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: @Guozhang: Thanks for your comments, I've answered some of those. The main thing is having a merged request for create-alter-delete-describe - I have some concerns about this approach. @Jay: I see that the introduced ClusterMetadataRequest is also one of the concerns. We can solve it if we implement
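The proposed request/response shapes above could be modeled roughly as the sketch below. The field names mirror the email's field lists; everything else (the dataclass representation, the `succeeded` helper, the default values) is an assumption for illustration, not the KIP's actual wire format.

```python
# Sketch of the request/response shapes proposed above, using dataclasses.
# Field names follow the email; the structure itself is only illustrative.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CreateTopicRequest:
    topic_name: str
    partitions: int
    replicas: int
    replica_assignment: Optional[dict] = None  # partition -> [broker ids]
    config: dict = field(default_factory=dict)

@dataclass
class CreateTopicResponse:
    error_code: int              # int16 on the wire
    error_description: str = ""  # empty if successful

    @property
    def succeeded(self):
        return self.error_code == 0

@dataclass
class TopicError:
    # Per-topic result entry for batched (regexp-based) commands.
    topic_name: str
    error_code: int
    error_description: str

@dataclass
class AlterTopicResponse:
    per_topic: list                      # [TopicError], one per matched topic
    command_error_code: int = 0          # int16 on the wire
    command_error_description: str = ""  # non-empty only on a fatal error
```

The two-level error reporting (a per-topic code/description plus a command-level one) is what lets a batched AlterTopic tell apart "topic t1 had an invalid config" from "the regexp itself couldn't be resolved".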
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Weekly would be great, maybe every Tuesday ~ 1pm ET / 10am PT. I don't mind google hangout, but there is always some issue or whatever, so we know the apache irc channel works. We can start there and see how it goes? We can pull transcripts too and associate them to tickets if need be, which makes it helpful for things. ~ Joe Stein

On Tue, Feb 24, 2015 at 11:10 AM, Jay Kreps jay.kr...@gmail.com wrote: We'd talked about doing a Google Hangout to chat about this. What about generalizing that a little further...I actually think it would be good for everyone spending a reasonable chunk of their week on Kafka stuff to maybe sync up once a week. I think we could use the time to talk through design stuff, make sure we are on top of code reviews, talk through any tricky issues, etc. We can make it publicly available so that anyone who likes can follow along. Any interest in doing this? If so I'll try to set it up starting next week. -Jay
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Let's try to get the technical hang-ups sorted out, though. I really think there is some benefit to live discussion vs writing. I am hopeful that if we post instructions and give ourselves a few attempts we can get it working. Tuesday at that time would work for me...any objections? -Jay On Tue, Feb 24, 2015 at 8:18 AM, Joe Stein joe.st...@stealth.ly wrote: Weekly would be great maybe like every Tuesday ~ 1pm ET / 10am PT I don't mind google hangout but there is always some issue or whatever so we know the apache irc channel works. We can start there and see how it goes? We can pull transcripts too and associate to tickets if need be makes it helpful for things. ~ Joestein On Tue, Feb 24, 2015 at 11:10 AM, Jay Kreps jay.kr...@gmail.com wrote: We'd talked about doing a Google Hangout to chat about this. What about generalizing that a little further...I actually think it would be good for everyone spending a reasonable chunk of their week on Kafka stuff to maybe sync up once a week. I think we could use time to talk through design stuff, make sure we are on top of code reviews, talk through any tricky issues, etc. We can make it publicly available so that any one can follow along who likes. Any interest in doing this? If so I'll try to set it up starting next week. -Jay On Tue, Feb 24, 2015 at 3:57 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi all, I've updated KIP page, fixed / aligned document structure. Also I added some very initial proposal for AdminClient so we have something to start from while discussing the KIP. https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations Thanks, Andrii Biletskyi On Wed, Feb 18, 2015 at 9:01 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jay, Re error messages: you are right, in most cases client will have enough context to show descriptive error message. My concern is that we will have to add lots of new error codes for each possible error. 
Of course, we could reuse some of the existing ones like UnknownTopicOrPartitionCode, but we will also need to add something like TopicAlreadyExistsCode, TopicConfigInvalid (both for topic name and config, and the user would probably like to know what exactly is wrong in their config), InvalidReplicaAssignment, InternalError (e.g. ZooKeeper failure), etc. And this is only for TopicCommand; we will also need to add similar codes for ReassignPartitions and PreferredReplica. So we'll end up with a large list of error codes used only in the Admin protocol. Having said that, I agree my proposal is not consistent with other cases. Maybe we can find a better solution or something in between. Re Hangout chat: I think it is a great idea. This way we can move on faster. Let's agree somehow on a date/time so people can join. Almost any time this or next week will work for me if agreed in advance. Thanks, Andrii On Wed, Feb 18, 2015 at 7:09 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Andrii, Generally we can do good error handling without needing custom server-side messages. I.e. generally the client has the context to know, if it got an error that the topic doesn't exist, to say "Topic X doesn't exist" rather than "error code 14" (or whatever). Maybe there are specific cases where this is hard? If we want to add server-side error messages we really do need to do this in a consistent way across the protocol. I still have a bunch of open questions here from my previous list. I will be out for the next few days for Strata though. Maybe we could do a Google Hangout chat on any open issues some time towards the end of next week for anyone interested in this ticket? I have a feeling that might progress things a little faster than email--I think we could talk through those issues I brought up fairly quickly... -Jay On Wed, Feb 18, 2015 at 7:27 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi all, I'm trying to address some of the issues which were mentioned earlier about the Admin RQ/RP format.
One of those was about batching operations. What if we follow the TopicCommand approach and let people specify the topic name by regexp - would that cover most of the use cases? Secondly, what information should we generally provide in Admin responses? I realize that Admin commands don't imply they will be used only in the CLI but, it seems to me, the CLI is a very important client of this feature. In this case, it seems logical that we would like to provide users with a rich experience in terms of getting results / errors of the executed commands.
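Andrii's worry above is the sheer number of admin-only error codes the protocol would accumulate. As a rough illustration only (every name and numeric value below is invented for this sketch, not taken from the KIP or the Kafka protocol), the codes he lists would look something like:

```java
// Hypothetical enum of the admin-only error codes discussed above.
// Numeric values are illustrative assumptions, not protocol constants.
enum AdminErrorCode {
    NONE((short) 0),
    UNKNOWN_TOPIC_OR_PARTITION((short) 3),   // reused from the existing protocol
    TOPIC_ALREADY_EXISTS((short) 36),        // assumed value
    INVALID_REPLICA_ASSIGNMENT((short) 39),  // assumed value
    INVALID_TOPIC_CONFIG((short) 40),        // assumed value
    INTERNAL_ERROR((short) 41);              // e.g. a ZooKeeper failure

    private final short code;

    AdminErrorCode(short code) { this.code = code; }

    short code() { return code; }

    // Map a wire value back to an enum constant.
    static AdminErrorCode forCode(short code) {
        for (AdminErrorCode e : values())
            if (e.code == code) return e;
        throw new IllegalArgumentException("Unknown error code: " + code);
    }
}
```

The point of the sketch is the maintenance burden: each new admin command would grow this list, which is exactly the proliferation being debated.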
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
We'd talked about doing a Google Hangout to chat about this. What about generalizing that a little further...I actually think it would be good for everyone spending a reasonable chunk of their week on Kafka stuff to maybe sync up once a week. I think we could use the time to talk through design stuff, make sure we are on top of code reviews, talk through any tricky issues, etc. We can make it publicly available so that anyone who likes can follow along. Any interest in doing this? If so I'll try to set it up starting next week. -Jay On Tue, Feb 24, 2015 at 3:57 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi all, I've updated the KIP page and fixed/aligned the document structure. Also I added a very initial proposal for AdminClient so we have something to start from while discussing the KIP. https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations Thanks, Andrii Biletskyi On Wed, Feb 18, 2015 at 9:01 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jay, Re error messages: you are right, in most cases the client will have enough context to show a descriptive error message. My concern is that we will have to add lots of new error codes for each possible error. Of course, we could reuse some of the existing ones like UnknownTopicOrPartitionCode, but we will also need to add something like TopicAlreadyExistsCode, TopicConfigInvalid (both for topic name and config, and the user would probably like to know what exactly is wrong in their config), InvalidReplicaAssignment, InternalError (e.g. ZooKeeper failure), etc. And this is only for TopicCommand; we will also need to add similar codes for ReassignPartitions and PreferredReplica. So we'll end up with a large list of error codes used only in the Admin protocol. Having said that, I agree my proposal is not consistent with other cases. Maybe we can find a better solution or something in between. Re Hangout chat: I think it is a great idea. This way we can move on faster.
Let's agree somehow on a date/time so people can join. Almost any time this or next week will work for me if agreed in advance. Thanks, Andrii On Wed, Feb 18, 2015 at 7:09 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Andrii, Generally we can do good error handling without needing custom server-side messages. I.e. generally the client has the context to know, if it got an error that the topic doesn't exist, to say "Topic X doesn't exist" rather than "error code 14" (or whatever). Maybe there are specific cases where this is hard? If we want to add server-side error messages we really do need to do this in a consistent way across the protocol. I still have a bunch of open questions here from my previous list. I will be out for the next few days for Strata though. Maybe we could do a Google Hangout chat on any open issues some time towards the end of next week for anyone interested in this ticket? I have a feeling that might progress things a little faster than email--I think we could talk through those issues I brought up fairly quickly... -Jay On Wed, Feb 18, 2015 at 7:27 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi all, I'm trying to address some of the issues which were mentioned earlier about the Admin RQ/RP format. One of those was about batching operations. What if we follow the TopicCommand approach and let people specify the topic name by regexp - would that cover most of the use cases? Secondly, what information should we generally provide in Admin responses? I realize that Admin commands don't imply they will be used only in the CLI but, it seems to me, the CLI is a very important client of this feature. In this case, it seems logical that we would like to provide users with a rich experience in terms of getting results / errors of the executed commands. Usually we supply only an errorCode with responses, which looks very limiting; in the case of the CLI we may want to print a human-readable error description.
So, taking into account the previous item about batching, what do you think about having something like this ('create' doesn't support regexp)?

CreateTopicRequest = TopicName Partitions Replicas ReplicaAssignment [Config]
CreateTopicResponse = ErrorCode ErrorDescription
ErrorCode = int16
ErrorDescription = string (empty if successful)

AlterTopicRequest = TopicNameRegexp Partitions ReplicaAssignment [AddedConfig] [DeletedConfig]
AlterTopicResponse = [TopicName ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription
CommandErrorCode = int16
CommandErrorDescription = string (non-empty in case of a fatal error, e.g. we couldn't get topics by regexp)

DescribeTopicRequest = TopicNameRegexp
DescribeTopicResponse = [TopicName TopicDescription ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription
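To make the proposed grammar concrete, here is a rough Java mirror of the request/response shapes. Field names follow the sketch above, but everything else (the use of Java records, the collection types) is illustrative only; the real protocol would be a versioned binary format, not Java objects.

```java
import java.util.List;
import java.util.Map;

// Illustrative Java mirror of the proposed RQ/RP grammar.
// These types only show the field layout, not the wire encoding.
record CreateTopicRequest(String topicName,
                          int partitions,
                          int replicas,
                          Map<Integer, List<Integer>> replicaAssignment, // partition -> replica ids
                          Map<String, String> config) { }

// ErrorCode = int16, ErrorDescription = string (empty if successful)
record CreateTopicResponse(short errorCode, String errorDescription) { }

// Per-topic result entry for the batched (regexp) commands.
record TopicError(String topicName, short errorCode, String errorDescription) { }

// Per-topic results plus a command-level error
// (e.g. the regexp itself could not be resolved to topics).
record AlterTopicResponse(List<TopicError> results,
                          short commandErrorCode,
                          String commandErrorDescription) { }
```

Note how the batched commands carry two error levels: a per-topic code for partial failures and a command-level code for fatal ones, which is exactly the CommandErrorCode/ErrorCode split in the grammar.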
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hey Andrii, Generally we can do good error handling without needing custom server-side messages. I.e. generally the client has the context to know, if it got an error that the topic doesn't exist, to say "Topic X doesn't exist" rather than "error code 14" (or whatever). Maybe there are specific cases where this is hard? If we want to add server-side error messages we really do need to do this in a consistent way across the protocol. I still have a bunch of open questions here from my previous list. I will be out for the next few days for Strata though. Maybe we could do a Google Hangout chat on any open issues some time towards the end of next week for anyone interested in this ticket? I have a feeling that might progress things a little faster than email--I think we could talk through those issues I brought up fairly quickly... -Jay On Wed, Feb 18, 2015 at 7:27 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi all, I'm trying to address some of the issues which were mentioned earlier about the Admin RQ/RP format. One of those was about batching operations. What if we follow the TopicCommand approach and let people specify the topic name by regexp - would that cover most of the use cases? Secondly, what information should we generally provide in Admin responses? I realize that Admin commands don't imply they will be used only in the CLI but, it seems to me, the CLI is a very important client of this feature. In this case, it seems logical that we would like to provide users with a rich experience in terms of getting results / errors of the executed commands. Usually we supply only an errorCode with responses, which looks very limiting; in the case of the CLI we may want to print a human-readable error description.
So, taking into account the previous item about batching, what do you think about having something like this ('create' doesn't support regexp)?

CreateTopicRequest = TopicName Partitions Replicas ReplicaAssignment [Config]
CreateTopicResponse = ErrorCode ErrorDescription
ErrorCode = int16
ErrorDescription = string (empty if successful)

AlterTopicRequest = TopicNameRegexp Partitions ReplicaAssignment [AddedConfig] [DeletedConfig]
AlterTopicResponse = [TopicName ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription
CommandErrorCode = int16
CommandErrorDescription = string (non-empty in case of a fatal error, e.g. we couldn't get topics by regexp)

DescribeTopicRequest = TopicNameRegexp
DescribeTopicResponse = [TopicName TopicDescription ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription

Also, any thoughts about our discussion regarding the re-routing facility? In my understanding, it is a choice between augmenting TopicMetadataRequest (to include at least controllerId) and implementing a new generic re-routing facility so that sending messages to the controller will be handled by it. Thanks, Andrii Biletskyi On Mon, Feb 16, 2015 at 5:26 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: @Guozhang: Thanks for your comments, I've answered some of those. The main thing is having a merged request for create-alter-delete-describe - I have some concerns about this approach. @Jay: I see that the introduced ClusterMetadataRequest is also one of the concerns. We can solve it if we implement the re-routing facility. But I agree with Guozhang - it will make clients' internals a little bit easier, but this seems to be complex logic to implement and support. Especially for Fetch and Produce (even if we add re-routing later for these requests). Also people will tend to avoid this re-routing facility and hold a local cluster cache to ensure their high-priority requests (which some of the admin requests are) are not sent to some busy broker where they would wait to be routed to the correct one.
As pointed out by Jun here ( https://issues.apache.org/jira/browse/KAFKA-1772?focusedCommentId=14234530&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14234530 ) to solve the issue we might introduce a message type to get cluster state. But I agree we can just update TopicMetadataResponse to include controllerId (and probably something else). What are your thoughts? Thanks, Andrii On Thu, Feb 12, 2015 at 8:31 AM, Guozhang Wang wangg...@gmail.com wrote: I think for the topics commands we can actually merge create/alter/delete/describe as one request type since their formats are very much similar, and keep list-topics and others like partition-reassignment / preferred-leader-election as separate request types. I also left some other comments on the RB ( https://reviews.apache.org/r/29301/). On Wed, Feb 11, 2015 at 2:04 PM, Jay Kreps jay.kr...@gmail.com wrote: Yeah I totally agree that we don't want to just have one "do admin stuff" command that has the union of all parameters. What I am saying is that command line tools are one client of the administrative apis, but these will be used in
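The trade-off being debated here (expose controllerId in metadata vs. a broker-side re-routing facility) comes down to where the redirect happens. A minimal sketch of the client-side option, with entirely hypothetical type and method names, might look like this:

```java
import java.util.Map;

// Hypothetical shape of extended cluster metadata: the controller's broker id
// plus an address per broker id. Not a real Kafka type.
record ClusterMetadata(int controllerId, Map<Integer, String> brokerAddresses) { }

class AdminClientSketch {
    // Client-side redirect: fetch metadata from any broker, then pick the
    // controller's address and send the admin request there.
    static String controllerAddress(ClusterMetadata metadata) {
        String address = metadata.brokerAddresses().get(metadata.controllerId());
        if (address == null)
            throw new IllegalStateException("Controller " + metadata.controllerId()
                    + " not in broker list; metadata may be stale, refetch");
        return address;
    }
}
```

The alternative discussed (server-side re-routing) moves this lookup into the broker, which is what raises the between-broker traffic and hot-spot concerns later in the thread.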
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Jay, Re error messages: you are right, in most cases the client will have enough context to show a descriptive error message. My concern is that we will have to add lots of new error codes for each possible error. Of course, we could reuse some of the existing ones like UnknownTopicOrPartitionCode, but we will also need to add something like TopicAlreadyExistsCode, TopicConfigInvalid (both for topic name and config, and the user would probably like to know what exactly is wrong in their config), InvalidReplicaAssignment, InternalError (e.g. ZooKeeper failure), etc. And this is only for TopicCommand; we will also need to add similar codes for ReassignPartitions and PreferredReplica. So we'll end up with a large list of error codes used only in the Admin protocol. Having said that, I agree my proposal is not consistent with other cases. Maybe we can find a better solution or something in between. Re Hangout chat: I think it is a great idea. This way we can move on faster. Let's agree somehow on a date/time so people can join. Almost any time this or next week will work for me if agreed in advance. Thanks, Andrii On Wed, Feb 18, 2015 at 7:09 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Andrii, Generally we can do good error handling without needing custom server-side messages. I.e. generally the client has the context to know, if it got an error that the topic doesn't exist, to say "Topic X doesn't exist" rather than "error code 14" (or whatever). Maybe there are specific cases where this is hard? If we want to add server-side error messages we really do need to do this in a consistent way across the protocol. I still have a bunch of open questions here from my previous list. I will be out for the next few days for Strata though. Maybe we could do a Google Hangout chat on any open issues some time towards the end of next week for anyone interested in this ticket?
I have a feeling that might progress things a little faster than email--I think we could talk through those issues I brought up fairly quickly... -Jay On Wed, Feb 18, 2015 at 7:27 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hi all, I'm trying to address some of the issues which were mentioned earlier about the Admin RQ/RP format. One of those was about batching operations. What if we follow the TopicCommand approach and let people specify the topic name by regexp - would that cover most of the use cases? Secondly, what information should we generally provide in Admin responses? I realize that Admin commands don't imply they will be used only in the CLI but, it seems to me, the CLI is a very important client of this feature. In this case, it seems logical that we would like to provide users with a rich experience in terms of getting results / errors of the executed commands. Usually we supply only an errorCode with responses, which looks very limiting; in the case of the CLI we may want to print a human-readable error description. So, taking into account the previous item about batching, what do you think about having something like this ('create' doesn't support regexp)?

CreateTopicRequest = TopicName Partitions Replicas ReplicaAssignment [Config]
CreateTopicResponse = ErrorCode ErrorDescription
ErrorCode = int16
ErrorDescription = string (empty if successful)

AlterTopicRequest = TopicNameRegexp Partitions ReplicaAssignment [AddedConfig] [DeletedConfig]
AlterTopicResponse = [TopicName ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription
CommandErrorCode = int16
CommandErrorDescription = string (non-empty in case of a fatal error, e.g. we couldn't get topics by regexp)

DescribeTopicRequest = TopicNameRegexp
DescribeTopicResponse = [TopicName TopicDescription ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription

Also, any thoughts about our discussion regarding the re-routing facility?
In my understanding, it is a choice between augmenting TopicMetadataRequest (to include at least controllerId) and implementing a new generic re-routing facility so that sending messages to the controller will be handled by it. Thanks, Andrii Biletskyi On Mon, Feb 16, 2015 at 5:26 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: @Guozhang: Thanks for your comments, I've answered some of those. The main thing is having a merged request for create-alter-delete-describe - I have some concerns about this approach. @Jay: I see that the introduced ClusterMetadataRequest is also one of the concerns. We can solve it if we implement the re-routing facility. But I agree with Guozhang - it will make clients' internals a little bit easier, but this seems to be complex logic to implement and support. Especially for Fetch and Produce (even if we add re-routing later for these requests). Also people will tend to avoid this re-routing facility and hold a local cluster cache to ensure their high-priority requests (which some of the admin requests are) are not sent
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hi all, I'm trying to address some of the issues which were mentioned earlier about the Admin RQ/RP format. One of those was about batching operations. What if we follow the TopicCommand approach and let people specify the topic name by regexp - would that cover most of the use cases? Secondly, what information should we generally provide in Admin responses? I realize that Admin commands don't imply they will be used only in the CLI but, it seems to me, the CLI is a very important client of this feature. In this case, it seems logical that we would like to provide users with a rich experience in terms of getting results / errors of the executed commands. Usually we supply only an errorCode with responses, which looks very limiting; in the case of the CLI we may want to print a human-readable error description. So, taking into account the previous item about batching, what do you think about having something like this ('create' doesn't support regexp)?

CreateTopicRequest = TopicName Partitions Replicas ReplicaAssignment [Config]
CreateTopicResponse = ErrorCode ErrorDescription
ErrorCode = int16
ErrorDescription = string (empty if successful)

AlterTopicRequest = TopicNameRegexp Partitions ReplicaAssignment [AddedConfig] [DeletedConfig]
AlterTopicResponse = [TopicName ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription
CommandErrorCode = int16
CommandErrorDescription = string (non-empty in case of a fatal error, e.g. we couldn't get topics by regexp)

DescribeTopicRequest = TopicNameRegexp
DescribeTopicResponse = [TopicName TopicDescription ErrorCode ErrorDescription] CommandErrorCode CommandErrorDescription

Also, any thoughts about our discussion regarding the re-routing facility? In my understanding, it is a choice between augmenting TopicMetadataRequest (to include at least controllerId) and implementing a new generic re-routing facility so that sending messages to the controller will be handled by it.
Thanks, Andrii Biletskyi On Mon, Feb 16, 2015 at 5:26 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: @Guozhang: Thanks for your comments, I've answered some of those. The main thing is having a merged request for create-alter-delete-describe - I have some concerns about this approach. @Jay: I see that the introduced ClusterMetadataRequest is also one of the concerns. We can solve it if we implement the re-routing facility. But I agree with Guozhang - it will make clients' internals a little bit easier, but this seems to be complex logic to implement and support. Especially for Fetch and Produce (even if we add re-routing later for these requests). Also people will tend to avoid this re-routing facility and hold a local cluster cache to ensure their high-priority requests (which some of the admin requests are) are not sent to some busy broker where they would wait to be routed to the correct one. As pointed out by Jun here ( https://issues.apache.org/jira/browse/KAFKA-1772?focusedCommentId=14234530&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14234530) to solve the issue we might introduce a message type to get cluster state. But I agree we can just update TopicMetadataResponse to include controllerId (and probably something else). What are your thoughts? Thanks, Andrii On Thu, Feb 12, 2015 at 8:31 AM, Guozhang Wang wangg...@gmail.com wrote: I think for the topics commands we can actually merge create/alter/delete/describe as one request type since their formats are very much similar, and keep list-topics and others like partition-reassignment / preferred-leader-election as separate request types. I also left some other comments on the RB ( https://reviews.apache.org/r/29301/). On Wed, Feb 11, 2015 at 2:04 PM, Jay Kreps jay.kr...@gmail.com wrote: Yeah I totally agree that we don't want to just have one "do admin stuff" command that has the union of all parameters.
What I am saying is that command line tools are one client of the administrative apis, but these will be used in a number of scenarios so they should make logical sense even in the absence of the command line tool. Hence comments like trying to clarify the relationship between ClusterMetadata and TopicMetadata...these kinds of things really need to be thought through. Hope that makes sense. -Jay On Wed, Feb 11, 2015 at 1:41 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jay, Thanks for answering. You understood correctly, most of my comments were related to your point 1) - about well thought-out apis. Also, yes, as I understood, we would like to introduce a single unified CLI tool with centralized server-side request handling for lots of existing ones (incl. TopicCommand, CommitOffsetChecker, ReassignPartitions, something else if added in the future). In our previous discussion ( https://issues.apache.org/jira/browse/KAFKA-1694) people said they'd rather have a separate message for each command, so, yes, this way I came to a 1-1 mapping between commands in the
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
@Guozhang: Thanks for your comments, I've answered some of those. The main thing is having a merged request for create-alter-delete-describe - I have some concerns about this approach. @Jay: I see that the introduced ClusterMetadataRequest is also one of the concerns. We can solve it if we implement the re-routing facility. But I agree with Guozhang - it will make clients' internals a little bit easier, but this seems to be complex logic to implement and support. Especially for Fetch and Produce (even if we add re-routing later for these requests). Also people will tend to avoid this re-routing facility and hold a local cluster cache to ensure their high-priority requests (which some of the admin requests are) are not sent to some busy broker where they would wait to be routed to the correct one. As pointed out by Jun here ( https://issues.apache.org/jira/browse/KAFKA-1772?focusedCommentId=14234530&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14234530) to solve the issue we might introduce a message type to get cluster state. But I agree we can just update TopicMetadataResponse to include controllerId (and probably something else). What are your thoughts? Thanks, Andrii On Thu, Feb 12, 2015 at 8:31 AM, Guozhang Wang wangg...@gmail.com wrote: I think for the topics commands we can actually merge create/alter/delete/describe as one request type since their formats are very much similar, and keep list-topics and others like partition-reassignment / preferred-leader-election as separate request types. I also left some other comments on the RB ( https://reviews.apache.org/r/29301/). On Wed, Feb 11, 2015 at 2:04 PM, Jay Kreps jay.kr...@gmail.com wrote: Yeah I totally agree that we don't want to just have one "do admin stuff" command that has the union of all parameters.
What I am saying is that command line tools are one client of the administrative apis, but these will be used in a number of scenarios so they should make logical sense even in the absence of the command line tool. Hence comments like trying to clarify the relationship between ClusterMetadata and TopicMetadata...these kinds of things really need to be thought through. Hope that makes sense. -Jay On Wed, Feb 11, 2015 at 1:41 PM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Jay, Thanks for answering. You understood correctly, most of my comments were related to your point 1) - about well thought-out apis. Also, yes, as I understood, we would like to introduce a single unified CLI tool with centralized server-side request handling for lots of existing ones (incl. TopicCommand, CommitOffsetChecker, ReassignPartitions, something else if added in the future). In our previous discussion ( https://issues.apache.org/jira/browse/KAFKA-1694) people said they'd rather have a separate message for each command, so, yes, this way I came to a 1-1 mapping between commands in the tool and protocol additions. But I might be wrong. In the end I am just trying to start a discussion about how, at least generally, this protocol should look. Thanks, Andrii On Wed, Feb 11, 2015 at 11:10 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Andrii, To answer your earlier question: we just really can't be adding any more Scala protocol objects. These things are super hard to maintain because they hand-code the byte parsing and don't have good versioning support. Since we are already planning on converting, we definitely don't want to add a ton more of these--they are total tech debt. What does it mean that the changes are isolated from the current code base? I actually didn't understand the remaining comments; which of the points are you responding to?
Maybe one sticking point here is that it seems like you want to make some kind of tool, and you have made a 1-1 mapping between commands you imagine in the tool and protocol additions. I want to make sure we don't do that. The protocol needs to be really, really well thought out against many use cases, so it should make perfect logical sense in the absence of knowing the command line tool, right? -Jay On Wed, Feb 11, 2015 at 11:57 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hey Jay, I would like to continue this discussion as it seems there is no progress here. First of all, could you please explain what you meant in point 2? How exactly are we going to migrate to the new Java protocol definitions? And why is it a blocker for the centralized CLI? I agree with you, this feature includes lots of stuff, but thankfully almost all changes are isolated from the current code base, so the main thing, I think, we need to agree on is the RQ/RP format. So how can we start discussion about the concrete message format? Can we take (
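Jay's objection to more hand-coded Scala protocol objects is essentially about versioning: with declarative protocol definitions, evolving a message means adding a schema version rather than writing new byte-parsing code. A toy illustration of that idea (all types here are invented for the sketch, not Kafka's actual schema classes):

```java
import java.util.List;

// A field in a message schema: a name plus a wire type name.
record Field(String name, String type) { }

// A declarative, versioned message definition. Parsing/serialization code
// can be driven generically from this description.
record MessageSchema(String name, int version, List<Field> fields) { }

class Schemas {
    static final MessageSchema CREATE_TOPIC_V0 = new MessageSchema(
            "CreateTopicRequest", 0,
            List.of(new Field("topic", "string"),
                    new Field("partitions", "int32")));

    // Evolving the message is a new schema version with an added field,
    // not a second hand-written parser.
    static final MessageSchema CREATE_TOPIC_V1 = new MessageSchema(
            "CreateTopicRequest", 1,
            List.of(new Field("topic", "string"),
                    new Field("partitions", "int32"),
                    new Field("configs", "array[string]")));
}
```

The contrast with the old Scala objects is that there the byte layout lives in imperative read/write code for each message, so every version bump multiplies hand-written parsing paths.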
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
For the Sample usage section, please consider https://github.com/airbnb/kafkat. We find that tool to be very easy to use, and extremely useful for our administration tasks. Chi On Mon, Feb 9, 2015 at 9:03 AM, Guozhang Wang wangg...@gmail.com wrote: I feel the benefit of lowering the development bar for new clients is not worth the complexity we need to introduce on the server side, as today the clients just need one more request type (metadata request) to send produce / fetch requests to the right brokers, whereas a re-routing mechanism will result in complicated between-broker communication patterns that potentially impact Kafka performance and make debugging / troubleshooting much harder. An alternative way to ease the development of clients is to use a proxy in front of the Kafka servers, like the REST proxy we have built before, which we use primarily for non-Java clients but which can also be treated as handling cluster metadata discovery for clients. Compared to the re-routing idea, the proxy also introduces two hops, but its layered architecture is simpler. Guozhang On Sun, Feb 8, 2015 at 8:00 AM, Jay Kreps jay.kr...@gmail.com wrote: Hey Jiangjie, Re-routing support doesn't force clients to use it. Java and all existing clients would work as they do now, where requests are intelligently routed by the client, but this would lower the bar for new clients. That said, I agree the case for re-routing admin commands is much stronger than for data. The idea of separating admin/metadata from data would definitely solve some problems, but it would also add a lot of complexity--new ports, thread pools, etc. This is an interesting idea to think over, but I'm not sure if it's worth it. Probably a separate effort in any case. -jay On Friday, February 6, 2015, Jiangjie Qin j...@linkedin.com.invalid wrote: I'm a little bit concerned about request routing among brokers. Typically we have a dominant percentage of produce and fetch requests/responses.
Routing them from one broker to another seems unwanted. Also, I think we generally have two types of requests/responses: data related and admin related. It is typically a good practice to separate the data plane from the control plane. That suggests we should have another admin port to serve those admin requests and probably have different authentication/authorization from the data port. Jiangjie (Becket) Qin On 2/6/15, 11:18 AM, Joe Stein joe.st...@stealth.ly wrote: I updated the installation and sample usage for the existing patches on the KIP site https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations There are still a few pending items here. 1) There was already some discussion about using the Broker that is the Controller here https://issues.apache.org/jira/browse/KAFKA-1772 and we should elaborate on that more in the thread, or agree we are OK with the admin client asking for the controller to talk to and then just sending that broker the admin tasks. 2) I like this idea https://issues.apache.org/jira/browse/KAFKA-1912 but we can refactor after KAFKA-1694 is committed, no? I know folks just want to talk to the broker that is the controller. It may even become useful to have the controller run on a broker that isn't even a topic broker anymore (small can of worms I am opening here, but it elaborates on Guozhang's hot-spot point). 3) Any more feedback? - Joe Stein On Fri, Jan 23, 2015 at 3:15 PM, Guozhang Wang wangg...@gmail.com wrote: A centralized admin operation protocol would be very useful. One more general comment here is that the controller is originally designed to only talk to other brokers through ControllerChannel, while the broker instance which carries the current controller is agnostic of its existence, and uses KafkaApis to handle general Kafka requests.
Having all admin requests redirected to the controller instance will force the broker to be aware of its carried controller, and access its internal data for handling these requests. Plus, with the number of clients out of Kafka's control, this may easily cause the controller to become a hot spot in terms of request load. On Thu, Jan 22, 2015 at 10:09 PM, Joe Stein joe.st...@stealth.ly wrote: inline On Thu, Jan 22, 2015 at 11:59 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Joe, This is great. A few comments on KIP-4 1. This is much needed functionality, but there are a lot of them, so let's really think these protocols through. We really want to end up with a set of well thought-out, orthogonal apis. For this reason I think it is really important to think through the end state even if that includes APIs we
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hey Andrii, To answer your earlier question: we just really can't be adding any more Scala protocol objects. These things are super hard to maintain because they hand-code the byte parsing and don't have good versioning support. Since we are already planning on converting, we definitely don't want to add a ton more of these--they are total tech debt. What does it mean that the changes are isolated from the current code base? I actually didn't understand the remaining comments; which of the points are you responding to? Maybe one sticking point here is that it seems like you want to make some kind of tool, and you have made a 1-1 mapping between commands you imagine in the tool and protocol additions. I want to make sure we don't do that. The protocol needs to be really, really well thought out against many use cases, so it should make perfect logical sense in the absence of knowing the command line tool, right? -Jay On Wed, Feb 11, 2015 at 11:57 AM, Andrii Biletskyi andrii.bilets...@stealth.ly wrote: Hey Jay, I would like to continue this discussion as it seems there is no progress here. First of all, could you please explain what you meant in point 2? How exactly are we going to migrate to the new Java protocol definitions? And why is it a blocker for the centralized CLI? I agree with you, this feature includes lots of stuff, but thankfully almost all changes are isolated from the current code base, so the main thing, I think, we need to agree on is the RQ/RP format. So how can we start discussion about the concrete message format? Can we take ( https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-ProposedRQ/RPFormat ) as a starting point?
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hey Jay, I would like to continue this discussion as it seems there is no progress here. First of all, could you please explain what you meant in 2? How exactly are we going to migrate to the new java protocol definitions? And why is it a blocker for the centralized CLI? I agree with you, this feature includes lots of stuff, but thankfully almost all changes are isolated from the current code base, so the main thing, I think, we need to agree on is the RQ/RP format. So how can we start the discussion about the concrete message formats? Can we take (https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-ProposedRQ/RPFormat) as a starting point? We had some doubts earlier whether it's worth introducing one generic Admin Request for all commands (https://issues.apache.org/jira/browse/KAFKA-1694) but then everybody agreed it would be better to have a separate message for each admin command. The Request part is really dictated by the command (e.g. TopicCommand) arguments itself, so the proposed version should be fine (let's put aside for now the remarks about the Optional type, batching, and configs normalization - I agree with all of them). So the second part is the Response. I see two cases here: a) Mutate requests - Create/Alter/...; b) Get requests - List/Describe... a) should only hold the request result (regardless of what we decide about blocking/non-blocking command execution). Usually we provide an error code in the response, but since we will use this in an interactive shell we need some human-readable error description - so I added an errorDescription field where you can at least put exception.getMessage(). b) in addition to the previous item, the message should hold command-specific response data. We can discuss each of them in detail, but let's for now agree on the overall pattern. 
Thanks, Andrii Biletskyi On Fri, Jan 23, 2015 at 6:59 AM, Jay Kreps jay.kr...@gmail.com wrote: Hey Joe, This is great. A few comments on KIP-4 1. This is much needed functionality, but there are a lot of details, so let's really think these protocols through. We really want to end up with a set of well thought-out, orthogonal apis. For this reason I think it is really important to think through the end state even if that includes APIs we won't implement in the first phase. 2. Let's please please please wait until we have switched the server over to the new java protocol definitions. If we add umpteen more ad hoc scala objects that is just generating more work for the conversion we know we have to do. 3. This proposal introduces a new type of optional parameter. This is inconsistent with everything else in the protocol where we use -1 or some other marker value. You could argue either way but let's stick with that for consistency. For clients that implemented the protocol in a better way than our scala code these basic primitives are hard to change. 4. ClusterMetadata: This seems to duplicate TopicMetadataRequest which has brokers, topics, and partitions. I think we should rename that request ClusterMetadataRequest (or just MetadataRequest) and include the id of the controller. Or are there other things we could add here? 5. We have a tendency to try to make a lot of requests that can only go to particular nodes. This adds a lot of burden for client implementations (it sounds easy but each discovery can fail in many parts so it ends up being a full state machine to do right). I think we should consider making admin commands and ideally as many of the other apis as possible available on all brokers and just redirect to the controller on the broker side. Perhaps there would be a general way to encapsulate this re-routing behavior. 6. We should probably normalize the key value pairs used for configs rather than embedding a new formatting. 
So two strings rather than one with an internal equals sign. 7. Is the postcondition of these APIs that the command has begun or that the command has been completed? It is a lot more usable if the command has been completed so you know that if you create a topic and then publish to it you won't get an exception about there being no such topic. 8. Describe topic and list topics duplicate a lot of stuff in the metadata request. Is there a reason to give back topics marked for deletion? I feel like if we just make the post-condition of the delete command be that the topic is deleted that will get rid of the need for this right? And it will be much more intuitive. 9. Should we consider batching these requests? We have generally tried to allow multiple operations to be batched. My suspicion is that without this we will get a lot of code that does something like for(topic: adminClient.listTopics()) adminClient.describeTopic(topic) this code will work great when you test on 5 topics but not do as well if you have 50k. 10. I think we should also discuss how we want
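Andrii's proposed two response shapes - (a) mutate responses carrying only an error code plus a human-readable description for the interactive shell, (b) get responses carrying command-specific data on top of that - can be sketched roughly as below. The class and field names are illustrative only, not the KIP's actual schema:

```java
import java.util.List;

// Hedged sketch of the response pattern discussed in the thread, not the real wire format.
public class AdminResponseSketch {
    // a) Mutate requests (Create/Alter/Delete...): result only.
    static class MutateResponse {
        final short errorCode;          // 0 = no error, as elsewhere in the protocol
        final String errorDescription;  // e.g. exception.getMessage(), for the shell
        MutateResponse(short errorCode, String errorDescription) {
            this.errorCode = errorCode;
            this.errorDescription = errorDescription;
        }
    }

    // b) Get requests (List/Describe...): same result fields plus a payload.
    static class DescribeTopicResponse extends MutateResponse {
        final List<String> partitionInfo; // command-specific data
        DescribeTopicResponse(short errorCode, String errorDescription, List<String> partitionInfo) {
            super(errorCode, errorDescription);
            this.partitionInfo = partitionInfo;
        }
    }

    static MutateResponse ok() { return new MutateResponse((short) 0, ""); }

    public static void main(String[] args) {
        assert ok().errorCode == 0;
        DescribeTopicResponse d = new DescribeTopicResponse((short) 0, "", List.of("p0", "p1"));
        assert d.partitionInfo.size() == 2;
    }
}
```

Making get responses share the error fields keeps the "overall pattern" Andrii asks to agree on uniform across both request families.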
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Jay, Thanks for answering. You understood correctly, most of my comments were related to your point 1) - about well thought-out apis. Also, yes, as I understood it, we would like to introduce a single unified CLI tool with centralized server-side request handling replacing lots of existing tools (incl. TopicCommand, CommitOffsetChecker, ReassignPartitions, and whatever else is added in the future). In our previous discussion (https://issues.apache.org/jira/browse/KAFKA-1694) people said they'd rather have a separate message for each command, so, yes, that is how I came to a 1-1 mapping between commands in the tool and protocol additions. But I might be wrong. In the end I'm just trying to start a discussion about how, at least generally, this protocol should look. Thanks, Andrii
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Yeah I totally agree that we don't want to just have one do admin stuff command that has the union of all parameters. What I am saying is that command line tools are one client of the administrative apis, but these will be used in a number of scenarios so they should make logical sense even in the absence of the command line tool. Hence comments like trying to clarify the relationship between ClusterMetadata and TopicMetadata...these kinds of things really need to be thought through. Hope that makes sense. -Jay
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
I think for the topics commands we can actually merge create/alter/delete/describe into one request type, since their formats are very similar, and keep list-topics and others like partition-reassignment / preferred-leader-election as separate request types. I also left some other comments on the RB (https://reviews.apache.org/r/29301/).
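Guozhang's suggestion above - merging create/alter/delete/describe into one topic request type distinguished by an operation field, with -1 markers for unset settings per Jay's earlier point 3 - might look roughly like this. All names are hypothetical; the thread had not settled on any schema:

```java
// Hedged sketch of a merged topic-admin request type; illustrative names only.
public class TopicAdminRequestSketch {
    public enum Operation { CREATE, ALTER, DELETE, DESCRIBE }

    static class TopicAdminRequest {
        final Operation op;
        final String topic;
        final int partitions;        // -1 = not set (marker value instead of an Optional type)
        final int replicationFactor; // -1 = not set
        TopicAdminRequest(Operation op, String topic, int partitions, int replicationFactor) {
            this.op = op;
            this.topic = topic;
            this.partitions = partitions;
            this.replicationFactor = replicationFactor;
        }
    }

    public static TopicAdminRequest create(String topic, int partitions, int replicationFactor) {
        return new TopicAdminRequest(Operation.CREATE, topic, partitions, replicationFactor);
    }

    public static TopicAdminRequest describe(String topic) {
        // describe carries no settings, so the marker values are used
        return new TopicAdminRequest(Operation.DESCRIBE, topic, -1, -1);
    }

    public static void main(String[] args) {
        assert create("events", 8, 3).op == Operation.CREATE;
        assert describe("events").partitions == -1;
    }
}
```

The trade-off is visible even in the sketch: the merged type keeps the formats aligned, but describe/delete end up hauling fields that only create/alter need.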
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
I feel the benefits of lowering the development bar for new clients are not worth the complexity we need to introduce on the server side: today clients just need one more request type (metadata request) to send produce / fetch requests to the right brokers, whereas a re-routing mechanism will result in complicated between-broker communication patterns that potentially impact Kafka performance and make debugging / troubleshooting much harder. An alternative way to ease the development of clients is to use a proxy in front of the kafka servers, like the rest proxy we have built before, which we use primarily for non-java clients but which can also be treated as handling cluster metadata discovery for clients. Compared to the re-routing idea, the proxy also introduces two hops, but its layered architecture is simpler. Guozhang On Sun, Feb 8, 2015 at 8:00 AM, Jay Kreps jay.kr...@gmail.com wrote: Hey Jiangjie, Re-routing support doesn't force clients to use it. Java and all existing clients would work as now, where requests are intelligently routed by the client, but this would lower the bar for new clients. That said, I agree the case for re-routing admin commands is much stronger than for data. The idea of separating admin/metadata traffic from data would definitely solve some problems but it would also add a lot of complexity--new ports, thread pools, etc. This is an interesting idea to think over but I'm not sure if it's worth it. Probably a separate effort in any case. -jay
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
I'm a little bit concerned about the request routers among brokers. Typically we have a dominant percentage of produce and fetch requests/responses. Routing them from one broker to another seems undesirable. Also I think we generally have two types of requests/responses: data related and admin related. It is typically a good practice to separate the data plane from the control plane. That suggests we should have another admin port to serve those admin requests and probably have different authentication/authorization from the data port. Jiangjie (Becket) Qin On 2/6/15, 11:18 AM, Joe Stein joe.st...@stealth.ly wrote: I updated the installation and sample usage for the existing patches on the KIP site https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations There are still a few pending items here. 1) There was already some discussion about using the Broker that is the Controller here https://issues.apache.org/jira/browse/KAFKA-1772 and we should elaborate on that more in the thread, or agree we are ok with admin clients asking for the controller and then just sending that broker the admin tasks. 2) I like this idea https://issues.apache.org/jira/browse/KAFKA-1912 but we can refactor after KAFKA-1694 is committed, no? I know folks just want to talk to the broker that is the controller. It may even become useful to have the controller run on a broker that isn't even a topic broker anymore (a small can of worms I am opening here, but it elaborates on Guozhang's hot spot point). 3) Any more feedback? - Joe Stein On Fri, Jan 23, 2015 at 3:15 PM, Guozhang Wang wangg...@gmail.com wrote: A centralized admin operation protocol would be very useful. One more general comment here is that the controller was originally designed to only talk to other brokers through the ControllerChannel, while the broker instance which carries the current controller is agnostic of its existence and uses KafkaApis to handle general Kafka requests. 
Having all admin requests redirected to the controller instance will force the broker to be aware of its carried controller, and access its internal data for handling these requests. Plus, with the number of clients out of Kafka's control, this may easily cause the controller to become a hot spot in terms of request load. On Thu, Jan 22, 2015 at 10:09 PM, Joe Stein joe.st...@stealth.ly wrote: inline On Thu, Jan 22, 2015 at 11:59 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Joe, This is great. A few comments on KIP-4 1. This is much needed functionality, but there are a lot of details, so let's really think these protocols through. We really want to end up with a set of well thought-out, orthogonal apis. For this reason I think it is really important to think through the end state even if that includes APIs we won't implement in the first phase. ok 2. Let's please please please wait until we have switched the server over to the new java protocol definitions. If we add umpteen more ad hoc scala objects that is just generating more work for the conversion we know we have to do. ok :) 3. This proposal introduces a new type of optional parameter. This is inconsistent with everything else in the protocol where we use -1 or some other marker value. You could argue either way but let's stick with that for consistency. For clients that implemented the protocol in a better way than our scala code these basic primitives are hard to change. yes, less confusing, ok. 4. ClusterMetadata: This seems to duplicate TopicMetadataRequest which has brokers, topics, and partitions. I think we should rename that request ClusterMetadataRequest (or just MetadataRequest) and include the id of the controller. Or are there other things we could add here? We could add broker version to it. 5. We have a tendency to try to make a lot of requests that can only go to particular nodes. 
This adds a lot of burden for client implementations (it sounds easy but each discovery can fail in many parts so it ends up being a full state machine to do right). I think we should consider making admin commands and ideally as many of the other apis as possible available on all brokers and just redirect to the controller on the broker side. Perhaps there would be a general way to encapsulate this re-routing behavior. If we do that then we should also preserve what we have and do both. The client can then decide: do I want to go to any broker and proxy, or go just to the controller and run the admin task? Lots of folks have seen controllers come under distress because of their producers/consumers. There is a ticket, too, for controller elect and re-elect https://issues.apache.org/jira/browse/KAFKA-1778 so you can force it to a broker that has 0 load. 6. We should probably normalize the key value pairs used for
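Joe's point about letting the client choose - discover the controller and go direct, or hit any broker and have it forward - can be illustrated with a toy routing sketch. `ClusterMetadata` and the address strings are invented for illustration and are not Kafka APIs:

```java
import java.util.Map;

// Hedged sketch of the two routing options under discussion; all names are hypothetical.
public class AdminRoutingSketch {
    static class ClusterMetadata {
        final int controllerId;
        final Map<Integer, String> brokerAddresses;
        ClusterMetadata(int controllerId, Map<Integer, String> brokerAddresses) {
            this.controllerId = controllerId;
            this.brokerAddresses = brokerAddresses;
        }
    }

    // Option A: client-side routing - resolve the controller, then send directly to it.
    public static String controllerAddress(ClusterMetadata md) {
        String addr = md.brokerAddresses.get(md.controllerId);
        if (addr == null) {
            // The controller moved between metadata fetch and send: the caller must
            // re-fetch metadata and retry - this is Jay's "full state machine" burden.
            throw new IllegalStateException("controller not in metadata; refresh and retry");
        }
        return addr;
    }

    // Option B: broker-side routing - any broker accepts the request and forwards it
    // to the controller, so the client needs no discovery logic at all.
    public static String forwardTarget(ClusterMetadata md, int receivingBrokerId) {
        return receivingBrokerId == md.controllerId
                ? "handle locally"
                : "forward to " + controllerAddress(md);
    }

    public static void main(String[] args) {
        ClusterMetadata md = new ClusterMetadata(1, Map.of(0, "b0:9092", 1, "b1:9092", 2, "b2:9092"));
        assert controllerAddress(md).equals("b1:9092");
        assert forwardTarget(md, 1).equals("handle locally");
        assert forwardTarget(md, 0).equals("forward to b1:9092");
    }
}
```

Option A pushes the retry state machine into every client; option B keeps clients simple at the cost of broker-to-broker forwarding, which is exactly the trade-off Guozhang and Jiangjie push back on.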
Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations
Hey Joe, I think this is proposing several things: 1. A new command line utility. This isn't really fully specified here. There is sample usage but I actually don't really understand what all the commands will be. Also, presumably this will replace the existing shell scripts, right? We obviously don't want to be in a state where we have both... 2. A new set of language agnostic administrative protocols. 3. A new Java API for issuing administrative requests using the protocol. I don't see any discussion on what this will look like. It might be easiest to tackle these one at a time, no? If not, we really do need to get a complete description at each layer as these are pretty core public apis. -Jay On Fri, Feb 6, 2015 at 11:18 AM, Joe Stein joe.st...@stealth.ly wrote: I updated the installation and sample usage for the existing patches on the KIP site https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations There are still a few pending items here. 1) There was already some discussion about using the Broker that is the Controller here https://issues.apache.org/jira/browse/KAFKA-1772 and we should elaborate on that more in the thread, or agree we are ok with admin asking for the controller to talk to and then just sending that broker the admin tasks. 2) I like this idea https://issues.apache.org/jira/browse/KAFKA-1912 but we can refactor after KAFKA-1694 is committed, no? I know folks just want to talk to the broker that is the controller. It may even become useful to have the controller run on a broker that isn't even a topic broker anymore (small can of worms I am opening here, but it elaborates on Guozhang's hot spot point). 3) any more feedback? - Joe Stein On Fri, Jan 23, 2015 at 3:15 PM, Guozhang Wang wangg...@gmail.com wrote: A centralized admin operation protocol would be very useful.
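As a rough illustration of the marker-value convention Jay argues for elsewhere in this thread (point 3 of his review: use -1 rather than a new optional-parameter type), here is a toy encoding of a create-topic request body where -1 means "not set, use the broker default". The field names and byte layout are invented for illustration and are not the actual KIP-4 wire format:

```python
import struct

USE_DEFAULT = -1  # marker value meaning "not set", per the existing protocol convention

def encode_create_topic(topic, partitions=USE_DEFAULT, replication=USE_DEFAULT):
    """Pack a toy create-topic body: int16 name length, name bytes, then two
    big-endian int32s where -1 means 'use the broker default'."""
    name = topic.encode("utf-8")
    return struct.pack(f">h{len(name)}sii", len(name), name, partitions, replication)

def decode_create_topic(buf):
    """Unpack the toy body, mapping the -1 marker back to 'unset' (None)."""
    (name_len,) = struct.unpack_from(">h", buf, 0)
    name = buf[2:2 + name_len].decode("utf-8")
    partitions, replication = struct.unpack_from(">ii", buf, 2 + name_len)
    return {"topic": name,
            "partitions": None if partitions == USE_DEFAULT else partitions,
            "replication": None if replication == USE_DEFAULT else replication}

msg = encode_create_topic("events", partitions=8)
print(decode_create_topic(msg))  # -> {'topic': 'events', 'partitions': 8, 'replication': None}
```

The appeal of the marker value is visible here: the request stays fixed-layout and trivially parseable in any language, at the cost of reserving one sentinel in each field's domain.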
One more general comment here is that the controller is originally designed to only talk to other brokers through the ControllerChannel, while the broker instance which carries the current controller is agnostic of its existence, and uses KafkaApis to handle general Kafka requests. Having all admin requests redirected to the controller instance will force the broker to be aware of its carried controller, and access its internal data for handling these requests. Plus, with the number of clients out of Kafka's control, this may easily cause the controller to become a hot spot in terms of request load.