Re: [DISCUSSION] KIP-619: Add internal topic creation support

2020-08-19 Thread Cheng Tan
Hi David,


Thanks for the feedback. It is really helpful.

> Can you clarify a bit more what the difference is between regular topics
> and internal topics (excluding  __consumer_offsets and
> __transaction_state)? Reading your last message, if internal topics
> (excluding the two) can be created, deleted, produced to, consumed from,
> added to transactions, I'm failing to see what is different about them. Is
> it simply that they are marked as "internal" so the application can treat
> them differently?

Yes. User-defined internal topics (all internal topics other than 
`__consumer_offsets` and `__transaction_state`) will behave like normal topics 
with regard to messaging operations and permissions. Topics are marked as 
“internal” so that the broker can identify user-defined internal topics and 
provide better metadata services, such as the `listTopics` API. I should have 
described this difference in metadata behavior in the KIP.
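
To make the metadata difference concrete, here is a minimal sketch of a 
listing that hides internal topics unless the caller asks for them. The 
`TopicEntry` and `TopicLister` names are hypothetical stand-ins for 
illustration only; they are not classes from the KIP or from the Kafka client.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for one entry in a topic listing (not a Kafka class).
final class TopicEntry {
    final String name;
    final boolean internal;

    TopicEntry(String name, boolean internal) {
        this.name = name;
        this.internal = internal;
    }
}

// Hypothetical helper: the only behavioral difference is in what gets listed.
final class TopicLister {
    // Returns topic names in input order; internal topics are included
    // only when includeInternal is true.
    static List<String> list(List<TopicEntry> all, boolean includeInternal) {
        return all.stream()
                .filter(t -> includeInternal || !t.internal)
                .map(t -> t.name)
                .collect(Collectors.toList());
    }
}
```

In this sketch, produce, consume, and ACL handling would not consult the 
flag at all; only the listing path does.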

> In the "Compatibility, Deprecation, and Migration" section, we should
> detail how users can overcome this incompatibility (i.e., changing the
> config name on their topic and changing their application logic if
> necessary).

Thanks for the suggestion. I updated the section.

> Should we consider adding any configs to constrain the min isr and
> replication factor for internal topics? If a topic is really internal and
> fundamentally required for an application to function, it might need a more
> stringent replication config. Our existing internal topics have their own
> configs in server.properties with a comment saying as much.


I think we should give clients the freedom to configure 
`min.insync.replicas`, `replication.factor`, and `log.retention` on 
user-defined internal topics, just as they do on normal topics, for three 
reasons:

1. Users may have performance requirements for user-defined internal topics.
2. New defaults or restrictions could silently change the behavior of existing 
user applications and cause compatibility issues.
3. Since user-defined internal topics act like normal topics and won’t affect 
the messaging functionality (produce, consume, transactions, etc.), unoptimized 
log configurations on them won’t harm the cluster.


Please let me know what you think. Thanks.


Best, - Cheng Tan




Re: [DISCUSSION] KIP-619: Add internal topic creation support

2020-08-14 Thread David Arthur
Cheng,

Can you clarify a bit more what the difference is between regular topics
and internal topics (excluding  __consumer_offsets and
__transaction_state)? Reading your last message, if internal topics
(excluding the two) can be created, deleted, produced to, consumed from,
added to transactions, I'm failing to see what is different about them. Is
it simply that they are marked as "internal" so the application can treat
them differently?


In the "Compatibility, Deprecation, and Migration" section, we should
detail how users can overcome this incompatibility (i.e., changing the
config name on their topic and changing their application logic if
necessary).


Should we consider adding any configs to constrain the min isr and
replication factor for internal topics? If a topic is really internal and
fundamentally required for an application to function, it might need a more
stringent replication config. Our existing internal topics have their own
configs in server.properties with a comment saying as much.


Thanks!
David




Re: [DISCUSSION] KIP-619: Add internal topic creation support

2020-07-07 Thread Cheng Tan
Hi Colin,


Thanks for the comments. I’ve modified the KIP accordingly.

> I think we need to understand which of these limitations we will carry 
> forward and which we will not.  We also have the option of putting 
> limitations just on consumer offsets, but not on other internal topics.


In the proposal, I added details about this. I agree that cluster admins should 
use ACLs to apply the restrictions. 

• Internal topic creation will be allowed.
• Internal topic deletion will be allowed except for `__consumer_offsets` and 
`__transaction_state`.
• Producing to internal topic partitions other than `__consumer_offsets` and 
`__transaction_state` will be allowed.
• Adding internal topic partitions to transactions will be allowed.

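
As a rough illustration of the rules listed above, the following sketch 
encodes them as a single check for a topic that is already known to be 
internal. `InternalTopicPolicy` and its `Operation` enum are hypothetical 
names, not part of the KIP. The list states no exceptions for creation or 
transactions, so the sketch assumes none.

```java
import java.util.Set;

// Hypothetical policy check for internal topics (names are assumptions).
final class InternalTopicPolicy {
    // The two broker-managed internal topics named in this thread.
    static final Set<String> SYSTEM_TOPICS =
            Set.of("__consumer_offsets", "__transaction_state");

    enum Operation { CREATE, DELETE, PRODUCE, ADD_TO_TRANSACTION }

    // Returns true if the operation is allowed on the given internal topic.
    static boolean isAllowed(String topic, Operation op) {
        boolean system = SYSTEM_TOPICS.contains(topic);
        switch (op) {
            case DELETE:
            case PRODUCE:
                // Allowed for every internal topic except the two system topics.
                return !system;
            case CREATE:
            case ADD_TO_TRANSACTION:
                // Stated as allowed, with no exception given in the list.
                return true;
            default:
                throw new IllegalArgumentException("unknown operation: " + op);
        }
    }
}
```

Anything stricter than this would be left to ACLs configured by the cluster 
admin.
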
> I think there are a fair number of compatibility concerns.  What's the result 
> if someone tries to create a topic with the configuration internal = true 
> right now?  Does it fail?  If not, that seems like a potential problem.

I also added this compatibility issue in the "Compatibility, Deprecation, and 
Migration Plan" section.

Please feel free to make any suggestions or comments regarding my latest 
proposal. Thanks.


Best, - Cheng Tan







Re: [DISCUSSION] KIP-619: Add internal topic creation support

2020-06-15 Thread Colin McCabe
Hi Cheng,

The link from the main KIP page is an "edit link" meaning that it drops you 
into the editor for the wiki page.  I think the link you meant to use is a 
"view link" that will just take you to view the page.

In general I'm not sure what I'm supposed to take away from the large UML 
diagram in the KIP.  This is just a description of the existing code, right?  
Seems like we should remove this.

I'm not sure why the controller classes are featured here since as far as I can 
tell, the controller doesn't need to care if a topic is internal.

> Kafka and its upstream applications treat internal topics differently from
> non-internal topics. For example:
> * Kafka handles topic creation response errors differently for internal topics
> * Internal topic partitions cannot be added to a transaction
> * Internal topic records cannot be deleted
> * Appending to internal topics might get rejected

I think we need to understand which of these limitations we will carry forward 
and which we will not.  We also have the option of putting limitations just on 
consumer offsets, but not on other internal topics.

Taking it one by one:

> * Kafka handles topic creation response errors differently for internal 
> topics.

Hmm.  Kafka doesn't currently allow you to create internal topics, so the 
difference here is that you always fail, right?  Or is there something else 
more subtle here?  Like do we specifically prevent you from creating topics 
named __consumer_offsets or something?  We need to spell this all out in the 
KIP.

> * Internal topic partitions cannot be added to a transaction

I don't think we should carry this limitation forward, or if we do, we should 
only do it for consumer-offsets.  Does anyone know why this limitation exists?

> * Internal topic records cannot be deleted

This seems like something that should be handled by ACLs rather than by 
treating internal topics specially.

> * Appending to internal topics might get rejected

We clearly need to use ACLs here rather than rejecting appends.  Otherwise, how 
will external systems like KSQL, streams, etc. use this feature?  This is the 
kind of information we need to have in the KIP.

> Public Interfaces
> 2. KafkaZkClient will have a new method getInternalTopics() which 
> returns a set of internal topic name strings.

KafkaZkClient isn't a public interface, so it doesn't need to be described here.

> There are no compatibility concerns in this KIP.

I think there are a fair number of compatibility concerns.  What's the result 
if someone tries to create a topic with the configuration internal = true right 
now?  Does it fail?  If not, that seems like a potential problem.

Are people going to be able to create or delete topics named __consumer_offsets 
or __transaction_state using this mechanism?  If so, how does the security 
model work for that?

best,
Colin



Re: [DISCUSSION] KIP-619: Add internal topic creation support

2020-06-12 Thread David Arthur
Cheng, thanks for the KIP!

Can you include some details about how this will work in the post-ZK world?

For KafkaAdminClient, will we add a new "internal" field to NewTopic, or
will we reuse the existing "configs" map? One concern with sticking this
new special field in the topic configs is that we can collide with an
existing user-defined "internal" config. Also, what happens if a user tries
to alter the config on a topic and changes or removes the "internal"
config?

If we do not want to separate out "internal" into its own field, I think
we'll have to add some guards against users messing with it. It's probably
safer to keep it separate. WDYT?
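
To illustrate the two options, here is a minimal sketch. `TopicSpec` carries 
the flag in its own field, so a user-defined "internal" config cannot collide 
with it. `TopicConfigGuard` shows the kind of guard that would be needed if 
the flag lived in the configs map instead. Both class names and the 
`validateAlter` method are hypothetical; they are not the NewTopic API.

```java
import java.util.Map;
import java.util.Objects;

// Option A (hypothetical): a dedicated field, separate from the configs map.
final class TopicSpec {
    final String name;
    final boolean internal;              // not derived from configs
    final Map<String, String> configs;   // user configs, may contain any key

    TopicSpec(String name, boolean internal, Map<String, String> configs) {
        this.name = name;
        this.internal = internal;
        this.configs = Map.copyOf(configs);
    }
}

// Option B (hypothetical): the flag lives in configs and must be guarded.
final class TopicConfigGuard {
    static final String RESERVED_KEY = "internal";

    // Rejects an alteration that changes or removes the reserved key.
    static void validateAlter(Map<String, String> current,
                              Map<String, String> proposed) {
        if (!Objects.equals(current.get(RESERVED_KEY), proposed.get(RESERVED_KEY))) {
            throw new IllegalArgumentException(
                    "the '" + RESERVED_KEY + "' config cannot be changed or removed");
        }
    }
}
```

With Option A, a topic created with `configs = {"internal": "true"}` and 
`internal = false` stays a normal topic, which is the collision case above.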

-David



[DISCUSSION] KIP-619: Add internal topic creation support

2020-05-29 Thread Cheng Tan
Hello developers,


I’m proposing KIP-619 to add internal topic creation support. 

Kafka and its upstream applications treat internal topics differently from 
non-internal topics. For example:

• Kafka handles topic creation response errors differently for internal 
topics
• Internal topic partitions cannot be added to a transaction
• Internal topic records cannot be deleted
• Appending to internal topics might get rejected
• ……

Clients and upstream applications may define their own internal topics. For 
example, Kafka Connect defines `connect-configs`, `connect-offsets`, and 
`connect-statuses`. Clients fetch the internal topics by sending a 
MetadataRequest (ApiKeys.METADATA).

However, clients and upstream applications cannot register their own internal 
topics with the servers. As a result, servers have no knowledge of 
client-defined internal topics. They can only test whether a given topic is 
internal by checking it against a static set of internal topic names, which 
consists of `__consumer_offsets` and `__transaction_state`. Consequently, 
MetadataRequest cannot provide any information about client-created 
internal topics. 

To solve this pain point, I'm proposing support for clients to register and 
query their own internal topics. 
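
The gap and the proposed direction can be sketched as follows. The static 
check stands for the current behavior described above; the `register` and 
`isInternal` methods show one possible shape of a registry. 
`InternalTopicRegistry` and its methods are hypothetical names used for 
illustration, not the interface proposed in the KIP.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry sketch (names are assumptions, not the KIP's API).
final class InternalTopicRegistry {
    // Current behavior: only these two names are recognized as internal.
    private static final Set<String> BUILT_IN =
            Set.of("__consumer_offsets", "__transaction_state");

    // Proposed direction: topics that clients have registered as internal.
    private final Set<String> registered = ConcurrentHashMap.newKeySet();

    // The static check: unaware of client-defined internal topics.
    static boolean isBuiltInInternal(String topic) {
        return BUILT_IN.contains(topic);
    }

    // Registers a client-defined internal topic.
    void register(String topic) {
        registered.add(topic);
    }

    // The combined check: built-in or client-registered.
    boolean isInternal(String topic) {
        return BUILT_IN.contains(topic) || registered.contains(topic);
    }
}
```

With only the static check, `connect-offsets` is reported as a normal topic; 
once registered, the combined check reports it as internal.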

Please feel free to join the discussion. Thanks in advance.


Best, - Cheng Tan