Hey Ashish, thanks for the write-up. I think having a namespace capability is a useful feature for Kafka, in particular with the addition of the authorization layer. I probably prefer Jay's hierarchical approach if we're going to embed the namespace in the topic name, since it seems more general. That said, one advantage of having a namespace independent of the topic name is that it simplifies replication between namespaces a bit, since you don't have to parse and rewrite topic names. Assuming that hierarchical topics will happen eventually anyway, I imagine a common pattern would be to preserve the same directory structure in multiple namespaces, so having an easy mechanism for applications to switch between them would be nice. The namespace is kind of analogous to a chroot in this case. Of course, you can achieve the same thing with a configurable topic prefix, but then you have to do all the topic rewriting, which I'm guessing will be a little annoying to implement in all of the clients and tools. However, the tradeoff of a separate namespace field (as you mention in the KIP) is that all request schemas have to be updated, which is also annoying.
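To make the prefix option concrete, here is a rough sketch of the rewriting a client or mirroring tool would have to do if the namespace were just a configurable topic prefix. The function names (`to_cluster_name`, `to_app_name`, `mirror_name`) and the datacenter paths are made up for illustration; nothing like this exists in Kafka or in the KIP.

```python
# Hypothetical helpers: chroot-style topic rewriting via a configurable prefix.
# None of these names come from Kafka or KIP-37; they only illustrate the idea.

def to_cluster_name(namespace: str, topic: str) -> str:
    """Map the name an application uses to the full name stored in the cluster."""
    return namespace.rstrip("/") + "/" + topic.lstrip("/")


def to_app_name(namespace: str, cluster_topic: str) -> str:
    """Strip the namespace prefix again, e.g. before handing records to the app."""
    prefix = namespace.rstrip("/") + "/"
    if not cluster_topic.startswith(prefix):
        raise ValueError(f"{cluster_topic!r} is not inside namespace {namespace!r}")
    return cluster_topic[len(prefix):]


def mirror_name(src_namespace: str, dst_namespace: str, cluster_topic: str) -> str:
    """Rewrite a topic name when replicating between two namespaces
    that keep the same directory structure."""
    return to_cluster_name(dst_namespace, to_app_name(src_namespace, cluster_topic))
```

With this, an application configured with the prefix `/chicago-datacenter` would address `user-events/pageviews` and the client would rewrite it to `/chicago-datacenter/user-events/pageviews`. Switching namespaces is then only a config change, but every client and tool has to carry this parse-and-rewrite step.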
-Jason

On Wed, Oct 14, 2015 at 12:03 AM, Ashish Singh <asi...@cloudera.com> wrote:

> On Mon, Oct 12, 2015 at 7:37 PM, Gwen Shapira <g...@confluent.io> wrote:
>
> > This works really nicely from the consumer side, but what about the
> > producer? If there are no more topics, do we allow producing to a
> > directory and have the Partitioner hash-partition messages between all
> > partitions in the multiple levels in a directory?
> >
> Good point.
>
> I am personally in favor of maintaining the current behavior for the
> producer, i.e., letting users produce only to a topic. This is different
> for consumers, where the suggested behavior is in line with current
> behavior. One can use regex subscription to achieve the same even today.
>
> > Also, I think we want to preserve the consumer terminology of "subscribe"
> > to topics / directories, but "assign" partitions - since the consumer
> > behavior is different in those cases.
> >
> > On Mon, Oct 12, 2015 at 7:16 PM, Jay Kreps <j...@confluent.io> wrote:
> >
> > > Okay this is similar to what I think we have talked about before. Let
> > > me elaborate on the idea that I think has been floating around--it's
> > > pretty similar with a few differences.
> > >
> > > I think what you are calling the "default namespace" is basically what
> > > I would call the "current working directory" with paths not beginning
> > > with '/' being interpreted relative to this directory as in the fs.
> > >
> > > One thing you have to work out is what levels in this hierarchy you can
> > > actually subscribe to. I think you are assuming only what we currently
> > > consider a "topic", i.e. the first level of directories but not the
> > > partitions or parent dirs, would be subscribable. If you think about
> > > it, though, that constraint is a bit arbitrary.
> > >
> > > I'd propose instead the semantics that:
> > > - Subscribing to /a/b/c/0 means subscribing to the 0th partition of
> > >   topic "c" in directory /a/b
> > > - Subscribing to /a/b/c means subscribing to all partitions in
> > >   topic/directory "c"
> > > - Subscribing to /a/b means subscribing to all partitions in all
> > >   topics/subdirectories under /a/b recursively
> > >
> > > Effectively the concept of topics goes away entirely--you just have
> > > partitions/logs and directories. In this respect, rather than adding
> > > new concepts, this new feature would actually just generalize what we
> > > have (which I think is a good thing).
> > >
> > > -Jay
> > >
> > > On Mon, Oct 12, 2015 at 6:24 PM, Ashish Singh <asi...@cloudera.com> wrote:
> > >
> > > > On Mon, Oct 12, 2015 at 5:42 PM, Jay Kreps <j...@confluent.io> wrote:
> > > >
> > > > > Great. I definitely would strongly favor carrying over user's
> > > > > intuition from FS unless we think we need a very different model.
> > > > > The minor details like the separator and namespace term will help
> > > > > with that.
> > > > >
> > > > > Follow-up question, say I have a layout like
> > > > >   /chicago-datacenter/user-events/pageviews
> > > > > Can I subscribe to
> > > > >   /chicago-datacenter/user-events
> > > > >
> > > > Yes, however they will need a regex like
> > > > /chicago-datacenter/user-events/*
> > > >
> > > > > to get the full firehose of user events from chicago? Can I
> > > > > subscribe to
> > > > >   /*/user-events
> > > > > to get user events originating from all datacenters?
> > > > >
> > > > Yes
> > > >
> > > > > (Assuming, for now, that these are all in the same cluster...)
> > > > >
> > > > > Also, just to confirm, it sounds from the proposal like config
> > > > > overrides would become fully hierarchical so you can override
> > > > > config at any directory point.
> > > > > This will add complexity in implementation but I think will
> > > > > likely be much more operator friendly.
> > > > >
> > > > Yes, that is the idea.
> > > >
> > > > > There are about a thousand details to discuss in terms of how this
> > > > > would impact the metadata request, various zk entries, and various
> > > > > other aspects, but probably it makes sense to first agree on how we
> > > > > would want it to work and then start to dive into how to implement
> > > > > that.
> > > > >
> > > > Agreed.
> > > >
> > > > > -Jay
> > > > >
> > > > > On Mon, Oct 12, 2015 at 5:28 PM, Ashish Singh <asi...@cloudera.com> wrote:
> > > > >
> > > > > > Hey Jay, thanks for reviewing the proposal. Answers inline.
> > > > > >
> > > > > > On Mon, Oct 12, 2015 at 10:53 AM, Jay Kreps <j...@confluent.io> wrote:
> > > > > >
> > > > > > > Hey guys,
> > > > > > >
> > > > > > > I think this is an important feature and one we've talked about
> > > > > > > for a while. I really think trying to invent a new nomenclature
> > > > > > > is going to make it hard for people to understand, though. As
> > > > > > > such I recommend we call namespaces "directories" and denote
> > > > > > > them with '/'--this will make the feature 1000x more
> > > > > > > understandable to people.
> > > > > > >
> > > > > > Essentially you are suggesting two things here.
> > > > > > 1. Use "Directory" instead of "Namespace" as it is more intuitive.
> > > > > >    I agree.
> > > > > > 2. Make '/' the delimiter instead of ':'. Fine with me, and I agree
> > > > > >    that if we call these directories, '/' is the way to go.
> > > > > >
> > > > > > > I think we should inherit the semantics of normal unix fs in so
> > > > > > > far as it makes sense.
> > > > > > >
> > > > > > > In this approach we get rid of topics entirely, instead we
> > > > > > > really just have partitions which are the equivalent of a file
> > > > > > > and retain their numeric names, and the existing topic concept
> > > > > > > is just the first directory level but we generalize to allow
> > > > > > > arbitrarily many more levels of nesting. This allows
> > > > > > > categorization of data, such as
> > > > > > > /datacenter1/user-events/page-views/3 and you can subscribe,
> > > > > > > apply configs or permissions at any level of the hierarchy.
> > > > > > >
> > > > > > +1. This actually requires just a minor change to the existing
> > > > > > proposal, i.e., "some:namespace:topic" becomes
> > > > > > "some/namespace/topic".
> > > > > >
> > > > > > > I'm actually not 100% sure what the semantics of accessing data
> > > > > > > in differing namespaces is in the current proposal, maybe you
> > > > > > > can clarify Ashish?
> > > > > > >
> > > > > > I will add more info to the KIP on this; however, I think a client
> > > > > > should be able to access data in any namespace as long as the
> > > > > > following conditions are satisfied.
> > > > > >
> > > > > > 1. The namespace the client is trying to access exists.
> > > > > > 2. The client has sufficient permissions on the namespace for the
> > > > > >    type of operation it is trying to perform on a topic within
> > > > > >    that namespace.
> > > > > > 3. The client has sufficient permissions on the topic for the type
> > > > > >    of operation it is trying to perform on that topic.
> > > > > >
> > > > > > If we choose to go with what you suggested earlier, i.e., just have
> > > > > > a hierarchy of directories, then step 3 will actually be covered
> > > > > > by step 2.
> > > > > >
> > > > > > In the current proposal, consumers will subscribe to a topic in a
> > > > > > namespace by specifying <namespace>:<topic> as the topic name.
> > > > > > They can subscribe to topics from multiple namespaces.
> > > > > >
> > > > > > Let me know if I totally missed your question.
> > > > > >
> > > > > > > Since the point of Kafka is sharing data I think it is really
> > > > > > > important that the grouping be just for
> > > > > > > convenience/permissions/config/etc and that it remain possible
> > > > > > > to access multiple directories/namespaces from the same client.
> > > > > > >
> > > > > > Totally agree with you.
> > > > > >
> > > > > > > -Jay
> > > > > > >
> > > > > > > On Fri, Oct 9, 2015 at 6:32 PM, Ashish Singh <asi...@cloudera.com> wrote:
> > > > > > >
> > > > > > > > Hey Guys,
> > > > > > > >
> > > > > > > > I just created KIP-37 for adding namespaces to Kafka.
> > > > > > > >
> > > > > > > > KIP-37
> > > > > > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-37+-+Add+Namespaces+to+Kafka>
> > > > > > > > tracks the proposal.
> > > > > > > >
> > > > > > > > The idea is to make Kafka support multi-tenancy via namespaces.
> > > > > > > >
> > > > > > > > Feedback and comments are welcome.
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Ashish
> > > > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Regards,
> > > > > > Ashish
> > > > > >
> > > >
> > > > --
> > > >
> > > > Regards,
> > > > Ashish
> > > >
>
> --
>
> Regards,
> Ashish
>
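
For reference, the directory-style semantics Jay describes in the quoted thread (a subscription path selects one partition, one topic, or everything under a directory, with paths not starting with '/' taken relative to a "current working directory") could be sketched roughly as below. This is only an illustration under the assumption that every partition is identified by a full path such as /a/b/c/0; `resolve` and `matching_partitions` are hypothetical names, not Kafka APIs, and wildcard or regex subscriptions are left out.

```python
import posixpath

# Hypothetical sketch: partitions are identified by full paths like "/a/b/c/0".

def resolve(cwd: str, path: str) -> str:
    """Interpret `path` relative to `cwd` unless it starts with '/', as in a unix fs."""
    return posixpath.normpath(posixpath.join(cwd, path))


def matching_partitions(subscription: str, partitions: list[str]) -> list[str]:
    """Return the partitions selected by subscribing to `subscription`.

    A partition matches if it is exactly the subscribed path, or if it lives
    anywhere below the subscribed directory (recursive match).
    """
    root = subscription.rstrip("/")
    return sorted(p for p in partitions if p == root or p.startswith(root + "/"))
```

Given the partitions /a/b/c/0, /a/b/c/1 and /a/b/d/0, subscribing to /a/b/c/0 selects a single partition, /a/b/c selects both partitions of "c", and /a/b selects all three. Matching on `root + "/"` rather than on the bare prefix keeps a sibling such as /a/b/c1 from being picked up by a subscription to /a/b/c.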