Hey Jay, I agree with your comments on the KIP page. As you said it is important to document what are the changes and why we made those decisions. I have added summaries to the page to help readers understand the decision we made more easily and also try to avoid being too verbose. I also clarified the scope of KIP-32 and KIP-33. Please let me know if you think we should make further improvements.
Thanks, Jiangjie (Becket) Qin > On Jan 6, 2016, at 7:21 AM, Jay Kreps <j...@confluent.io> wrote: > > I'm +1 on the design, but this proposal doesn't look like it is quite > finished. We are trying to make the KIPs really be a record of design > decisions we made and why. We should be able to give the document to > advanced Kafka users and have them walk away with a good idea of what > we are changing and why. I think having this kind of record of why > changes were made is really valuable especially given the amount of > times these things come back up (I'm quite sure this saves time in the > long run). > > If people think this is a waste of time, let's revisit, but if not > could we try to bring this up to the level of quality of the other > KIPs? > > In particular, a few things: > > Overall this reads more like implementation notes I might make for > myself rather than a design doc. > > This doesn't really motivate the idea of creation vs append time or > when each would be used or why you would want both. > > The public interfaces section says "There are a few options to achieve > the goals. Each has their own pros and cons. Please see the details > below." This might have been true in the first version as we were > discussing but now we have one proposal, right? We should list out the > new public interfaces (binary format, java api, configs). This is a > way of calling out that "the user gets this new stuff". > > This still has a number of details that I think we've now moved to > KIP-33. i.e. the bits about indexing are now part of that? > > Cheers, > > -Jay > >> On Tue, Jan 5, 2016 at 11:47 AM, Neha Narkhede <n...@confluent.io> wrote: >> +1 on the KIP. Thanks for the hard work, Becket! >> >>> On Tue, Jan 5, 2016 at 11:24 AM, Jun Rao <j...@confluent.io> wrote: >>> >>> Jiangjie, >>> >>> Thanks for the updated KIP. Overall, a +1 on the proposal. A few minor >>> comments on the KIP. >>> >>> KIP-32: >>> 50. 6.c says "The log rolling has to depend on the earliest timestamp", >>> which is inconsistent with KIP-33. >>> >>> 51. 8.b "If the time difference threshold is set to 0. The timestamp in the >>> message is equivalent to LogAppendTime." If the time difference is 0 and >>> CreateTime is used, all messages will likely be rejected in this proposal. >>> So, it's not equivalent to LogAppendTime. >>> >>> 52. Could you include the new value of magic byte in message format change? >>> Also, do we have a single new message format that includes both the offset >>> change (relative offset for inner messages) and the addition of timestamp? >>> >>> 53. Could you document the changes in ProducerRequest V2 and FetchRequest >>> V2 (and the responses)? >>> >>> 54. In migration phase 1, step 2, does internal ApiVersion mean >>> inter.broker.protocol.version? >>> >>> 55. In canary step 2.b, it says "It will only see >>> ProduceRequest/FetchRequest V1 from other brokers and clietns.". But in >>> phase 2, a broker will receive FetchRequest V2 from other brokers. >>> >>> >>> KIP-33: >>> 60. The KIP still says maintaining index at "at minute granularity" even >>> though the index interval is configurable now. >>> >>> 61. In this design, it's possible for a log segment to have an empty time >>> index. In the worse case, we may have to scan more than the active segment >>> to recover the latest timestamp. >>> >>> Thanks, >>> >>> Jun >>> >>> On Mon, Jan 4, 2016 at 11:37 AM, Aditya Auradkar < >>> aaurad...@linkedin.com.invalid> wrote: >>> >>>> Hey Becket/Anna - >>>> >>>> I have a few comments about the KIP. >>>> >>>> 1. (Minor) Can we rename the KIP? It's currently "Add CreateTime and >>>> LogAppendTime etc..". This is actually the title of the now rejected >>> Option >>>> 1. >>>> 2. (Minor) Can we rename the proposed option? It isn't really "option 4" >>>> anymore. >>>> 3. I'm not clear on what exactly happens to compressed messages >>>> when message.timestamp.type=LogAppendTime? Does every batch get >>>> recompressed because the inner message gets rewritten with the server >>>> timestamp? Or does the message set on disk have the timestamp set to -1. >>> In >>>> that case, what do we use as timestamp for the message? >>>> 4. Do message.timestamp.type and max.message.time.difference.ms need to >>> be >>>> per-topic configs? It seems that this is really a client config i.e. a >>>> client is the source of timestamps not a topic. It could also be a >>>> broker-level config to keep things simple. >>>> 5. The "Proposed Changes" section in the KIP tries to build a time-based >>>> index for query but that is a separate proposal (KIP-33). Can we more >>>> crisply identify what exactly will change when this KIP (and 31) is >>>> implemented? It isn't super clear to me at this point. >>>> >>>> Aside from that, I think the "Rejected Alternatives" section of the KIP >>> is >>>> excellent. Very good insight into what options were discussed and >>> rejected. >>>> >>>> Aditya >>>> >>>>> On Mon, Dec 28, 2015 at 3:57 PM, Becket Qin <becket....@gmail.com> >>>> wrote: >>>> >>>>> Thanks Guozhang, Gwen and Neha for the comments. Sorry for late reply >>>>> because I only have occasional gmail access from my phone... >>>>> >>>>> I just updated the wiki for KIP-32. >>>>> >>>>> Gwen, >>>>> >>>>> Yes, the migration plan is what you described. >>>>> >>>>> I agree with your comments on the version. >>>>> I changed message.format.version to use the release version. >>>>> I did not change the internal version, we can discuss this in a >>> separate >>>>> thread. >>>>> >>>>> Thanks, >>>>> >>>>> Jiangjie (Becket) Qin >>>>> >>>>> >>>>> >>>>>> On Dec 24, 2015, at 5:38 AM, Guozhang Wang <wangg...@gmail.com> >>> wrote: >>>>>> >>>>>> Also I agree with Gwen that such changes may worth a 0.10 release or >>>> even >>>>>> 1.0, having it in 0.9.1 would be quite confusing to users. >>>>>> >>>>>> Guozhang >>>>>> >>>>>>> On Wed, Dec 23, 2015 at 1:36 PM, Guozhang Wang <wangg...@gmail.com> >>>>> wrote: >>>>>>> >>>>>>> Becket, >>>>>>> >>>>>>> Please let us know once you have updated the wiki page regarding the >>>>>>> migration plan. Thanks! >>>>>>> >>>>>>> Guozhang >>>>>>> >>>>>>>> On Wed, Dec 23, 2015 at 11:52 AM, Gwen Shapira <g...@confluent.io> >>>>> wrote: >>>>>>>> >>>>>>>> Thanks Becket, Anne and Neha for responding to my concern. >>>>>>>> >>>>>>>> I had an offline discussion with Anne where she helped me >>> understand >>>>> the >>>>>>>> migration process. It isn't as bad as it looks in the KIP :) >>>>>>>> >>>>>>>> If I understand it correctly, the process (for users) will be: >>>>>>>> >>>>>>>> 1. Prepare for upgrade (set format.version = 0, ApiVersion = 0.9.0) >>>>>>>> 2. Rolling upgrade of brokers >>>>>>>> 3. Bump ApiVersion to 0.9.0-1, so fetch requests between brokers >>> will >>>>> use >>>>>>>> the new protocol >>>>>>>> 4. Start upgrading clients >>>>>>>> 5. When "enough" clients are upgraded, bump format.version to 1 >>>>> (rolling). >>>>>>>> >>>>>>>> Becket, can you confirm? >>>>>>>> >>>>>>>> Assuming this is the process, I'm +1 on the change. >>>>>>>> >>>>>>>> Reminder to coders and reviewers that pull-requests with >>> user-facing >>>>>>>> changes should include documentation changes as well as code >>> changes. >>>>>>>> And a polite request to try to be helpful to users on when to use >>>>>>>> create-time and when to use log-append-time as configuration - this >>>> is >>>>> not >>>>>>>> a trivial decision. >>>>>>>> >>>>>>>> A separate point I'm going to raise in a different thread is that >>> we >>>>> need >>>>>>>> to streamline our versions a bit: >>>>>>>> 1. I'm afraid that 0.9.0-1 will be confusing to users who care >>> about >>>>>>>> released versions (what if we forget to change it before the >>> release? >>>>> Is >>>>>>>> it >>>>>>>> meaningful enough to someone running off trunk?), we need to come >>> up >>>>> with >>>>>>>> something that will work for both LinkedIn and everyone else. >>>>>>>> 2. ApiVersion has real version numbers. message.format.version has >>>>>>>> sequence >>>>>>>> numbers. This makes us look pretty silly :) >>>>>>>> >>>>>>>> My version concerns can be addressed separately and should not hold >>>>> back >>>>>>>> this KIP. >>>>>>>> >>>>>>>> Gwen >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Dec 22, 2015 at 11:01 PM, Becket Qin <becket....@gmail.com >>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Anna, >>>>>>>>> >>>>>>>>> Thanks for initiating the voting process. I did not start the >>> voting >>>>>>>>> process because there were still some ongoing discussion with Jun >>>>> about >>>>>>>> the >>>>>>>>> timestamp regarding compressed messages. That is why the wiki page >>>>>>>> hasn't >>>>>>>>> reflected the latest conversation as Guozhang pointed out. >>>>>>>>> >>>>>>>>> Like Neha said I think we have reached general agreement on this >>>> KIP. >>>>> So >>>>>>>>> it is probably fine to start the KIP voting. At least we draw more >>>>>>>>> attention to the KIP even if there are some new discussion to >>> bring >>>>> up. >>>>>>>>> >>>>>>>>> Regarding the upgrade plan, given we decided to implement KIP-31 >>> and >>>>>>>>> KIP-32 in the same patch to avoid change binary protocol twice, >>> the >>>>>>>> upgrade >>>>>>>>> plan was mostly discussed on the discussion thread of KIP-31. >>>>>>>>> >>>>>>>>> Since the voting has been initiated, I will update the wiki with >>>>> latest >>>>>>>>> conversation to avoid further confusion. >>>>>>>>> >>>>>>>>> BTW, I actually have started coding work on KIP-31 and KIP-32 and >>>> will >>>>>>>>> focus on the patch before I return from vacation in mid Jan >>> because >>>> I >>>>>>>> have >>>>>>>>> no LInkedIn VPN access in China anyway... >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Jiangjie >>>>>>>>> >>>>>>>>>> On Dec 23, 2015, at 12:31 PM, Anna Povzner <a...@confluent.io> >>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi Gwen, >>>>>>>>>> >>>>>>>>>> I just wanted to point out that I just started the vote. Becket >>>> wrote >>>>>>>> the >>>>>>>>>> proposal and led the discussions. >>>>>>>>>> >>>>>>>>>> What I understood from reading the discussion thread, the >>> migration >>>>>>>> plan >>>>>>>>>> was discussed at the KIP meeting, and not much on the mailing >>> list >>>>>>>>> itself. >>>>>>>>>> >>>>>>>>>> My question about the migration plan was same as Guozhang wrote: >>>> The >>>>>>>> case >>>>>>>>>> when an upgraded broker receives an old producer request. The >>>>>>>> proposal is >>>>>>>>>> for the broker to fill in the timestamp field with the current >>> time >>>>> at >>>>>>>>> the >>>>>>>>>> broker. Cons: it goes against the definition of CreateTime type >>> of >>>>> the >>>>>>>>>> timestamp (we are "over-writing" it at the broker). Pros: It >>> looks >>>>>>>> like >>>>>>>>>> most of the use-cases would actually want that behavior, because >>>>>>>>> otherwise >>>>>>>>>> timestamp is useless and they will need to support an old, >>>>>>>> pre-timestamp, >>>>>>>>>> behavior. E.g., if we modify log retention policy to use the >>>>>>>> timestamp, >>>>>>>>> we >>>>>>>>>> would need to support an old implementation (the one that does >>> not >>>>> use >>>>>>>>>> timestamps in the message). So I actually have a preference for >>> the >>>>>>>>>> proposed approach. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Anna >>>>>>>>>> >>>>>>>>>>> On Tue, Dec 22, 2015 at 8:02 PM, Neha Narkhede < >>> n...@confluent.io >>>>> >>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hey Gwen, >>>>>>>>>>> >>>>>>>>>>> Migration plan wasn't really discussed a ton in the previous >>>>> threads. >>>>>>>>> So it >>>>>>>>>>> will be great to dive deep and see if there are gaps there. I >>> had >>>>>>>> some >>>>>>>>>>> questions, but the details listed on the KIP are great. >>>>>>>>>>> >>>>>>>>>>> It is complex, though the plan outlined in the wiki assumes a >>> zero >>>>>>>>> downtime >>>>>>>>>>> upgrade assuming that producers and consumers can't be upgraded >>> in >>>>>>>>> tandem. >>>>>>>>>>> This is typical for companies that have a significant Kafka >>>>>>>> footprint, >>>>>>>>> like >>>>>>>>>>> LinkedIn. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Neha >>>>>>>>>>> >>>>>>>>>>>> On Tue, Dec 22, 2015 at 7:48 PM, Gwen Shapira < >>> g...@confluent.io >>>>> >>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Anna, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the KIP, especially for the details on all the >>>>>>>> alternatives >>>>>>>>>>> and >>>>>>>>>>>> how we arrived at the proposal. Its really great! >>>>>>>>>>>> >>>>>>>>>>>> Can you point me at where the migration plan was discussed? It >>>>> looks >>>>>>>>>>> overly >>>>>>>>>>>> complex and I have a bunch of questions, but if there was a >>>>>>>> discussion, >>>>>>>>>>> I'd >>>>>>>>>>>> like to read up rather than repeating it :) >>>>>>>>>>>> >>>>>>>>>>>> Gwen >>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Dec 22, 2015 at 12:34 PM, Anna Povzner < >>>> a...@confluent.io >>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> I am opening the voting thread for KIP-32: Add CreateTime and >>>>>>>>>>>>> LogAppendTime to Kafka message. >>>>>>>>>>>>> >>>>>>>>>>>>> For reference, here's the KIP wiki: >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+CreateTime+and+LogAppendTime+to+Kafka+message >>>>>>>>>>>>> >>>>>>>>>>>>> And the mailing list threads: >>>>>>>>>>>>> >>>>>>>>>>>>> September: >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm6NMg%3DPh4HAJdxr%3DpmZhfFcD5OEV2yxj3fg%2BXnEBTW%2B3w%40mail.gmail.com%3E >>>>>>>>>>>>> >>>>>>>>>>>>> October: >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/%3CCAHrRUm7RiBAJxwO15s1tztz%3D15oibO-QJ%2B_w8AxafTnuw3jjCw%40mail.gmail.com%3E >>>>>>>>>>>>> >>>>>>>>>>>>> December: >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201512.mbox/%3CCAHrRUm4ugxDYzyy26MGRGKpK4hsjT4EKTuu18M3wztYq4PE%3DaQ%40mail.gmail.com%3E >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks >>>>>>>>>>>>> Anna >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Thanks, >>>>>>>>>>> Neha >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> -- Guozhang >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> -- Guozhang >> >> >> >> -- >> Thanks, >> Neha