Hi Becket and everyone,

Could we please add the following functionality to this KIP. I think it
would be very useful for the broker to return the timestamp in the ack to
the producer (in response: timestamp per partition) and propagate it back
to client in RecordMetadata. This way, if timestamp type is LogAppendTime,
the producer client will see what timestamp was actually set -- and it
would match the timestamp that consumer sees. Also, returning the timestamp
in RecordMetadata is also useful for timestamp type = CreateTime, since
timestamp could be also set in KafkaProducer (if client set timestamp in
ProducerRecord to 0).

Since this requires protocol change as well, it will be better to implement
this as part of KIP-32, rather than proposing a new KIP.

Thanks,
Anna


On Fri, Jan 8, 2016 at 12:53 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> +1 from me
>
> Looking through this thread it seems there was some confusion on the
> migration discussion. This discussion in fact happened in the KIP-31
> discuss thread, not so much in the KIP hangout. There is considerable
> overlap in discussions between KIP-3[1,2,3] so it makes sense to
> cross-reference all of these.
>
> I'm finding the Apache list archive a little cumbersome to use (e.g., the
> current link in KIP-31 points to the beginning of September archives) but
> the emails discussing migration were in October:
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/thread
>
> Markmail has a better interface but interestingly it has not indexed any of
> the emails from August, September and early October (
>
> http://markmail.org/search/?q=list%3Aorg.apache.incubator.kafka-dev+date%3A201509-201511+order%3Adate-backward
> ).
> Perhaps KIPs should include a permalink to the first message of the DISCUSS
> thread. E.g.,
>
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm5jvL_dPeZWnfBD-vONgSZWOq1VL1Ss8OSUOCPXmtg8rQ%40mail.gmail.com%3E
>
> Also, just to clarify Jay's comments on the content of KIPs: I think having
> a pseudo-code spec/implementation guide is useful (especially for
> client-side KIPs). While the motivation should definitely capture “why we
> are doing the KIP” it probably shouldn’t have to exhaustively capture “why
> we are doing the KIP *this way*”. i.e., some of the discussions are
> extremely nuanced and in this case spans multiple KIPs so links to other
> KIPs and the discuss threads and KIP hangout recordings are perhaps
> sufficient to fill this gap - or maybe a new section that summarizes the
> discussions.
>
> Thanks,
>
> Joel
>
> On Wed, Jan 6, 2016 at 9:29 AM, Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Jiangjie,
> >
> > 52. Replacing MessageSet with o.a.k.common.record will be ideal.
> > Unfortunately, we use MessageSet in SimpleConsumer, which is part of the
> > public api. Replacing MessageSet with o.a.k.common.record will be an
> > incompatible api change. So, we probably should do this after we
> deprecate
> > SimpleConsumer.
> >
> > My original question is actually whether we just bump up magic byte in
> > Message once to incorporate both the offset and the timestamp change. It
> > seems that the answer is yes. Could you reflect that in the KIP?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin <becket....@gmail.com> wrote:
> >
> > > Thanks a lot for the careful reading, Jun.
> > > Please see inline replies.
> > >
> > >
> > > > On Jan 6, 2016, at 3:24 AM, Jun Rao <j...@confluent.io> wrote:
> > > >
> > > > Jiangjie,
> > > >
> > > > Thanks for the updated KIP. Overall, a +1 on the proposal. A few
> minor
> > > > comments on the KIP.
> > > >
> > > > KIP-32:
> > > > 50. 6.c says "The log rolling has to depend on the earliest
> timestamp",
> > > > which is inconsistent with KIP-33.
> > > Corrected.
> > > >
> > > > 51. 8.b "If the time difference threshold is set to 0. The timestamp
> in
> > > the
> > > > message is equivalent to LogAppendTime." If the time difference is 0
> > and
> > > > CreateTime is used, all messages will likely be rejected in this
> > > proposal.
> > > > So, it's not equivalent to LogAppendTime.
> > > Corrected.
> > > >
> > > > 52. Could you include the new value of magic byte in message format
> > > change?
> > > > Also, do we have a single new message format that includes both the
> > > offset
> > > > change (relative offset for inner messages) and the addition of
> > > timestamp?
> > > I am actually thinking about this when I am writing the patch.
> > > The timestamp will be added to the o.a.k.common.record.Record and
> > > Kafka.message.Message. The offset change is in
> > > o.a.k.common.record.MemoryRecords and Kafka.message.MessageSet. To
> avoid
> > > unnecessary changes, my current patch did not merge them together but
> > > simply make sure the version of  Record(Message) and
> > > MemoryRecords(MessageSet) matches.
> > >
> > > Currently new clients uses classes in o.a.k.common.record, and the
> broker
> > > and old clients uses classes in kafka.message.
> > > I am thinking about doing the followings:
> > > 1. Migrate broker to use o.a.k.common.record.
> > > 2. Add message format V0 and V1 to o.a.k.common.protocol.Protocols.
> > > Ideally we should be able to define all the wire protocols in
> > > o.a.k.common.protocol.Protocol. So instead of having Record class to
> > parse
> > > byte arrays by itself, we can use Schema to parse the records.
> > >
> > > Would that be better?
> > > >
> > > > 53. Could you document the changes in ProducerRequest V2 and
> > FetchRequest
> > > > V2 (and the responses)?
> > > Done.
> > > >
> > > > 54. In migration phase 1, step 2, does internal ApiVersion mean
> > > > inter.broker.protocol.version?
> > > Yes.
> > > >
> > > > 55. In canary step 2.b, it says "It will only see
> > > > ProduceRequest/FetchRequest V1 from other brokers and clietns.". But
> in
> > > > phase 2, a broker will receive FetchRequest V2 from other brokers.
> > > I meant when we canary a broker in phase 2, there will be only one
> broker
> > > entering phase 2, the other brokers will remain at phase 1.
> > > >
> > > >
> > > > KIP-33:
> > > > 60. The KIP still says maintaining index at "at minute granularity"
> > even
> > > > though the index interval is configurable now.
> > > Corrected.
> > > >
> > > > 61. In this design, it's possible for a log segment to have an empty
> > time
> > > > index. In the worse case, we may have to scan more than the active
> > > segment
> > > > to recover the latest timestamp.
> > > Corrected.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Mon, Jan 4, 2016 at 11:37 AM, Aditya Auradkar <
> > > > aaurad...@linkedin.com.invalid> wrote:
> > > >
> > > >> Hey Becket/Anna -
> > > >>
> > > >> I have a few comments about the KIP.
> > > >>
> > > >> 1. (Minor) Can we rename the KIP? It's currently "Add CreateTime and
> > > >> LogAppendTime etc..". This is actually the title of the now rejected
> > > Option
> > > >> 1.
> > > >> 2. (Minor) Can we rename the proposed option? It isn't really
> "option
> > 4"
> > > >> anymore.
> > > >> 3. I'm not clear on what exactly happens to compressed messages
> > > >> when message.timestamp.type=LogAppendTime? Does every batch get
> > > >> recompressed because the inner message gets rewritten with the
> server
> > > >> timestamp? Or does the message set on disk have the timestamp set to
> > > -1. In
> > > >> that case, what do we use as timestamp for the message?
> > > >> 4. Do message.timestamp.type and max.message.time.difference.ms
> need
> > > to be
> > > >> per-topic configs? It seems that this is really a client config
> i.e. a
> > > >> client is the source of timestamps not a topic. It could also be a
> > > >> broker-level config to keep things simple.
> > > >> 5. The "Proposed Changes" section in the KIP tries to build a
> > time-based
> > > >> index for query but that is a separate proposal (KIP-33). Can we
> more
> > > >> crisply identify what exactly will change when this KIP (and 31) is
> > > >> implemented? It isn't super clear to me at this point.
> > > >>
> > > >> Aside from that, I think the "Rejected Alternatives" section of the
> > KIP
> > > is
> > > >> excellent. Very good insight into what options were discussed and
> > > rejected.
> > > >>
> > > >> Aditya
> > > >>
> > > >>> On Mon, Dec 28, 2015 at 3:57 PM, Becket Qin <becket....@gmail.com>
> > > wrote:
> > > >>>
> > > >>> Thanks Guozhang, Gwen and Neha for the comments. Sorry for late
> reply
> > > >>> because I only have occasional gmail access from my phone...
> > > >>>
> > > >>> I just updated the wiki for KIP-32.
> > > >>>
> > > >>> Gwen,
> > > >>>
> > > >>> Yes, the migration plan is what you described.
> > > >>>
> > > >>> I agree with your comments on the version.
> > > >>> I changed message.format.version to use the release version.
> > > >>> I did not change the internal version, we can discuss this in a
> > > separate
> > > >>> thread.
> > > >>>
> > > >>> Thanks,
> > > >>>
> > > >>> Jiangjie (Becket) Qin
> > > >>>
> > > >>>
> > > >>>
> > > >>>> On Dec 24, 2015, at 5:38 AM, Guozhang Wang <wangg...@gmail.com>
> > > wrote:
> > > >>>>
> > > >>>> Also I agree with Gwen that such changes may worth a 0.10 release
> or
> > > >> even
> > > >>>> 1.0, having it in 0.9.1 would be quite confusing to users.
> > > >>>>
> > > >>>> Guozhang
> > > >>>>
> > > >>>>> On Wed, Dec 23, 2015 at 1:36 PM, Guozhang Wang <
> wangg...@gmail.com
> > >
> > > >>> wrote:
> > > >>>>>
> > > >>>>> Becket,
> > > >>>>>
> > > >>>>> Please let us know once you have updated the wiki page regarding
> > the
> > > >>>>> migration plan. Thanks!
> > > >>>>>
> > > >>>>> Guozhang
> > > >>>>>
> > > >>>>>> On Wed, Dec 23, 2015 at 11:52 AM, Gwen Shapira <
> g...@confluent.io
> > >
> > > >>> wrote:
> > > >>>>>>
> > > >>>>>> Thanks Becket, Anne and Neha for responding to my concern.
> > > >>>>>>
> > > >>>>>> I had an offline discussion with Anne where she helped me
> > understand
> > > >>> the
> > > >>>>>> migration process. It isn't as bad as it looks in the KIP :)
> > > >>>>>>
> > > >>>>>> If I understand it correctly, the process (for users) will be:
> > > >>>>>>
> > > >>>>>> 1. Prepare for upgrade (set format.version = 0, ApiVersion =
> > 0.9.0)
> > > >>>>>> 2. Rolling upgrade of brokers
> > > >>>>>> 3. Bump ApiVersion to 0.9.0-1, so fetch requests between brokers
> > > will
> > > >>> use
> > > >>>>>> the new protocol
> > > >>>>>> 4. Start upgrading clients
> > > >>>>>> 5. When "enough" clients are upgraded, bump format.version to 1
> > > >>> (rolling).
> > > >>>>>>
> > > >>>>>> Becket, can you confirm?
> > > >>>>>>
> > > >>>>>> Assuming this is the process, I'm +1 on the change.
> > > >>>>>>
> > > >>>>>> Reminder to coders and reviewers that pull-requests with
> > user-facing
> > > >>>>>> changes should include documentation changes as well as code
> > > changes.
> > > >>>>>> And a polite request to try to be helpful to users on when to
> use
> > > >>>>>> create-time and when to use log-append-time as configuration -
> > this
> > > >> is
> > > >>> not
> > > >>>>>> a trivial decision.
> > > >>>>>>
> > > >>>>>> A separate point I'm going to raise in a different thread is
> that
> > we
> > > >>> need
> > > >>>>>> to streamline our versions a bit:
> > > >>>>>> 1. I'm afraid that 0.9.0-1 will be confusing to users who care
> > about
> > > >>>>>> released versions (what if we forget to change it before the
> > > release?
> > > >>> Is
> > > >>>>>> it
> > > >>>>>> meaningful enough to someone running off trunk?), we need to
> come
> > up
> > > >>> with
> > > >>>>>> something that will work for both LinkedIn and everyone else.
> > > >>>>>> 2. ApiVersion has real version numbers. message.format.version
> has
> > > >>>>>> sequence
> > > >>>>>> numbers. This makes us look pretty silly :)
> > > >>>>>>
> > > >>>>>> My version concerns can be addressed separately and should not
> > hold
> > > >>> back
> > > >>>>>> this KIP.
> > > >>>>>>
> > > >>>>>> Gwen
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Tue, Dec 22, 2015 at 11:01 PM, Becket Qin <
> > becket....@gmail.com>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hi Anna,
> > > >>>>>>>
> > > >>>>>>> Thanks for initiating the voting process. I did not start the
> > > voting
> > > >>>>>>> process because there were still some ongoing discussion with
> Jun
> > > >>> about
> > > >>>>>> the
> > > >>>>>>> timestamp regarding compressed messages. That is why the wiki
> > page
> > > >>>>>> hasn't
> > > >>>>>>> reflected the latest conversation as Guozhang pointed out.
> > > >>>>>>>
> > > >>>>>>> Like Neha said I think we have reached general agreement on
> this
> > > >> KIP.
> > > >>> So
> > > >>>>>>> it is probably fine to start the KIP voting. At least we draw
> > more
> > > >>>>>>> attention to the KIP even if there are some new discussion to
> > bring
> > > >>> up.
> > > >>>>>>>
> > > >>>>>>> Regarding the upgrade plan, given we decided to implement
> KIP-31
> > > and
> > > >>>>>>> KIP-32 in the same patch to avoid change binary protocol twice,
> > the
> > > >>>>>> upgrade
> > > >>>>>>> plan was mostly discussed on the discussion thread of KIP-31.
> > > >>>>>>>
> > > >>>>>>> Since the voting has been initiated, I will update the wiki
> with
> > > >>> latest
> > > >>>>>>> conversation to avoid further confusion.
> > > >>>>>>>
> > > >>>>>>> BTW, I actually have started coding work on KIP-31 and KIP-32
> and
> > > >> will
> > > >>>>>>> focus on the patch before I return from vacation in mid Jan
> > because
> > > >> I
> > > >>>>>> have
> > > >>>>>>> no LInkedIn VPN access in China anyway...
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>>
> > > >>>>>>> Jiangjie
> > > >>>>>>>
> > > >>>>>>>> On Dec 23, 2015, at 12:31 PM, Anna Povzner <a...@confluent.io
> >
> > > >>> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Hi Gwen,
> > > >>>>>>>>
> > > >>>>>>>> I just wanted to point out that I just started the vote.
> Becket
> > > >> wrote
> > > >>>>>> the
> > > >>>>>>>> proposal and led the discussions.
> > > >>>>>>>>
> > > >>>>>>>> What I understood from reading the discussion thread, the
> > > migration
> > > >>>>>> plan
> > > >>>>>>>> was discussed at the KIP meeting, and not much on the mailing
> > list
> > > >>>>>>> itself.
> > > >>>>>>>>
> > > >>>>>>>> My question about the migration plan was same as Guozhang
> wrote:
> > > >> The
> > > >>>>>> case
> > > >>>>>>>> when an upgraded broker receives an old producer request. The
> > > >>>>>> proposal is
> > > >>>>>>>> for the broker to fill in the timestamp field with the current
> > > time
> > > >>> at
> > > >>>>>>> the
> > > >>>>>>>> broker. Cons: it goes against the definition of CreateTime
> type
> > of
> > > >>> the
> > > >>>>>>>> timestamp (we are "over-writing" it at the broker). Pros: It
> > looks
> > > >>>>>> like
> > > >>>>>>>> most of the use-cases would actually want that behavior,
> because
> > > >>>>>>> otherwise
> > > >>>>>>>> timestamp is useless and they will need to support an old,
> > > >>>>>> pre-timestamp,
> > > >>>>>>>> behavior. E.g., if we modify log retention policy to use the
> > > >>>>>> timestamp,
> > > >>>>>>> we
> > > >>>>>>>> would need to support an old implementation (the one that does
> > not
> > > >>> use
> > > >>>>>>>> timestamps in the message). So I actually have a preference
> for
> > > the
> > > >>>>>>>> proposed approach.
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> Anna
> > > >>>>>>>>
> > > >>>>>>>>> On Tue, Dec 22, 2015 at 8:02 PM, Neha Narkhede <
> > > n...@confluent.io
> > > >>>
> > > >>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Hey Gwen,
> > > >>>>>>>>>
> > > >>>>>>>>> Migration plan wasn't really discussed a ton in the previous
> > > >>> threads.
> > > >>>>>>> So it
> > > >>>>>>>>> will be great to dive deep and see if there are gaps there. I
> > had
> > > >>>>>> some
> > > >>>>>>>>> questions, but the details listed on the KIP are great.
> > > >>>>>>>>>
> > > >>>>>>>>> It is complex, though the plan outlined in the wiki assumes a
> > > zero
> > > >>>>>>> downtime
> > > >>>>>>>>> upgrade assuming that producers and consumers can't be
> upgraded
> > > in
> > > >>>>>>> tandem.
> > > >>>>>>>>> This is typical for companies that have a significant Kafka
> > > >>>>>> footprint,
> > > >>>>>>> like
> > > >>>>>>>>> LinkedIn.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>> Neha
> > > >>>>>>>>>
> > > >>>>>>>>>> On Tue, Dec 22, 2015 at 7:48 PM, Gwen Shapira <
> > > g...@confluent.io
> > > >>>
> > > >>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Hi Anna,
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks for the KIP, especially for the details on all the
> > > >>>>>> alternatives
> > > >>>>>>>>> and
> > > >>>>>>>>>> how we arrived at the proposal. Its really great!
> > > >>>>>>>>>>
> > > >>>>>>>>>> Can you point me at where the migration plan was discussed?
> It
> > > >>> looks
> > > >>>>>>>>> overly
> > > >>>>>>>>>> complex and I have a bunch of questions, but if there was a
> > > >>>>>> discussion,
> > > >>>>>>>>> I'd
> > > >>>>>>>>>> like to read up rather than repeating it :)
> > > >>>>>>>>>>
> > > >>>>>>>>>> Gwen
> > > >>>>>>>>>>
> > > >>>>>>>>>>> On Tue, Dec 22, 2015 at 12:34 PM, Anna Povzner <
> > > >> a...@confluent.io
> > > >>>>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I am opening the voting thread for KIP-32: Add CreateTime
> and
> > > >>>>>>>>>>> LogAppendTime to Kafka message.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> For reference, here's the KIP wiki:
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+CreateTime+and+LogAppendTime+to+Kafka+message
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> And the mailing list threads:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> September:
> > > >>
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm6NMg%3DPh4HAJdxr%3DpmZhfFcD5OEV2yxj3fg%2BXnEBTW%2B3w%40mail.gmail.com%3E
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> October:
> > > >>
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/%3CCAHrRUm7RiBAJxwO15s1tztz%3D15oibO-QJ%2B_w8AxafTnuw3jjCw%40mail.gmail.com%3E
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> December:
> > > >>
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201512.mbox/%3CCAHrRUm4ugxDYzyy26MGRGKpK4hsjT4EKTuu18M3wztYq4PE%3DaQ%40mail.gmail.com%3E
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks
> > > >>>>>>>>>>> Anna
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>> Neha
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> -- Guozhang
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> --
> > > >>>> -- Guozhang
> > > >>
> > >
> >
>

Reply via email to