Thanks Joel and Ismael. I just updated the KIP based on your feedback. KIP-33 has passed with +4 (binding) and +2 (non-binding)
Thanks everyone for the reading, feedback and voting! Jiangjie (Becket) Qin On Tue, Apr 19, 2016 at 5:25 PM, Ismael Juma <ism...@juma.me.uk> wrote: > Thanks Becket. I think it would be nice to update the KIP with regards to > point 3 and 4. > > In any case, +1 (non-binding) > > Ismael > > On Tue, Apr 19, 2016 at 2:03 AM, Becket Qin <becket....@gmail.com> wrote: > > > Thanks for the comments Ismael. Please see the replies inline. > > > > On Mon, Apr 18, 2016 at 6:50 AM, Ismael Juma <ism...@juma.me.uk> wrote: > > > > > Hi Jiangjie, > > > > > > Thanks for the KIP, it's a nice improvement. Since it seems like we > have > > > been using the voting thread for discussion, I'll do the same. > > > > > > A few minor comments/questions: > > > > > > 1. The proposed name for the time index file > > `SegmentBaseOffset.timeindex`. > > > Would `SegmentBaseOffset.time-index` be a little better? It would > clearly > > > separate the type of index in case we add additional index types in the > > > future. > > > > > I have no strong opinion on this, I am not adding any thing separator > > because it is more regex friendly. > > I am not sure about the other indexes, time and space seems to be two > most > > common dimensions. > > > > 2. When describing the time index entry, we say "Offset - the next offset > > > when the time index entry is inserted". I found the mention of `next` a > > bit > > > confusing as it looks to me like the time index entry has the first > > offset > > > in the message set. > > > > This semantic meaning is a little different from the offset index. The > > offset index information is self-contained by nature. i.e. all the > offsets > > before is smaller than the offset of this message set. So we only need to > > say "the offset of this message set is OFFSET". This does not quite apply > > to the time index because the max timestamp may or may not be in the > > message set being appended. So we have to either say, "the max timestamp > > before I append this message set is T", or "the max timestamp after I > > appended this message set is T". The former case means that we can skip > all > > the previous messages if we are looking for a timestamp > T and start > from > > this offset. The latter one means if we are searching for timestamp > T, > we > > should start after this message set, which is essentially the same as the > > former case but require an additional interpretation. > > > > 3. We say "The default initial / max size of the time index files is the > > > same as the offset index files. (time index entry is 1.5x of the size > of > > > offset index entry, user should set the configuration accordingly)". It > > may > > > be worth elaborating a little on what a user should do with regards to > > this > > > configuration when upgrading (ie maybe under "Compatibility, > Deprecation, > > > and Migration Plan"). > > > > > Makes sense. > > > > > > > 4. In a previous vote thread, Jun said "The simplest thing is probably > > > to change > > > the default index size to 2MB to match the default log segment size" > and > > > you seemed to agree. I couldn't find anything about this in the KIP. > Are > > we > > > still doing it? > > > > > Yes, we can still make the change for default settings. User might want > to > > set the index size a little larger if they have a customized size but in > > reality it should not cause problems other than rolling out a little more > > log segments. > > > > 5. We say "Instead, it is only monotonically increasing within each time > > > index file. i.e. It is possible that the time index file for a later > log > > > segment contains smaller timestamp than some timestamp in the time > index > > > file of an earlier segment.". I think it would be good to explain under > > > which scenario a time index file for a later log segment contains a > > smaller > > > timestamp (is this only when CreateTime is used?). > > > > > Yes, it only happens when CreateTime is used. > > > > > > > 6. We say "When searching by timestamp, broker will start from the > > earliest > > > log segment and check the last time index entry.". The existing logic > > > searches from newest segment backwards. Is there a reason why we are > > > changing it? > > > > > Suppose segment 0 has max timestamp 100, segment 1 has max timestamp 50 > and > > segment 3 has max timestamp 90, now user want to search for timestamp 80. > > If we search backwards, we have to take a look at all the segments. If we > > search forward, we will stop at the first segment whose max timestamp is > > greater than 80 (i.e all the previous segments has smaller timestamps) > and > > start the finer search on that segment. > > > > > > > 7. Do you mind if I fix typos and minor grammar issues directly in the > > > wiki? It seems easier than doing that via email. > > > > > Not at all, thanks for help. > > > > > > > Thanks, > > > Ismael > > > > > > On Thu, Apr 7, 2016 at 1:44 AM, Becket Qin <becket....@gmail.com> > wrote: > > > > > > > Hi all, > > > > > > > > I updated KIP-33 based on the initial implementation. Per discussion > on > > > > yesterday's KIP hangout, I would like to initiate the new vote thread > > for > > > > KIP-33. > > > > > > > > The KIP wiki: > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index > > > > > > > > Here is a brief summary of the KIP: > > > > 1. We propose to add a time index for each log segment. > > > > 2. The time indices are going to be used of log retention, log > rolling > > > and > > > > message search by timestamp. > > > > > > > > There was an old voting thread which has some discussions on this > KIP. > > > The > > > > mail thread link is following: > > > > > > > > > > > > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E > > > > > > > > I have the following WIP patch for reference. It needs a few more > unit > > > > tests and documentation. Other than that it should run fine. > > > > > > > > > > > > > > https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7 > > > > > > > > Thanks, > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > >