Re: KIP-33 Opt out from Time Based indexing

2016-09-08 Thread Jan Filipiak
Hi Jun, thanks a lot for the hint, I'll check it out when I get a free minute! Best Jan On 07.09.2016 00:35, Jun Rao wrote: Jan, For the time rolling issue, Jiangjie has committed a fix ( https://issues.apache.org/jira/browse/KAFKA-4099) to trunk. Perhaps you can help test out trunk and see if

Re: KIP-33 Opt out from Time Based indexing

2016-09-06 Thread Jun Rao
Jan, For the time rolling issue, Jiangjie has committed a fix ( https://issues.apache.org/jira/browse/KAFKA-4099) to trunk. Perhaps you can help test out trunk and see if there are any other issues related to time-based index? Thanks, Jun On Mon, Sep 5, 2016 at 11:52 PM, Jan Filipiak wrote: >

Re: KIP-33 Opt out from Time Based indexing

2016-09-05 Thread Jan Filipiak
Hi Jun, sorry for the late reply. Regarding B, my main concern was just the complexity of understanding what's going on. As you can see, it took me probably some 2 days or so to fully grasp all the details in the implementation and what the impacts are. Usually I prefer to turn things I don't use of

Re: KIP-33 Opt out from Time Based indexing

2016-08-29 Thread Becket Qin
Hi Jun, I just created KAFKA-4099 and will submit patch soon. Thanks, Jiangjie (Becket) Qin On Mon, Aug 29, 2016 at 11:55 AM, Jun Rao wrote: > Jiangjie, > > Good point on the time index format related to uncompressed messages. It > does seem that indexing based on file position requires a bit

Re: KIP-33 Opt out from Time Based indexing

2016-08-29 Thread Jun Rao
Jiangjie, Good point on the time index format related to uncompressed messages. It does seem that indexing based on file position requires a bit more complexity. Since the time index is going to be used infrequently, having a level of indirection doesn't seem a big concern. So, we can leave the lo

Re: KIP-33 Opt out from Time Based indexing

2016-08-29 Thread Jun Rao
Jan, For the usefulness of time index, it's ok if you don't plan to use it. However, I do think there are other people who will want to use it. Fixing an application bug always requires some additional work. Intuitively, being able to seek back to a particular point of time for replay is going to

Re: KIP-33 Opt out from Time Based indexing

2016-08-28 Thread Becket Qin
Jan, Thanks for the example of reprocessing the messages. I think in any case, reconsuming all the messages will definitely work. What we want to do here is to see if we can avoid doing that by only reconsuming necessary messages. In the scenario you mentioned, can you store an "offset-of-last-up

Re: KIP-33 Opt out from Time Based indexing

2016-08-26 Thread Jan Filipiak
Hi Jun, thanks for taking the time to answer on such a detailed level. You are right, Log.fetchOffsetByTimestamp works; the comment is just confusing: "// Get all the segments whose largest timestamp is smaller than target timestamp", which apparently is not what takeWhile does (I am more on th

Re: KIP-33 Opt out from Time Based indexing

2016-08-26 Thread Becket Qin
Jun, Good point about the new log rolling behavior issue when moving replicas. Keeping the old behavior sounds reasonable to me. Currently the time index entry points to the exact shallow message with the indexed timestamp. Are you suggesting we change it to point to the starting offset of the appended

Re: KIP-33 Opt out from Time Based indexing

2016-08-26 Thread Jun Rao
Jiangjie, I am not sure about changing the default to LogAppendTime since CreateTime is probably what most people want. It also doesn't solve the problem completely. For example, if you do partition reassignment and need to copy a bunch of old log segments to a new broker, this may cause log rolli

Re: KIP-33 Opt out from Time Based indexing

2016-08-25 Thread Becket Qin
Hi Jan, It seems your main concern is the changed behavior of time-based log rolling and time-based retention. That is actually why we have two timestamp types. If a user sets log.message.timestamp.type to LogAppendTime, the broker will behave exactly the same as before, except the rolling

Re: KIP-33 Opt out from Time Based indexing

2016-08-25 Thread Jun Rao
Jan, Thanks a lot for the feedback. Now I understand your concern better. The following are my comments. The first odd thing that you pointed out could be a real concern. Basically, if a producer publishes messages with really old timestamps, our default log.roll.hours (7 days) will indeed cause t

Re: KIP-33 Opt out from Time Based indexing

2016-08-24 Thread Jan Filipiak
Hey Jun, I'll go and try again :), I wrote the first one in quite a stressful environment. The bottom line is that I, for our use cases, see too small a use/effort ratio in this time index. We do not bootstrap new consumers for key-less logs so frequently, and when we do, they usually want everyth

Re: Re: KIP-33 Opt out from Time Based indexing

2016-08-24 Thread Jun Rao
Jan, Thanks for the reply. I actually wasn't sure what your main concern on time-based rolling is. Just a couple of clarifications. (1) Time-based rolling doesn't control how long a segment will be retained for. For retention, if you use time-based, it will now be based on the timestamp in the mes

Re: Re: KIP-33 Opt out from Time Based indexing

2016-08-24 Thread Jan Filipiak
Hi Jun, I copy-pasted this mail from the archive, as I somehow didn't receive it by mail. I will still make some comments inline; hopefully you can find them quickly enough, my apologies. To make things clearer, you should also know that all messages in our Kafka setup have a common way to

Re: KIP-33 Opt out from Time Based indexing

2016-08-22 Thread Jun Rao
Jan, Currently, there is no switch to disable the time-based index. There are quite a few use cases for the time-based index. 1. From KIP-33's wiki, it allows us to do time-based retention accurately. Before KIP-33, time-based retention was based on the last modified time of each log segment. The
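
The retention point in the preview above can be illustrated with a small sketch. It is not Kafka code: the Segment class, its field names, and both helper functions are hypothetical, and it assumes that timestamp-based retention compares the largest message timestamp in a segment against the retention window. It only shows why file modification time and message time can disagree, for instance after a segment has been copied to a new broker.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment; not a Kafka class."""
    last_modified_ms: int      # modification time of the segment file
    largest_timestamp_ms: int  # largest message timestamp stored in the segment


def expired_by_mtime(seg: Segment, now_ms: int, retention_ms: int) -> bool:
    # Retention judged by the file's last modified time.
    return now_ms - seg.last_modified_ms > retention_ms


def expired_by_message_time(seg: Segment, now_ms: int, retention_ms: int) -> bool:
    # Retention judged by message timestamps (assumed: largest timestamp in the segment).
    return now_ms - seg.largest_timestamp_ms > retention_ms


# A segment holding 10-day-old messages whose file was just rewritten,
# e.g. because it was copied to another broker.
now_ms = 20 * DAY_MS
copied = Segment(last_modified_ms=now_ms, largest_timestamp_ms=now_ms - 10 * DAY_MS)
retention_ms = 7 * DAY_MS

print(expired_by_mtime(copied, now_ms, retention_ms))
print(expired_by_message_time(copied, now_ms, retention_ms))
```

With a 7-day retention window, the modification-time check keeps the freshly copied segment while the message-timestamp check marks it as expired, which is the kind of inaccuracy the time-based index is meant to remove.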

Re: KIP-33 Opt out from Time Based indexing

2016-08-22 Thread Jay Kreps
Can you describe the behavior you saw that you didn't like? -Jay On Mon, Aug 22, 2016 at 12:24 AM, Jan Filipiak wrote: > Hello everyone, > > I stumbled across KIP-33 and the time based index, while briefly checking > the wiki and commits, I fail to find a way to opt out. > I saw it having quite

KIP-33 Opt out from Time Based indexing

2016-08-22 Thread Jan Filipiak
Hello everyone, I stumbled across KIP-33 and the time-based index. While briefly checking the wiki and commits, I failed to find a way to opt out. I saw it having quite some impact on when logs are rolled and was hoping not to have to deal with all of that. Is there a disable switch I overlooked