[DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Konstantin Chukhlomin
Hi all, I have created a KIP to support negative timestamp: https://cwiki.apache.org/confluence/display/KAFKA/KIP-228+Negative+record+timestamp+support Here are proposed changes: https://github.com/ap

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-18 Thread Ewen Cheslack-Postava
I think the trivial change of just recognizing using -1 was a mistake for a sentinel value and special casing it while allowing other negative values through is the most practical, reasonable change. Realistically, the scope of impact for that -1 is pretty tiny, as has been pointed out. A single m

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-28 Thread Matthias J. Sax
I agree that changing message format or using a flag bit might not be worth it. However, just keeping -1 as "unknown" leaving a time gap give me a lot of headache, too. Your arguments about "not an issue in practice" kinda make sense to me, but I see the number of question on the mailing list alre

Re: [DISCUSS] KIP-228 Negative record timestamp support

2018-01-02 Thread Ewen Cheslack-Postava
For `allow.negative.timestamps`, do you mean this as a broker config? I'm not entirely clear on what the proposal would entail. I think taking into account whether we're talking about compatibility with existing data in Kafka vs enabling use of negative timestamps is important here. If they're eff

Re: [DISCUSS] KIP-228 Negative record timestamp support

2018-01-02 Thread Matthias J. Sax
I was thinking about a broker/topic config. However, I am not sure if we only need to worry about data written in the future (this would only be true, if there would be no records with -1 timestamp already). Assume that we you an existing topic that contains data with -1 = UNKNOWN records -- for t

Re: [DISCUSS] KIP-228 Negative record timestamp support

2018-01-02 Thread Ewen Cheslack-Postava
On Tue, Jan 2, 2018 at 8:04 PM, Matthias J. Sax wrote: > I was thinking about a broker/topic config. > > However, I am not sure if we only need to worry about data written in > the future (this would only be true, if there would be no records with > -1 timestamp already). Assume that we you an ex

Re: [DISCUSS] KIP-228 Negative record timestamp support

2018-01-16 Thread Konstantin Chukhlomin
Hi all, I tried to summarize below all approaches we were discussing. In case there are some difficulties in email formatting, see GoogleDocs: https://docs.google.com/document/d/1RjlcebpigOj9DyLCedyRxki9nZcFdkBchy-k7BiThMc/edit?usp=sharing

Re: [DISCUSS] KIP-228 Negative record timestamp support

2018-01-16 Thread Colin McCabe
I think we should just keep -1 as a special value, and allow negative timestamps. It just means one missing millisecond in 1969, right? There is just a lot of code everywhere checking for -1, and changing it now would be really likely to be buggy. We probably also want to limit the range of ti

Re: [DISCUSS] KIP-228 Negative record timestamp support

2018-02-07 Thread Guozhang Wang
TL.DR: I'd agree with Colin here. I admit that define -1 == unknown is not a good idea, and if we design the system from scratch we should not do so. But Colin's point is valid, that it has been treated like this in past versions, and I have seen producer client users doing so indeed. So changin

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Ted Yu
In KeepTimestampOnInvalidTimestamp, there should be check that timestamp is < 0. This would protect against future change to onInvalidTimestamp() callback. Wednesday, December 31, 1969 11:59:59 PM UTC was in the past. Can you enrich Motivation section on why the proposal is made (writing data gene

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Konstantin Chukhlomin
Hi Ted, Thank you for the response. I made a relevant changes to the KIP. > On Dec 5, 2017, at 11:59 AM, Ted Yu wrote: > > In KeepTimestampOnInvalidTimestamp, there should be check that timestamp is > < 0. > This would protect against future change to onInvalidTimestamp() callback. Not quite

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Ted Yu
In the diff you gave, onInvalidTimestamp() doesn't have any check. What if the timestamp corresponds to 5000 BC ? Is that still allowed ? Cheers On Tue, Dec 5, 2017 at 10:29 AM, Konstantin Chukhlomin wrote: > Hi Ted, > > Thank you for the response. > I made a relevant changes to the KIP. > > >

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Konstantin Chukhlomin
Yes, the point of that new class is to return same timestamp no mater if it's negative or not. And 5000 BC would be a valid timestamp. But I haven't tried to use streams with such historical data yet. > On Dec 5, 2017, at 3:02 PM, Ted Yu wrote: > > In the diff you gave, onInvalidTimestamp() doe

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Dong Lin
Hey Konstantin, Thanks for the KIP. I have a few questions below. Strictly speaking Kafka actually allows you to store historical data. And user are free to encode arbitrary timestamp field in their Kafka message. For example, your Kafka message can currently have Json or Avro format and you can

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Konstantin Chukhlomin
Hi Dong, Currently we are storing historical timestamp in the message. What we are trying to achieve is to make it possible to do Kafka lookup by timestamp. Ideally I would do `offsetsForTimes` to find articles published in 1910s (if we are storing articles on the log). So first two suggestion

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Matthias J. Sax
Thanks for the KIP Konstantin. From my understanding, you propose to just remove the negative timestamp check in KafkaProducer and KafkaStreams. If topics are configured with `CreateTime` brokers also write negative timestamps if they are embedded in the message. However, I am not sure about the

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Dong Lin
Hey Konstantin, According to KIP-32 the timestamp is also used for log rolling and log retention. Therefore, unless broker is configured to never delete any message based on time, messages produced with negative timestamp in your use-case will be deleted by the broker anyway. Do you actually plan

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Ted Yu
What if the negative timestamp is stored this way ? Long.MIN_VALUE + delta (where delta is positvie) and calculated this way when used: 1/1/1970 - delta This approach avoids the ambiguity of -1 timestamp since -1 would be stored as Long.MIN_VALUE+1 Log retention can handle such format with min

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Boerge Svingen
Yes. To provide a little more detail, we are using Kafka to store everything ever published by The New York Times, and to make this content available to a range of systems and applications. Assets are published to Kafka chronologically, so that consumers can seek to any point in time and start

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Dong Lin
Hey Boerge, Thanks for the blog link. I will read this blog later. Here is another alternative solution which may be worth thinking. We know that the Unix time 0 corresponds to January 1, 1970. Let's say the earliest time you may want to use as the timestamp of the Kafka message is within X milli

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Ted Yu
bq. you can add X to the timestamp before you produce Kafka message This assumes the earliest timestamp (for user application) is known beforehand. However, what if this earliest timestamp shifts even earlier (e.g. due to some discovery) ? Cheers On Tue, Dec 5, 2017 at 6:36 PM, Dong Lin wrote:

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Boerge Svingen
Thank you for the suggestion. We considered this before. It works, but it’s a hack, and we would be providing a bad user experience for our consumers if we had to explain, “if you want to start consuming in 2014, you have to pretend to want 2214”. We would rather solve the underlying problem.

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-05 Thread Dong Lin
Sounds good. I don't think there is concern with using Long.MIN_VALUE to indicate that timestamp is not available. As Matthias also mentioned, using Long.MIN_VALUE to indicate missing timestamp seems better than overloading -1 semantics. Do you want to update the "NO_TIMESTAMP (−1) problem" sessio

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-06 Thread Bill Bejeck
I'm getting to this a little late, but as for the missing timestamp semantics, it's a +1 from me for using Long.MIN_VALUE for missing timestamps for the reasons outlined by Matthias previously. Thanks, Bill On Wed, Dec 6, 2017 at 2:05 AM, Dong Lin wrote: > Sounds good. I don't think there is co

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-06 Thread Xavier Léauté
I agree with Matthias that keeping -1 special will be prone to errors. We should accept this is mistake resulting from lack of foresight on our part when adding timestamps in the first place and correct it. Using deltas will probably cause lots of headaches. It means we have to figure out the impl

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-06 Thread Ted Yu
bq. Using deltas will probably cause lots of headaches The downside of my initial proposal for using delta is that it didn't keep the natural ordering among negative timestamps. Here is variation of using delta : -1 is still kept as how it is treated today. The delta is calculated based on: 1/

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-07 Thread Konstantin Chukhlomin
Hi Matthias, Indeed for consumers it will be not obvious what −1 means: actual timestamp or no timestamp. Nevertheless, it's just −1 millisecond, so I thought it will be not a big deal to leave it (not clean, but acceptable). I agree that it will much cleaner to have a different type of topics

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-11 Thread Konstantin Chukhlomin
Hi all, I've updated KIP with few more details: Added (proposed) Changes in binary message format Added Changes from producer perspecti

Re: [DISCUSS] KIP-228 Negative record timestamp support

2017-12-12 Thread Dong Lin
Hey Konstantin, Thanks for updating the KIP. If we were to support negative timestamp in the message, we probably also want to support negative timestamp in ListOffsetRequest. Currently in ListOffsetRequest, timestamp value -2 is used to indicate earliest timestamp and timestamp value -1 is used