I did an experiment on searching messages using timestamps.

Step 1: Used a producer with CreateTime (CT).

Step 2: Verified whether it is reflected in Kafka:

00000000000000000000.index
00000000000000000000.log
00000000000000000000.timeindex

These three files are on disk, so the time index seems to be working.
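For anyone following along, here is a small, runnable model of the semantics that the .timeindex enables (per KIP-33): given sorted (timestamp, offset) entries, find the offset of the earliest entry whose timestamp is at or after a target timestamp. This is a simplified illustration, not Kafka's actual implementation; the class and method names are mine, and the entries are copied from the .timeindex dump below.

```java
/**
 * Simplified model of a Kafka time-index lookup (KIP-33 semantics):
 * return the offset of the earliest index entry whose timestamp is
 * >= the target timestamp, or -1 if no such entry exists.
 * Illustrative only -- not Kafka's actual code.
 */
public class TimeIndexLookup {

    public static long earliestOffsetAtOrAfter(long[][] entries, long targetTs) {
        // Binary search over entries sorted by timestamp; entries[i] = {timestamp, offset}.
        int lo = 0, hi = entries.length - 1, ans = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (entries[mid][0] >= targetTs) {
                ans = mid;      // candidate; keep looking for an earlier one
                hi = mid - 1;
            } else {
                lo = mid + 1;
            }
        }
        return ans == -1 ? -1 : entries[ans][1];
    }

    public static void main(String[] args) {
        // Entries copied from the .timeindex dump in this thread.
        long[][] idx = {
            {1495718846912L, 116L},
            {1495718886912L, 120L},
            {1495718926912L, 124L},
            {1495718966912L, 128L},
        };
        // First offset at or after timestamp 1495718850000 -> 120.
        System.out.println(earliestOffsetAtOrAfter(idx, 1495718850000L));
    }
}
```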
Step 3: Looked into the data:

offset: 121 position: 149556 CreateTime: 1495718896912 isvalid: true payloadsize: 1194 magic: 1 compresscodec: NONE crc: 1053048980 keysize: 8

Looks good.

Step 4: Checked the .timeindex file:

timestamp: 1495718846912 offset: 116
timestamp: 1495718886912 offset: 120
timestamp: 1495718926912 offset: 124
timestamp: 1495718966912 offset: 128

So all is set for querying data by timestamp?

Kafka version: kafka_2.11-0.10.2.1

Here is the code I'm using to search:
https://gist.github.com/senthilec566/bc8ed1dfcf493f0bb5c473c50854dff9

requestInfo.put(topicAndPartition, new PartitionOffsetRequestInfo(queryTime, 1));

If I pass my own timestamp, I always get zero results. The same question was asked here too:
https://stackoverflow.com/questions/31917134/how-to-use-unix-timestamp-to-get-offset-using-simpleconsumer-api

I also noticed the errors below in the index file:

Found timestamp mismatch in :/home/user/kafka-logs/topic-0/00000000000000000000.timeindex
  Index timestamp: 0, log timestamp: 1495717686913
Found out of order timestamp in :/home/user/kafka-logs/topic-0/00000000000000000000.timeindex
  Index timestamp: 0, Previously indexed timestamp: 1495719406912

Not sure what is missing here :-( ... Please advise!

Cheers,
Senthil

On Thu, May 25, 2017 at 3:36 PM, SenthilKumar K <senthilec...@gmail.com> wrote:

> Thanks a lot, Mayuresh. I will look into the SearchMessageByTimestamp
> feature in Kafka.
>
> Cheers,
> Senthil
>
> On Thu, May 25, 2017 at 1:12 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
>> Hi Senthil,
>>
>> Kafka does allow searching for messages by timestamp after KIP-33:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-Searchmessagebytimestamp
>>
>> The new consumer does provide you a way to get offsets by timestamp.
>> You can use these offsets to seek to that position and consume from
>> there. So if you want to consume a range, you can get the start and end
>> offsets based on the timestamps, seek to the start offset, then consume
>> and process the data until you reach the end offset.
>>
>> But these timestamps are either CreateTime (when the message was created;
>> you specify this when you call send()) or LogAppendTime (when the message
>> was appended to the log on the Kafka broker):
>> https://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html
>>
>> Kafka does not look at the fields in your data (key/value) when giving
>> the data back to you. What I meant was that it will not look at the
>> timestamp specified by you in the actual data payload.
>>
>> Thanks,
>>
>> Mayuresh
>>
>> On Thu, May 25, 2017 at 12:43 PM, SenthilKumar K <senthilec...@gmail.com>
>> wrote:
>>
>>> Hello Dev Team, please let me know if there is any option to read data
>>> from Kafka (all partitions) using a timestamp. Also, can we set a custom
>>> offset value for messages?
>>>
>>> Cheers,
>>> Senthil
>>>
>>> On Wed, May 24, 2017 at 7:33 PM, SenthilKumar K <senthilec...@gmail.com>
>>> wrote:
>>>
>>> > Hi All, we have been using Kafka for a use case that delivers
>>> > real-time raw logs. I have a requirement to fetch data from Kafka
>>> > by offset.
>>> >
>>> > DataSet example:
>>> > {"access_date":"2017-05-24 13:57:45.044","format":"json","start":"1490296463.031"}
>>> > {"access_date":"2017-05-24 13:57:46.044","format":"json","start":"1490296463.031"}
>>> > {"access_date":"2017-05-24 13:57:47.044","format":"json","start":"1490296463.031"}
>>> > {"access_date":"2017-05-24 13:58:02.042","format":"json","start":"1490296463.031"}
>>> >
>>> > The above JSON data is stored in Kafka.
>>> >
>>> > Key --> access_date in epoch format
>>> > Value --> the whole JSON.
>>> >
>>> > Data access pattern:
>>> > 1) Get me the last 2 minutes of data?
>>> > 2) Get me the records between 2017-05-24 13:57:42:00 and
>>> >    2017-05-24 13:57:44:00?
>>> >
>>> > How do we achieve this in Kafka?
>>> >
>>> > I tried using SimpleConsumer, but it expects a partition, and I'm not
>>> > sure SimpleConsumer matches our requirement...
>>> >
>>> > Appreciate your help!
>>> >
>>> > Cheers,
>>> > Senthil
>>> >
>>
>>
>>
>> --
>> -Regards,
>> Mayuresh R. Gharat
>> (862) 250-7125
>
>
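For the archive: the zero results above are most likely because the old SimpleConsumer OffsetRequest path (PartitionOffsetRequestInfo) resolves offsets only at log-segment granularity and does not use the .timeindex; KIP-33's search-by-timestamp is exposed through the new consumer's offsetsForTimes() (available with 0.10.1+ brokers and clients, so 0.10.2.1 qualifies). Below is a sketch of the approach Mayuresh describes: resolve the start offset from a timestamp, seek there, and consume until a record's timestamp passes the end of the range. The bootstrap server, topic name, and time range are placeholders, and it needs a running broker, so it is illustrative rather than a drop-in solution.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

public class SearchByTimestamp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "ts-search-demo");            // placeholder
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        long startTs = 1495718846912L;   // example range taken from this thread
        long endTs   = 1495718966912L;

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("topic", 0);  // placeholder topic
            consumer.assign(Collections.singletonList(tp));

            // Resolve the earliest offset whose record timestamp is >= startTs.
            Map<TopicPartition, OffsetAndTimestamp> result =
                    consumer.offsetsForTimes(Collections.singletonMap(tp, startTs));
            OffsetAndTimestamp start = result.get(tp);
            if (start == null) {
                return;   // no message at or after startTs in this partition
            }
            consumer.seek(tp, start.offset());

            // Consume until a record's timestamp passes endTs.
            // Note: if no record beyond endTs exists yet, this loop keeps
            // polling; a real implementation should add an idle timeout.
            boolean done = false;
            while (!done) {
                for (ConsumerRecord<String, String> rec : consumer.poll(1000)) {
                    if (rec.timestamp() > endTs) {
                        done = true;
                        break;
                    }
                    System.out.println(rec.offset() + " @ " + rec.timestamp()
                            + ": " + rec.value());
                }
            }
        }
    }
}
```

This covers both access patterns in the first message: "last 2 minutes" is offsetsForTimes with startTs = now minus two minutes, and a date range is a start/end timestamp pair as above. Note the timestamps compared here are the broker-visible CreateTime/LogAppendTime, not the access_date field inside the JSON payload.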