Furthermore, when looking at messages produced to the data1_log topic with
print.timestamp=true they all have CreateTime:-1 whereas messages produced
to the beats topic have valid timestamps. The producers that are sending to
data1_log are older than the producers that are sending to beats - if this
is where the broker takes the timestamps from, it explains why they are all
empty for data1_log.

On 26 October 2017 at 08:07, Dan Markhasin <minimi...@gmail.com> wrote:

> After a bit more checking it seems that Kafka isn't writing timestamps at
> all in the .timeindex file for the topics where offset rewind is not
> working.
>
> The following output is from * a different 0.11.0.0 * which also has a
> topic called data1_log (this cluster has not experienced any issues lately):
>
> bash-4.2$ ls -ltr | tail
> -rw-rw-r-- 1 ibiuser users 1073739867 Oct 24 20:42 00000000000723010092.log
> -rw-rw-r-- 1 ibiuser users     926216 Oct 24 20:42
> 00000000000723010092.index
> -rw-rw-r-- 1 ibiuser users         16 Oct 25 05:09 leader-epoch-checkpoint
> -rw-rw-r-- 1 ibiuser users 1073741239 Oct 25 09:52 00000000000725186047.log
> -rw-rw-r-- 1 ibiuser users   10485756 Oct 25 09:52
> 00000000000727395414.timeindex
> -rw-rw-r-- 1 ibiuser users         10 Oct 25 09:52
> 00000000000727395414.snapshot
> -rw-rw-r-- 1 ibiuser users          0 Oct 25 09:52
> 00000000000725186047.timeindex
> -rw-rw-r-- 1 ibiuser users     961144 Oct 25 09:52
> 00000000000725186047.index
> -rw-rw-r-- 1 ibiuser users   10485760 Oct 25 21:59
> 00000000000727395414.index
> -rw-rw-r-- 1 ibiuser users 1028809456 Oct 25 21:59 00000000000727395414.log
>
>
> bash-4.2$ /kafka/latest/bin/kafka-run-class.sh
> kafka.tools.DumpLogSegments --files 00000000000727395414.timeindex >
> /tmp/timestamps 2>&1
>
> The output looks like this:
>
> timestamp: 0 offset: 727395414
> timestamp: 0 offset: 727395414
> timestamp: 0 offset: 727395414
> timestamp: 0 offset: 727395414
> timestamp: 0 offset: 727395414
> timestamp: 0 offset: 727395414
> timestamp: 0 offset: 727395414
>
>
> The offset is the same on all of the records and not a single line has a
> timestamp greater than zero.
>
> bash-4.2$ cat /tmp/timestamps | grep timestamp | awk '{print $2}' | grep
> -v 0
> bash-4.2$
>
> For comparison, the .timeindex for the beats topic on the same cluster
> looks like this:
>
> Dumping 00000000000254410745.timeindex
> timestamp: 1508978010544 offset: 254410759
> timestamp: 1508978011084 offset: 254410763
> timestamp: 1508978012080 offset: 254410770
> timestamp: 1508978012789 offset: 254410796
> timestamp: 1508978013981 offset: 254410808
> timestamp: 1508978014526 offset: 254410812
> timestamp: 1508978014698 offset: 254410828
> timestamp: 1508978014959 offset: 254410834
> timestamp: 1508978015677 offset: 254410854
> timestamp: 1508978016980 offset: 254410870
> timestamp: 1508978017212 offset: 254410883
>
>
> On 26 October 2017 at 07:26, Elyahou Ittah <elyaho...@fiverr.com> wrote:
>
>> Which driver is used yo produce these messages ?
>>
>> On Oct 26, 2017 07:11, "Dan Markhasin" <minimi...@gmail.com> wrote:
>>
>> > No, that flag doesn't affect which offsets are returned, only executes
>> the
>> > action (and resets the consumer to latest offset when used, regardless
>> of
>> > datetime value I provide).
>> >
>> > On 25 October 2017 at 23:44, Hans Jespersen <h...@confluent.io> wrote:
>> >
>> > > I think you are just missing the —execute flag.
>> > >
>> > > -hans
>> > >
>> > > > On Oct 25, 2017, at 1:24 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> > > >
>> > > > I wonder if you have hit KAFKA-5600.
>> > > >
>> > > > Is it possible that you try out 0.11.0.1 ?
>> > > >
>> > > > Thanks
>> > > >
>> > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin <
>> minimi...@gmail.com>
>> > > wrote:
>> > > >>
>> > > >> I am using 0.11.0.0.
>> > > >>
>> > > >> There is no difference configuration-wise - both have 10 partitions
>> > and
>> > > 2
>> > > >> replicas. There are no errors in the logs, but looking in the data
>> > > folder
>> > > >> it seems like Kafka is not updating the timeindex file for
>> data1_log -
>> > > >> notice how the timeindex file for the current log segment is not
>> being
>> > > >> updated.
>> > > >>
>> > > >> bash-4.2$ pwd
>> > > >> /kafka/data/data1_log-1
>> > > >> bash-4.2$ ls -ltr | tail
>> > > >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21
>> > 00000000000337554984.log
>> > > >> -rw-rw-r-- 1 ibiuser it     943616 Oct 25 01:21
>> > > 00000000000337554984.index
>> > > >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38
>> > 00000000000339816017.log
>> > > >> -rw-rw-r-- 1 ibiuser it   10485756 Oct 25 13:38
>> > > >> 00000000000341934289.timeindex
>> > > >> -rw-rw-r-- 1 ibiuser it         10 Oct 25 13:38
>> > > >> 00000000000341934289.snapshot
>> > > >> -rw-rw-r-- 1 ibiuser it          0 Oct 25 13:38
>> > > >> 00000000000339816017.timeindex
>> > > >> -rw-rw-r-- 1 ibiuser it     566712 Oct 25 13:38
>> > > 00000000000339816017.index
>> > > >> -rw-rw-r-- 1 ibiuser it         17 Oct 25 20:23
>> > leader-epoch-checkpoint
>> > > >> -rw-rw-r-- 1 ibiuser it   10485760 Oct 25 23:03
>> > > 00000000000341934289.index
>> > > >> -rw-rw-r-- 1 ibiuser it  461590419 Oct 25 23:04
>> > 00000000000341934289.log
>> > > >>
>> > > >> For comparison, the beats topic:
>> > > >>
>> > > >> bash-4.2$ cd ../beats-1
>> > > >> bash-4.2$ ls -ltr
>> > > >> total 3212088
>> > > >> -rw-rw-r-- 1 ibiuser it         17 Oct 25 00:23
>> > leader-epoch-checkpoint
>> > > >> -rw-rw-r-- 1 ibiuser it         10 Oct 25 20:04
>> > > >> 00000000000188672034.snapshot
>> > > >> -rw-rw-r-- 1 ibiuser it    2773008 Oct 25 20:04
>> > > >> 00000000000185224087.timeindex
>> > > >> -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04
>> > 00000000000185224087.log
>> > > >> -rw-rw-r-- 1 ibiuser it    1967440 Oct 25 20:04
>> > > 00000000000185224087.index
>> > > >> -rw-rw-r-- 1 ibiuser it   10485760 Oct 25 23:03
>> > > 00000000000188672034.index
>> > > >> -rw-rw-r-- 1 ibiuser it   10485756 Oct 25 23:04
>> > > >> 00000000000188672034.timeindex
>> > > >> -rw-rw-r-- 1 ibiuser it   50166645 <5016%206645> Oct 25 23:04
>> > 00000000000188672034.log
>> > > >>
>> > > >>
>> > > >> To give some context to why I'm even trying to reset the offsets,
>> we
>> > had
>> > > >> encountered a strange situation earlier today:
>> > > >>
>> > > >> 1) One of the brokers had a hardware failure, and had to be rebuilt
>> > from
>> > > >> scratch (data partition was gone)
>> > > >> 2) When it went down, we noticed a spike in lag in one particular
>> > > consumer
>> > > >> group - it seems to have reset its offset to an earlier point in
>> time
>> > > (but
>> > > >> not the earliest offset of the topic); I have read other messages
>> on
>> > > this
>> > > >> mailing list of users who experienced the same behavior with
>> 0.11.0.0
>> > > >> 3) The broker was reinstalled and rejoined the cluster with the
>> same
>> > > >> broker.id (but with no data on it) - it rebalanced and eventually
>> all
>> > > >> replicas became synced and the cluster was functioning normally.
>> > > >> 4) I then decided to bounce the same broker again to see if I can
>> > > reproduce
>> > > >> the issue I saw in #2 - and as soon as the broker was restarted,
>> the
>> > > exact
>> > > >> same consumer group had its offset reset again and was lagging with
>> > > >> millions of records behind the current offset.
>> > > >> 5) I then tried to manually reset the consumer group's offset to a
>> few
>> > > >> minutes before I restarted the broker, only to discover this
>> strange
>> > > >> behavior where no matter which datetime value I provided, it kept
>> > > resetting
>> > > >> to the latest offset.
>> > > >>
>> > > >>
>> > > >>> On 25 October 2017 at 22:48, Ted Yu <yuzhih...@gmail.com> wrote:
>> > > >>>
>> > > >>> Do you mind providing a bit more information ?
>> > > >>>
>> > > >>> Release of Kafka you use
>> > > >>>
>> > > >>> Any difference between data1_log and the other, normal topic ?
>> > > >>>
>> > > >>> Probably check the broker log where data1_log is hosted - see if
>> > there
>> > > is
>> > > >>> some clue.
>> > > >>>
>> > > >>> Thanks
>> > > >>>
>> > > >>> On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin <
>> minimi...@gmail.com
>> > >
>> > > >>> wrote:
>> > > >>>
>> > > >>>> I'm trying to use the kafka-consumer-groups.sh tool in order to
>> > rewind
>> > > >> a
>> > > >>>> consumer group's offset, however it seems to be returning the
>> latest
>> > > >>> offset
>> > > >>>> regarding of the requested offset.
>> > > >>>>
>> > > >>>> You can see in the below example that two consecutive commands to
>> > > reset
>> > > >>> the
>> > > >>>> offset to a specific point in time return different (increasing)
>> > > >> offsets,
>> > > >>>> which are actually the latest offsets for the topic.
>> > > >>>>
>> > > >>>> - The consumer group ("test_consumer") is a console consumer that
>> > was
>> > > >>>> started with --from-beginning and terminated after a few seconds,
>> > just
>> > > >>>> enough for it to commit its offsets.
>> > > >>>> - The topic data1_log is very busy with thousands of incoming
>> > messages
>> > > >>> per
>> > > >>>> second
>> > > >>>> - The datetime value provided is approx. 5 hours earlier than the
>> > > >> current
>> > > >>>> UTC time
>> > > >>>>
>> > > >>>> [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh
>> > > >>>> --bootstrap-server localhost:9092 --reset-offsets --group
>> > > test_consumer
>> > > >>>> --topic data1_log --to-datetime '2017-10-25T13:40:00.000'
>> > > >>>> Note: This will only show information about consumers that use
>> the
>> > > Java
>> > > >>>> consumer API (non-ZooKeeper-based consumers).
>> > > >>>>
>> > > >>>>
>> > > >>>> TOPIC                          PARTITION  NEW-OFFSET
>> > > >>>> data1_log                      8          301485420
>> > > >>>> data1_log                      1          342788637
>> > > >>>> data1_log                      7          287621428
>> > > >>>> data1_log                      3          268612266
>> > > >>>> data1_log                      0          201860717
>> > > >>>> data1_log                      9          202749553
>> > > >>>> data1_log                      4          188974032
>> > > >>>> data1_log                      6          234308481
>> > > >>>> data1_log                      2          263507741
>> > > >>>> data1_log                      5          232707238
>> > > >>>>
>> > > >>>> [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh
>> > > >>>> --bootstrap-server localhost:9092 --reset-offsets --group
>> > > test_consumer
>> > > >>>> --topic data1_log --to-datetime '2017-10-25T13:40:00.000'
>> > > >>>> Note: This will only show information about consumers that use
>> the
>> > > Java
>> > > >>>> consumer API (non-ZooKeeper-based consumers).
>> > > >>>>
>> > > >>>>
>> > > >>>> TOPIC                          PARTITION  NEW-OFFSET
>> > > >>>> data1_log                      8          301485491
>> > > >>>> data1_log                      1          342788779
>> > > >>>> data1_log                      7          287621534
>> > > >>>> data1_log                      3          268612364
>> > > >>>> data1_log                      0          201860796
>> > > >>>> data1_log                      9          202749620
>> > > >>>> data1_log                      4          188974068
>> > > >>>> data1_log                      6          234308564
>> > > >>>> data1_log                      2          263507823
>> > > >>>> data1_log                      5          232707293
>> > > >>>>
>> > > >>>> This issue seems to be topic-specific - there is a different
>> topic
>> > > >> (also
>> > > >>>> very active) where the same command consistently returns the
>> correct
>> > > >>>> offsets fixed in the time for the requested datetime.
>> > > >>>>
>> > > >>>> What could be the issue here?
>> > > >>>>
>> > > >>>> Thanks,
>> > > >>>> Dan
>> > > >>>>
>> > > >>>
>> > > >>
>> > >
>> >
>>
>
>

Reply via email to