Re: Failure to reset consumer offsets for specific topics
Not really.. there are dozens (over 50) producer instances using librdkafka, it is not feasible to update them all. I set log.message.timestamp.type to LogAppendTime in server.properties and it seems to be working (at least --reset-offsets returns the same offset for the same datetime for partitions where the leader is a broker with the new setting applied), but I won't be able to truly validate before I upgrade the entire cluster to 0.11.0.1 and apply this setting to all brokers. Is there a reason LogAppendTime is not the default setting for this? On 26 October 2017 at 11:42, Ted Yu wrote: > Can you update librdkafka and try again ? > Thanks > Original message From: Dan Markhasin > Date: 10/25/17 11:26 PM (GMT-08:00) To: users@kafka.apache.org Subject: > Re: Failure to reset consumer offsets for specific topics > For beats (the topic where timestamps are OK) the producers are Metricbeat > / Logstash which I assume use a relatively recent producer version. > For the topics with missing timestamps the producers are based on > librdkafka though I'm not sure which version exactly - I wouldn't be > surprised if it's pretty old and doesn't set timestamp on the messages it > produces. > > On 26 October 2017 at 08:17, Elyahou Ittah wrote: > > > Which driver is used yo produce these messages ? > > > > On Thu, Oct 26, 2017 at 8:14 AM, Dan Markhasin > > wrote: > > > > > Furthermore, when looking at messages produced to the data1_log topic > > with > > > print.timestamp=true they all have CreateTime:-1 whereas messages > > produced > > > to the beats topic have valid timestamps. The producers that are > sending > > to > > > data1_log are older than the producers that are sending to beats - if > > this > > > is where the broker takes the timestamps from, it explains why they are > > all > > > empty for data1_log. > > > > > > On 26 October 2017 at 08:07, Dan Markhasin > wrote: > > > > > > > After a bit more checking it seems that Kafka isn't writing > timestamps > > at > > > > all in the .timeindex file for the topics where offset rewind is not > > > > working. > > > > > > > > The following output is from * a different 0.11.0.0 * which also has > a > > > > topic called data1_log (this cluster has not experienced any issues > > > lately): > > > > > > > > bash-4.2$ ls -ltr | tail > > > > -rw-rw-r-- 1 ibiuser users 1073739867 Oct 24 20:42 > > > 000723010092.log > > > > -rw-rw-r-- 1 ibiuser users 926216 Oct 24 20:42 > > > > 000723010092.index > > > > -rw-rw-r-- 1 ibiuser users 16 Oct 25 05:09 > > > leader-epoch-checkpoint > > > > -rw-rw-r-- 1 ibiuser users 1073741239 Oct 25 09:52 > > > 000725186047.log > > > > -rw-rw-r-- 1 ibiuser users 10485756 Oct 25 09:52 > > > > 000727395414.timeindex > > > > -rw-rw-r-- 1 ibiuser users 10 Oct 25 09:52 > > > > 000727395414.snapshot > > > > -rw-rw-r-- 1 ibiuser users 0 Oct 25 09:52 > > > > 000725186047.timeindex > > > > -rw-rw-r-- 1 ibiuser users 961144 Oct 25 09:52 > > > > 000725186047.index > > > > -rw-rw-r-- 1 ibiuser users 10485760 Oct 25 21:59 > > > > 000727395414.index > > > > -rw-rw-r-- 1 ibiuser users 1028809456 Oct 25 21:59 > > > 000727395414.log > > > > > > > > > > > > bash-4.2$ /kafka/latest/bin/kafka-run-class.sh > > > > kafka.tools.DumpLogSegments --files 000727395414.timeindex > > > > > /tmp/timestamps 2>&1 > > > > > > > > The output looks like this: > > > > > > > > timestamp: 0 offset: 727395414 > > > > timestamp: 0 offset: 727395414 > > > > timestamp: 0 offset: 727395414 > > > > timestamp: 0 offset: 727395414 > > > > timestamp: 0 offset: 727395414 > > > > timestamp: 0 offset: 727395414 > > > > timestamp: 0 offset: 727395414 > > > > > > > > > > > > The offset is the same on all of the records and not a single line > has > > a > > > > timestamp greater than zero. > > > > > > > > bash-4.2$ cat /tmp/timestamps | grep timestamp | awk '{print $2}' | > > grep > > > > -v 0 > > > > bash-4.2$ > > > > > > > > For comparison, the .timeindex for the beats to
Re: Failure to reset consumer offsets for specific topics
Can you update librdkafka and try again ? Thanks Original message From: Dan Markhasin Date: 10/25/17 11:26 PM (GMT-08:00) To: users@kafka.apache.org Subject: Re: Failure to reset consumer offsets for specific topics For beats (the topic where timestamps are OK) the producers are Metricbeat / Logstash which I assume use a relatively recent producer version. For the topics with missing timestamps the producers are based on librdkafka though I'm not sure which version exactly - I wouldn't be surprised if it's pretty old and doesn't set timestamp on the messages it produces. On 26 October 2017 at 08:17, Elyahou Ittah wrote: > Which driver is used yo produce these messages ? > > On Thu, Oct 26, 2017 at 8:14 AM, Dan Markhasin > wrote: > > > Furthermore, when looking at messages produced to the data1_log topic > with > > print.timestamp=true they all have CreateTime:-1 whereas messages > produced > > to the beats topic have valid timestamps. The producers that are sending > to > > data1_log are older than the producers that are sending to beats - if > this > > is where the broker takes the timestamps from, it explains why they are > all > > empty for data1_log. > > > > On 26 October 2017 at 08:07, Dan Markhasin wrote: > > > > > After a bit more checking it seems that Kafka isn't writing timestamps > at > > > all in the .timeindex file for the topics where offset rewind is not > > > working. > > > > > > The following output is from * a different 0.11.0.0 * which also has a > > > topic called data1_log (this cluster has not experienced any issues > > lately): > > > > > > bash-4.2$ ls -ltr | tail > > > -rw-rw-r-- 1 ibiuser users 1073739867 Oct 24 20:42 > > 000723010092.log > > > -rw-rw-r-- 1 ibiuser users 926216 Oct 24 20:42 > > > 000723010092.index > > > -rw-rw-r-- 1 ibiuser users 16 Oct 25 05:09 > > leader-epoch-checkpoint > > > -rw-rw-r-- 1 ibiuser users 1073741239 Oct 25 09:52 > > 000725186047.log > > > -rw-rw-r-- 1 ibiuser users 10485756 Oct 25 09:52 > > > 000727395414.timeindex > > > -rw-rw-r-- 1 ibiuser users 10 Oct 25 09:52 > > > 000727395414.snapshot > > > -rw-rw-r-- 1 ibiuser users 0 Oct 25 09:52 > > > 000725186047.timeindex > > > -rw-rw-r-- 1 ibiuser users 961144 Oct 25 09:52 > > > 000725186047.index > > > -rw-rw-r-- 1 ibiuser users 10485760 Oct 25 21:59 > > > 000727395414.index > > > -rw-rw-r-- 1 ibiuser users 1028809456 Oct 25 21:59 > > 000727395414.log > > > > > > > > > bash-4.2$ /kafka/latest/bin/kafka-run-class.sh > > > kafka.tools.DumpLogSegments --files 000727395414.timeindex > > > > /tmp/timestamps 2>&1 > > > > > > The output looks like this: > > > > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > > > > > > > The offset is the same on all of the records and not a single line has > a > > > timestamp greater than zero. > > > > > > bash-4.2$ cat /tmp/timestamps | grep timestamp | awk '{print $2}' | > grep > > > -v 0 > > > bash-4.2$ > > > > > > For comparison, the .timeindex for the beats topic on the same cluster > > > looks like this: > > > > > > Dumping 000254410745.timeindex > > > timestamp: 1508978010544 offset: 254410759 > > > timestamp: 1508978011084 offset: 254410763 > > > timestamp: 1508978012080 offset: 254410770 > > > timestamp: 1508978012789 offset: 254410796 > > > timestamp: 1508978013981 offset: 254410808 > > > timestamp: 1508978014526 offset: 254410812 > > > timestamp: 1508978014698 offset: 254410828 > > > timestamp: 1508978014959 offset: 254410834 > > > timestamp: 1508978015677 offset: 254410854 > > > timestamp: 1508978016980 offset: 254410870 > > > timestamp: 1508978017212 offset: 254410883 > > > > > > > > > On 26 October 2017 at 07:26, Elyahou Ittah > wrote: > > > > > >> Which driver is used yo produce these messages ? > > >> > > >> On Oct 26, 2017 07:11, "Dan Markhasin" wrote: > > >> > > >&
Re: Failure to reset consumer offsets for specific topics
For beats (the topic where timestamps are OK) the producers are Metricbeat / Logstash which I assume use a relatively recent producer version. For the topics with missing timestamps the producers are based on librdkafka though I'm not sure which version exactly - I wouldn't be surprised if it's pretty old and doesn't set timestamp on the messages it produces. On 26 October 2017 at 08:17, Elyahou Ittah wrote: > Which driver is used yo produce these messages ? > > On Thu, Oct 26, 2017 at 8:14 AM, Dan Markhasin > wrote: > > > Furthermore, when looking at messages produced to the data1_log topic > with > > print.timestamp=true they all have CreateTime:-1 whereas messages > produced > > to the beats topic have valid timestamps. The producers that are sending > to > > data1_log are older than the producers that are sending to beats - if > this > > is where the broker takes the timestamps from, it explains why they are > all > > empty for data1_log. > > > > On 26 October 2017 at 08:07, Dan Markhasin wrote: > > > > > After a bit more checking it seems that Kafka isn't writing timestamps > at > > > all in the .timeindex file for the topics where offset rewind is not > > > working. > > > > > > The following output is from * a different 0.11.0.0 * which also has a > > > topic called data1_log (this cluster has not experienced any issues > > lately): > > > > > > bash-4.2$ ls -ltr | tail > > > -rw-rw-r-- 1 ibiuser users 1073739867 Oct 24 20:42 > > 000723010092.log > > > -rw-rw-r-- 1 ibiuser users 926216 Oct 24 20:42 > > > 000723010092.index > > > -rw-rw-r-- 1 ibiuser users 16 Oct 25 05:09 > > leader-epoch-checkpoint > > > -rw-rw-r-- 1 ibiuser users 1073741239 Oct 25 09:52 > > 000725186047.log > > > -rw-rw-r-- 1 ibiuser users 10485756 Oct 25 09:52 > > > 000727395414.timeindex > > > -rw-rw-r-- 1 ibiuser users 10 Oct 25 09:52 > > > 000727395414.snapshot > > > -rw-rw-r-- 1 ibiuser users 0 Oct 25 09:52 > > > 000725186047.timeindex > > > -rw-rw-r-- 1 ibiuser users 961144 Oct 25 09:52 > > > 000725186047.index > > > -rw-rw-r-- 1 ibiuser users 10485760 Oct 25 21:59 > > > 000727395414.index > > > -rw-rw-r-- 1 ibiuser users 1028809456 Oct 25 21:59 > > 000727395414.log > > > > > > > > > bash-4.2$ /kafka/latest/bin/kafka-run-class.sh > > > kafka.tools.DumpLogSegments --files 000727395414.timeindex > > > > /tmp/timestamps 2>&1 > > > > > > The output looks like this: > > > > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > timestamp: 0 offset: 727395414 > > > > > > > > > The offset is the same on all of the records and not a single line has > a > > > timestamp greater than zero. > > > > > > bash-4.2$ cat /tmp/timestamps | grep timestamp | awk '{print $2}' | > grep > > > -v 0 > > > bash-4.2$ > > > > > > For comparison, the .timeindex for the beats topic on the same cluster > > > looks like this: > > > > > > Dumping 000254410745.timeindex > > > timestamp: 1508978010544 offset: 254410759 > > > timestamp: 1508978011084 offset: 254410763 > > > timestamp: 1508978012080 offset: 254410770 > > > timestamp: 1508978012789 offset: 254410796 > > > timestamp: 1508978013981 offset: 254410808 > > > timestamp: 1508978014526 offset: 254410812 > > > timestamp: 1508978014698 offset: 254410828 > > > timestamp: 1508978014959 offset: 254410834 > > > timestamp: 1508978015677 offset: 254410854 > > > timestamp: 1508978016980 offset: 254410870 > > > timestamp: 1508978017212 offset: 254410883 > > > > > > > > > On 26 October 2017 at 07:26, Elyahou Ittah > wrote: > > > > > >> Which driver is used yo produce these messages ? > > >> > > >> On Oct 26, 2017 07:11, "Dan Markhasin" wrote: > > >> > > >> > No, that flag doesn't affect which offsets are returned, only > executes > > >> the > > >> > action (and resets the consumer to latest offset when used, > regardless > > >> of > > >> > datetime value I provide). > > >> > > > >> > On 25 October 2017 at 23:44, Hans Jespersen > > wrote: > > >> > > > >> > > I think you are just missing the —execute flag. > > >> > > > > >> > > -hans > > >> > > > > >> > > > On Oct 25, 2017, at 1:24 PM, Ted Yu > wrote: > > >> > > > > > >> > > > I wonder if you have hit KAFKA-5600. > > >> > > > > > >> > > > Is it possible that you try out 0.11.0.1 ? > > >> > > > > > >> > > > Thanks > > >> > > > > > >> > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin < > > >> minimi...@gmail.com> > > >> > > wrote: > > >> > > >> > > >> > > >> I am using 0.11.0.0. > > >> > > >> > > >> > > >> There is no difference configuration-wise - both have 10 > > partitions > > >> > and > > >> > > 2 > > >> > > >> replicas. There are no errors in the logs, but looking in the > > data > > >> > > folder > > >> > > >> it seems like Kafka is not updating the timeindex f
Re: Failure to reset consumer offsets for specific topics
Which driver is used yo produce these messages ? On Thu, Oct 26, 2017 at 8:14 AM, Dan Markhasin wrote: > Furthermore, when looking at messages produced to the data1_log topic with > print.timestamp=true they all have CreateTime:-1 whereas messages produced > to the beats topic have valid timestamps. The producers that are sending to > data1_log are older than the producers that are sending to beats - if this > is where the broker takes the timestamps from, it explains why they are all > empty for data1_log. > > On 26 October 2017 at 08:07, Dan Markhasin wrote: > > > After a bit more checking it seems that Kafka isn't writing timestamps at > > all in the .timeindex file for the topics where offset rewind is not > > working. > > > > The following output is from * a different 0.11.0.0 * which also has a > > topic called data1_log (this cluster has not experienced any issues > lately): > > > > bash-4.2$ ls -ltr | tail > > -rw-rw-r-- 1 ibiuser users 1073739867 Oct 24 20:42 > 000723010092.log > > -rw-rw-r-- 1 ibiuser users 926216 Oct 24 20:42 > > 000723010092.index > > -rw-rw-r-- 1 ibiuser users 16 Oct 25 05:09 > leader-epoch-checkpoint > > -rw-rw-r-- 1 ibiuser users 1073741239 Oct 25 09:52 > 000725186047.log > > -rw-rw-r-- 1 ibiuser users 10485756 Oct 25 09:52 > > 000727395414.timeindex > > -rw-rw-r-- 1 ibiuser users 10 Oct 25 09:52 > > 000727395414.snapshot > > -rw-rw-r-- 1 ibiuser users 0 Oct 25 09:52 > > 000725186047.timeindex > > -rw-rw-r-- 1 ibiuser users 961144 Oct 25 09:52 > > 000725186047.index > > -rw-rw-r-- 1 ibiuser users 10485760 Oct 25 21:59 > > 000727395414.index > > -rw-rw-r-- 1 ibiuser users 1028809456 Oct 25 21:59 > 000727395414.log > > > > > > bash-4.2$ /kafka/latest/bin/kafka-run-class.sh > > kafka.tools.DumpLogSegments --files 000727395414.timeindex > > > /tmp/timestamps 2>&1 > > > > The output looks like this: > > > > timestamp: 0 offset: 727395414 > > timestamp: 0 offset: 727395414 > > timestamp: 0 offset: 727395414 > > timestamp: 0 offset: 727395414 > > timestamp: 0 offset: 727395414 > > timestamp: 0 offset: 727395414 > > timestamp: 0 offset: 727395414 > > > > > > The offset is the same on all of the records and not a single line has a > > timestamp greater than zero. > > > > bash-4.2$ cat /tmp/timestamps | grep timestamp | awk '{print $2}' | grep > > -v 0 > > bash-4.2$ > > > > For comparison, the .timeindex for the beats topic on the same cluster > > looks like this: > > > > Dumping 000254410745.timeindex > > timestamp: 1508978010544 offset: 254410759 > > timestamp: 1508978011084 offset: 254410763 > > timestamp: 1508978012080 offset: 254410770 > > timestamp: 1508978012789 offset: 254410796 > > timestamp: 1508978013981 offset: 254410808 > > timestamp: 1508978014526 offset: 254410812 > > timestamp: 1508978014698 offset: 254410828 > > timestamp: 1508978014959 offset: 254410834 > > timestamp: 1508978015677 offset: 254410854 > > timestamp: 1508978016980 offset: 254410870 > > timestamp: 1508978017212 offset: 254410883 > > > > > > On 26 October 2017 at 07:26, Elyahou Ittah wrote: > > > >> Which driver is used yo produce these messages ? > >> > >> On Oct 26, 2017 07:11, "Dan Markhasin" wrote: > >> > >> > No, that flag doesn't affect which offsets are returned, only executes > >> the > >> > action (and resets the consumer to latest offset when used, regardless > >> of > >> > datetime value I provide). > >> > > >> > On 25 October 2017 at 23:44, Hans Jespersen > wrote: > >> > > >> > > I think you are just missing the —execute flag. > >> > > > >> > > -hans > >> > > > >> > > > On Oct 25, 2017, at 1:24 PM, Ted Yu wrote: > >> > > > > >> > > > I wonder if you have hit KAFKA-5600. > >> > > > > >> > > > Is it possible that you try out 0.11.0.1 ? > >> > > > > >> > > > Thanks > >> > > > > >> > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin < > >> minimi...@gmail.com> > >> > > wrote: > >> > > >> > >> > > >> I am using 0.11.0.0. > >> > > >> > >> > > >> There is no difference configuration-wise - both have 10 > partitions > >> > and > >> > > 2 > >> > > >> replicas. There are no errors in the logs, but looking in the > data > >> > > folder > >> > > >> it seems like Kafka is not updating the timeindex file for > >> data1_log - > >> > > >> notice how the timeindex file for the current log segment is not > >> being > >> > > >> updated. > >> > > >> > >> > > >> bash-4.2$ pwd > >> > > >> /kafka/data/data1_log-1 > >> > > >> bash-4.2$ ls -ltr | tail > >> > > >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 > >> > 000337554984.log > >> > > >> -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 > >> > > 000337554984.index > >> > > >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 > >> > 000339816017.log > >> > > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 > >> > > >> 000341934289.timeindex > >> > > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38
Re: Failure to reset consumer offsets for specific topics
Furthermore, when looking at messages produced to the data1_log topic with print.timestamp=true they all have CreateTime:-1 whereas messages produced to the beats topic have valid timestamps. The producers that are sending to data1_log are older than the producers that are sending to beats - if this is where the broker takes the timestamps from, it explains why they are all empty for data1_log. On 26 October 2017 at 08:07, Dan Markhasin wrote: > After a bit more checking it seems that Kafka isn't writing timestamps at > all in the .timeindex file for the topics where offset rewind is not > working. > > The following output is from * a different 0.11.0.0 * which also has a > topic called data1_log (this cluster has not experienced any issues lately): > > bash-4.2$ ls -ltr | tail > -rw-rw-r-- 1 ibiuser users 1073739867 Oct 24 20:42 000723010092.log > -rw-rw-r-- 1 ibiuser users 926216 Oct 24 20:42 > 000723010092.index > -rw-rw-r-- 1 ibiuser users 16 Oct 25 05:09 leader-epoch-checkpoint > -rw-rw-r-- 1 ibiuser users 1073741239 Oct 25 09:52 000725186047.log > -rw-rw-r-- 1 ibiuser users 10485756 Oct 25 09:52 > 000727395414.timeindex > -rw-rw-r-- 1 ibiuser users 10 Oct 25 09:52 > 000727395414.snapshot > -rw-rw-r-- 1 ibiuser users 0 Oct 25 09:52 > 000725186047.timeindex > -rw-rw-r-- 1 ibiuser users 961144 Oct 25 09:52 > 000725186047.index > -rw-rw-r-- 1 ibiuser users 10485760 Oct 25 21:59 > 000727395414.index > -rw-rw-r-- 1 ibiuser users 1028809456 Oct 25 21:59 000727395414.log > > > bash-4.2$ /kafka/latest/bin/kafka-run-class.sh > kafka.tools.DumpLogSegments --files 000727395414.timeindex > > /tmp/timestamps 2>&1 > > The output looks like this: > > timestamp: 0 offset: 727395414 > timestamp: 0 offset: 727395414 > timestamp: 0 offset: 727395414 > timestamp: 0 offset: 727395414 > timestamp: 0 offset: 727395414 > timestamp: 0 offset: 727395414 > timestamp: 0 offset: 727395414 > > > The offset is the same on all of the records and not a single line has a > timestamp greater than zero. > > bash-4.2$ cat /tmp/timestamps | grep timestamp | awk '{print $2}' | grep > -v 0 > bash-4.2$ > > For comparison, the .timeindex for the beats topic on the same cluster > looks like this: > > Dumping 000254410745.timeindex > timestamp: 1508978010544 offset: 254410759 > timestamp: 1508978011084 offset: 254410763 > timestamp: 1508978012080 offset: 254410770 > timestamp: 1508978012789 offset: 254410796 > timestamp: 1508978013981 offset: 254410808 > timestamp: 1508978014526 offset: 254410812 > timestamp: 1508978014698 offset: 254410828 > timestamp: 1508978014959 offset: 254410834 > timestamp: 1508978015677 offset: 254410854 > timestamp: 1508978016980 offset: 254410870 > timestamp: 1508978017212 offset: 254410883 > > > On 26 October 2017 at 07:26, Elyahou Ittah wrote: > >> Which driver is used yo produce these messages ? >> >> On Oct 26, 2017 07:11, "Dan Markhasin" wrote: >> >> > No, that flag doesn't affect which offsets are returned, only executes >> the >> > action (and resets the consumer to latest offset when used, regardless >> of >> > datetime value I provide). >> > >> > On 25 October 2017 at 23:44, Hans Jespersen wrote: >> > >> > > I think you are just missing the —execute flag. >> > > >> > > -hans >> > > >> > > > On Oct 25, 2017, at 1:24 PM, Ted Yu wrote: >> > > > >> > > > I wonder if you have hit KAFKA-5600. >> > > > >> > > > Is it possible that you try out 0.11.0.1 ? >> > > > >> > > > Thanks >> > > > >> > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin < >> minimi...@gmail.com> >> > > wrote: >> > > >> >> > > >> I am using 0.11.0.0. >> > > >> >> > > >> There is no difference configuration-wise - both have 10 partitions >> > and >> > > 2 >> > > >> replicas. There are no errors in the logs, but looking in the data >> > > folder >> > > >> it seems like Kafka is not updating the timeindex file for >> data1_log - >> > > >> notice how the timeindex file for the current log segment is not >> being >> > > >> updated. >> > > >> >> > > >> bash-4.2$ pwd >> > > >> /kafka/data/data1_log-1 >> > > >> bash-4.2$ ls -ltr | tail >> > > >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 >> > 000337554984.log >> > > >> -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 >> > > 000337554984.index >> > > >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 >> > 000339816017.log >> > > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 >> > > >> 000341934289.timeindex >> > > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 >> > > >> 000341934289.snapshot >> > > >> -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 >> > > >> 000339816017.timeindex >> > > >> -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 >> > > 000339816017.index >> > > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 >> > leader-epoch-checkpoint >> > > >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 >
Re: Failure to reset consumer offsets for specific topics
After a bit more checking it seems that Kafka isn't writing timestamps at all in the .timeindex file for the topics where offset rewind is not working. The following output is from * a different 0.11.0.0 * which also has a topic called data1_log (this cluster has not experienced any issues lately): bash-4.2$ ls -ltr | tail -rw-rw-r-- 1 ibiuser users 1073739867 Oct 24 20:42 000723010092.log -rw-rw-r-- 1 ibiuser users 926216 Oct 24 20:42 000723010092.index -rw-rw-r-- 1 ibiuser users 16 Oct 25 05:09 leader-epoch-checkpoint -rw-rw-r-- 1 ibiuser users 1073741239 Oct 25 09:52 000725186047.log -rw-rw-r-- 1 ibiuser users 10485756 Oct 25 09:52 000727395414.timeindex -rw-rw-r-- 1 ibiuser users 10 Oct 25 09:52 000727395414.snapshot -rw-rw-r-- 1 ibiuser users 0 Oct 25 09:52 000725186047.timeindex -rw-rw-r-- 1 ibiuser users 961144 Oct 25 09:52 000725186047.index -rw-rw-r-- 1 ibiuser users 10485760 Oct 25 21:59 000727395414.index -rw-rw-r-- 1 ibiuser users 1028809456 Oct 25 21:59 000727395414.log bash-4.2$ /kafka/latest/bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files 000727395414.timeindex > /tmp/timestamps 2>&1 The output looks like this: timestamp: 0 offset: 727395414 timestamp: 0 offset: 727395414 timestamp: 0 offset: 727395414 timestamp: 0 offset: 727395414 timestamp: 0 offset: 727395414 timestamp: 0 offset: 727395414 timestamp: 0 offset: 727395414 The offset is the same on all of the records and not a single line has a timestamp greater than zero. bash-4.2$ cat /tmp/timestamps | grep timestamp | awk '{print $2}' | grep -v 0 bash-4.2$ For comparison, the .timeindex for the beats topic on the same cluster looks like this: Dumping 000254410745.timeindex timestamp: 1508978010544 offset: 254410759 timestamp: 1508978011084 offset: 254410763 timestamp: 1508978012080 offset: 254410770 timestamp: 1508978012789 offset: 254410796 timestamp: 1508978013981 offset: 254410808 timestamp: 1508978014526 offset: 254410812 timestamp: 1508978014698 offset: 254410828 timestamp: 1508978014959 offset: 254410834 timestamp: 1508978015677 offset: 254410854 timestamp: 1508978016980 offset: 254410870 timestamp: 1508978017212 offset: 254410883 On 26 October 2017 at 07:26, Elyahou Ittah wrote: > Which driver is used yo produce these messages ? > > On Oct 26, 2017 07:11, "Dan Markhasin" wrote: > > > No, that flag doesn't affect which offsets are returned, only executes > the > > action (and resets the consumer to latest offset when used, regardless of > > datetime value I provide). > > > > On 25 October 2017 at 23:44, Hans Jespersen wrote: > > > > > I think you are just missing the —execute flag. > > > > > > -hans > > > > > > > On Oct 25, 2017, at 1:24 PM, Ted Yu wrote: > > > > > > > > I wonder if you have hit KAFKA-5600. > > > > > > > > Is it possible that you try out 0.11.0.1 ? > > > > > > > > Thanks > > > > > > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin > > > > wrote: > > > >> > > > >> I am using 0.11.0.0. > > > >> > > > >> There is no difference configuration-wise - both have 10 partitions > > and > > > 2 > > > >> replicas. There are no errors in the logs, but looking in the data > > > folder > > > >> it seems like Kafka is not updating the timeindex file for > data1_log - > > > >> notice how the timeindex file for the current log segment is not > being > > > >> updated. > > > >> > > > >> bash-4.2$ pwd > > > >> /kafka/data/data1_log-1 > > > >> bash-4.2$ ls -ltr | tail > > > >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 > > 000337554984.log > > > >> -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 > > > 000337554984.index > > > >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 > > 000339816017.log > > > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 > > > >> 000341934289.timeindex > > > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 > > > >> 000341934289.snapshot > > > >> -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 > > > >> 000339816017.timeindex > > > >> -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 > > > 000339816017.index > > > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 > > leader-epoch-checkpoint > > > >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 > > > 000341934289.index > > > >> -rw-rw-r-- 1 ibiuser it 461590419 Oct 25 23:04 > > 000341934289.log > > > >> > > > >> For comparison, the beats topic: > > > >> > > > >> bash-4.2$ cd ../beats-1 > > > >> bash-4.2$ ls -ltr > > > >> total 3212088 > > > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 00:23 > > leader-epoch-checkpoint > > > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 20:04 > > > >> 000188672034.snapshot > > > >> -rw-rw-r-- 1 ibiuser it2773008 Oct 25 20:04 > > > >> 000185224087.timeindex > > > >> -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04 > > 000185224087.log > > > >> -rw-rw-r-- 1 ibiuser i
Re: Failure to reset consumer offsets for specific topics
Any idea why I started receiving mails from this list as of 2:43 am today? I didn't make any changes or subscribe to anything. I even clicked unsubscribe earlier today and am still receiving mails. Maybe there's a misconfiguration in your email list? Kelly On Wed, Oct 25, 2017 at 1:44 PM, Hans Jespersen wrote: > I think you are just missing the —execute flag. > > -hans > > > On Oct 25, 2017, at 1:24 PM, Ted Yu wrote: > > > > I wonder if you have hit KAFKA-5600. > > > > Is it possible that you try out 0.11.0.1 ? > > > > Thanks > > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin > wrote: > >> > >> I am using 0.11.0.0. > >> > >> There is no difference configuration-wise - both have 10 partitions and > 2 > >> replicas. There are no errors in the logs, but looking in the data > folder > >> it seems like Kafka is not updating the timeindex file for data1_log - > >> notice how the timeindex file for the current log segment is not being > >> updated. > >> > >> bash-4.2$ pwd > >> /kafka/data/data1_log-1 > >> bash-4.2$ ls -ltr | tail > >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 000337554984.log > >> -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 > 000337554984.index > >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 000339816017.log > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 > >> 000341934289.timeindex > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 > >> 000341934289.snapshot > >> -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 > >> 000339816017.timeindex > >> -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 > 000339816017.index > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 leader-epoch-checkpoint > >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 > 000341934289.index > >> -rw-rw-r-- 1 ibiuser it 461590419 Oct 25 23:04 000341934289.log > >> > >> For comparison, the beats topic: > >> > >> bash-4.2$ cd ../beats-1 > >> bash-4.2$ ls -ltr > >> total 3212088 > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 00:23 leader-epoch-checkpoint > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 20:04 > >> 000188672034.snapshot > >> -rw-rw-r-- 1 ibiuser it2773008 Oct 25 20:04 > >> 000185224087.timeindex > >> -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04 000185224087.log > >> -rw-rw-r-- 1 ibiuser it1967440 Oct 25 20:04 > 000185224087.index > >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 > 000188672034.index > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 23:04 > >> 000188672034.timeindex > >> -rw-rw-r-- 1 ibiuser it 50166645 Oct 25 23:04 000188672034.log > >> > >> > >> To give some context to why I'm even trying to reset the offsets, we had > >> encountered a strange situation earlier today: > >> > >> 1) One of the brokers had a hardware failure, and had to be rebuilt from > >> scratch (data partition was gone) > >> 2) When it went down, we noticed a spike in lag in one particular > consumer > >> group - it seems to have reset its offset to an earlier point in time > (but > >> not the earliest offset of the topic); I have read other messages on > this > >> mailing list of users who experienced the same behavior with 0.11.0.0 > >> 3) The broker was reinstalled and rejoined the cluster with the same > >> broker.id (but with no data on it) - it rebalanced and eventually all > >> replicas became synced and the cluster was functioning normally. > >> 4) I then decided to bounce the same broker again to see if I can > reproduce > >> the issue I saw in #2 - and as soon as the broker was restarted, the > exact > >> same consumer group had its offset reset again and was lagging with > >> millions of records behind the current offset. > >> 5) I then tried to manually reset the consumer group's offset to a few > >> minutes before I restarted the broker, only to discover this strange > >> behavior where no matter which datetime value I provided, it kept > resetting > >> to the latest offset. > >> > >> > >>> On 25 October 2017 at 22:48, Ted Yu wrote: > >>> > >>> Do you mind providing a bit more information ? > >>> > >>> Release of Kafka you use > >>> > >>> Any difference between data1_log and the other, normal topic ? > >>> > >>> Probably check the broker log where data1_log is hosted - see if there > is > >>> some clue. > >>> > >>> Thanks > >>> > >>> On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin > >>> wrote: > >>> > I'm trying to use the kafka-consumer-groups.sh tool in order to rewind > >> a > consumer group's offset, however it seems to be returning the latest > >>> offset > regarding of the requested offset. > > You can see in the below example that two consecutive commands to > reset > >>> the > offset to a specific point in time return different (increasing) > >> offsets, > which are actually the latest offsets for the topic. > > - The consumer group ("test_consumer") is a console consumer that was > started
Re: Failure to reset consumer offsets for specific topics
Which driver is used yo produce these messages ? On Oct 26, 2017 07:11, "Dan Markhasin" wrote: > No, that flag doesn't affect which offsets are returned, only executes the > action (and resets the consumer to latest offset when used, regardless of > datetime value I provide). > > On 25 October 2017 at 23:44, Hans Jespersen wrote: > > > I think you are just missing the —execute flag. > > > > -hans > > > > > On Oct 25, 2017, at 1:24 PM, Ted Yu wrote: > > > > > > I wonder if you have hit KAFKA-5600. > > > > > > Is it possible that you try out 0.11.0.1 ? > > > > > > Thanks > > > > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin > > wrote: > > >> > > >> I am using 0.11.0.0. > > >> > > >> There is no difference configuration-wise - both have 10 partitions > and > > 2 > > >> replicas. There are no errors in the logs, but looking in the data > > folder > > >> it seems like Kafka is not updating the timeindex file for data1_log - > > >> notice how the timeindex file for the current log segment is not being > > >> updated. > > >> > > >> bash-4.2$ pwd > > >> /kafka/data/data1_log-1 > > >> bash-4.2$ ls -ltr | tail > > >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 > 000337554984.log > > >> -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 > > 000337554984.index > > >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 > 000339816017.log > > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 > > >> 000341934289.timeindex > > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 > > >> 000341934289.snapshot > > >> -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 > > >> 000339816017.timeindex > > >> -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 > > 000339816017.index > > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 > leader-epoch-checkpoint > > >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 > > 000341934289.index > > >> -rw-rw-r-- 1 ibiuser it 461590419 Oct 25 23:04 > 000341934289.log > > >> > > >> For comparison, the beats topic: > > >> > > >> bash-4.2$ cd ../beats-1 > > >> bash-4.2$ ls -ltr > > >> total 3212088 > > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 00:23 > leader-epoch-checkpoint > > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 20:04 > > >> 000188672034.snapshot > > >> -rw-rw-r-- 1 ibiuser it2773008 Oct 25 20:04 > > >> 000185224087.timeindex > > >> -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04 > 000185224087.log > > >> -rw-rw-r-- 1 ibiuser it1967440 Oct 25 20:04 > > 000185224087.index > > >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 > > 000188672034.index > > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 23:04 > > >> 000188672034.timeindex > > >> -rw-rw-r-- 1 ibiuser it 50166645 Oct 25 23:04 > 000188672034.log > > >> > > >> > > >> To give some context to why I'm even trying to reset the offsets, we > had > > >> encountered a strange situation earlier today: > > >> > > >> 1) One of the brokers had a hardware failure, and had to be rebuilt > from > > >> scratch (data partition was gone) > > >> 2) When it went down, we noticed a spike in lag in one particular > > consumer > > >> group - it seems to have reset its offset to an earlier point in time > > (but > > >> not the earliest offset of the topic); I have read other messages on > > this > > >> mailing list of users who experienced the same behavior with 0.11.0.0 > > >> 3) The broker was reinstalled and rejoined the cluster with the same > > >> broker.id (but with no data on it) - it rebalanced and eventually all > > >> replicas became synced and the cluster was functioning normally. > > >> 4) I then decided to bounce the same broker again to see if I can > > reproduce > > >> the issue I saw in #2 - and as soon as the broker was restarted, the > > exact > > >> same consumer group had its offset reset again and was lagging with > > >> millions of records behind the current offset. > > >> 5) I then tried to manually reset the consumer group's offset to a few > > >> minutes before I restarted the broker, only to discover this strange > > >> behavior where no matter which datetime value I provided, it kept > > resetting > > >> to the latest offset. > > >> > > >> > > >>> On 25 October 2017 at 22:48, Ted Yu wrote: > > >>> > > >>> Do you mind providing a bit more information ? > > >>> > > >>> Release of Kafka you use > > >>> > > >>> Any difference between data1_log and the other, normal topic ? > > >>> > > >>> Probably check the broker log where data1_log is hosted - see if > there > > is > > >>> some clue. > > >>> > > >>> Thanks > > >>> > > >>> On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin > > > >>> wrote: > > >>> > > I'm trying to use the kafka-consumer-groups.sh tool in order to > rewind > > >> a > > consumer group's offset, however it seems to be returning the latest > > >>> offset > > regarding of the requested offset. > > > > You can see in the below example that two cons
Re: Failure to reset consumer offsets for specific topics
No, that flag doesn't affect which offsets are returned, only executes the action (and resets the consumer to latest offset when used, regardless of datetime value I provide). On 25 October 2017 at 23:44, Hans Jespersen wrote: > I think you are just missing the —execute flag. > > -hans > > > On Oct 25, 2017, at 1:24 PM, Ted Yu wrote: > > > > I wonder if you have hit KAFKA-5600. > > > > Is it possible that you try out 0.11.0.1 ? > > > > Thanks > > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin > wrote: > >> > >> I am using 0.11.0.0. > >> > >> There is no difference configuration-wise - both have 10 partitions and > 2 > >> replicas. There are no errors in the logs, but looking in the data > folder > >> it seems like Kafka is not updating the timeindex file for data1_log - > >> notice how the timeindex file for the current log segment is not being > >> updated. > >> > >> bash-4.2$ pwd > >> /kafka/data/data1_log-1 > >> bash-4.2$ ls -ltr | tail > >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 000337554984.log > >> -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 > 000337554984.index > >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 000339816017.log > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 > >> 000341934289.timeindex > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 > >> 000341934289.snapshot > >> -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 > >> 000339816017.timeindex > >> -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 > 000339816017.index > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 leader-epoch-checkpoint > >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 > 000341934289.index > >> -rw-rw-r-- 1 ibiuser it 461590419 Oct 25 23:04 000341934289.log > >> > >> For comparison, the beats topic: > >> > >> bash-4.2$ cd ../beats-1 > >> bash-4.2$ ls -ltr > >> total 3212088 > >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 00:23 leader-epoch-checkpoint > >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 20:04 > >> 000188672034.snapshot > >> -rw-rw-r-- 1 ibiuser it2773008 Oct 25 20:04 > >> 000185224087.timeindex > >> -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04 000185224087.log > >> -rw-rw-r-- 1 ibiuser it1967440 Oct 25 20:04 > 000185224087.index > >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 > 000188672034.index > >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 23:04 > >> 000188672034.timeindex > >> -rw-rw-r-- 1 ibiuser it 50166645 Oct 25 23:04 000188672034.log > >> > >> > >> To give some context to why I'm even trying to reset the offsets, we had > >> encountered a strange situation earlier today: > >> > >> 1) One of the brokers had a hardware failure, and had to be rebuilt from > >> scratch (data partition was gone) > >> 2) When it went down, we noticed a spike in lag in one particular > consumer > >> group - it seems to have reset its offset to an earlier point in time > (but > >> not the earliest offset of the topic); I have read other messages on > this > >> mailing list of users who experienced the same behavior with 0.11.0.0 > >> 3) The broker was reinstalled and rejoined the cluster with the same > >> broker.id (but with no data on it) - it rebalanced and eventually all > >> replicas became synced and the cluster was functioning normally. > >> 4) I then decided to bounce the same broker again to see if I can > reproduce > >> the issue I saw in #2 - and as soon as the broker was restarted, the > exact > >> same consumer group had its offset reset again and was lagging with > >> millions of records behind the current offset. > >> 5) I then tried to manually reset the consumer group's offset to a few > >> minutes before I restarted the broker, only to discover this strange > >> behavior where no matter which datetime value I provided, it kept > resetting > >> to the latest offset. > >> > >> > >>> On 25 October 2017 at 22:48, Ted Yu wrote: > >>> > >>> Do you mind providing a bit more information ? > >>> > >>> Release of Kafka you use > >>> > >>> Any difference between data1_log and the other, normal topic ? > >>> > >>> Probably check the broker log where data1_log is hosted - see if there > is > >>> some clue. > >>> > >>> Thanks > >>> > >>> On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin > >>> wrote: > >>> > I'm trying to use the kafka-consumer-groups.sh tool in order to rewind > >> a > consumer group's offset, however it seems to be returning the latest > >>> offset > regarding of the requested offset. > > You can see in the below example that two consecutive commands to > reset > >>> the > offset to a specific point in time return different (increasing) > >> offsets, > which are actually the latest offsets for the topic. > > - The consumer group ("test_consumer") is a console consumer that was > started with --from-beginning and terminated after a few seconds, just > enough for it t
Re: Failure to reset consumer offsets for specific topics
Clarification: my most recent reply was w.r.t. the strange situation Dan described, not the offset resetting. On Wed, Oct 25, 2017 at 1:24 PM, Ted Yu wrote: > I wonder if you have hit KAFKA-5600. > > Is it possible that you try out 0.11.0.1 ? > > Thanks > > On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin > wrote: > >> I am using 0.11.0.0. >> >> There is no difference configuration-wise - both have 10 partitions and 2 >> replicas. There are no errors in the logs, but looking in the data folder >> it seems like Kafka is not updating the timeindex file for data1_log - >> notice how the timeindex file for the current log segment is not being >> updated. >> >> bash-4.2$ pwd >> /kafka/data/data1_log-1 >> bash-4.2$ ls -ltr | tail >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 000337554984.log >> -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 000337554984.index >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 000339816017.log >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 >> 000341934289.timeindex >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 >> 000341934289.snapshot >> -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 >> 000339816017.timeindex >> -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 000339816017.index >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 leader-epoch-checkpoint >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 000341934289.index >> -rw-rw-r-- 1 ibiuser it 461590419 Oct 25 23:04 000341934289.log >> >> For comparison, the beats topic: >> >> bash-4.2$ cd ../beats-1 >> bash-4.2$ ls -ltr >> total 3212088 >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 00:23 leader-epoch-checkpoint >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 20:04 >> 000188672034.snapshot >> -rw-rw-r-- 1 ibiuser it2773008 Oct 25 20:04 >> 000185224087.timeindex >> -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04 000185224087.log >> -rw-rw-r-- 1 ibiuser it1967440 Oct 25 20:04 000185224087.index >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 000188672034.index >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 23:04 >> 000188672034.timeindex >> -rw-rw-r-- 1 ibiuser it 50166645 Oct 25 23:04 000188672034.log >> >> >> To give some context to why I'm even trying to reset the offsets, we had >> encountered a strange situation earlier today: >> >> 1) One of the brokers had a hardware failure, and had to be rebuilt from >> scratch (data partition was gone) >> 2) When it went down, we noticed a spike in lag in one particular consumer >> group - it seems to have reset its offset to an earlier point in time (but >> not the earliest offset of the topic); I have read other messages on this >> mailing list of users who experienced the same behavior with 0.11.0.0 >> 3) The broker was reinstalled and rejoined the cluster with the same >> broker.id (but with no data on it) - it rebalanced and eventually all >> replicas became synced and the cluster was functioning normally. >> 4) I then decided to bounce the same broker again to see if I can >> reproduce >> the issue I saw in #2 - and as soon as the broker was restarted, the exact >> same consumer group had its offset reset again and was lagging with >> millions of records behind the current offset. >> 5) I then tried to manually reset the consumer group's offset to a few >> minutes before I restarted the broker, only to discover this strange >> behavior where no matter which datetime value I provided, it kept >> resetting >> to the latest offset. >> >> >> On 25 October 2017 at 22:48, Ted Yu wrote: >> >> > Do you mind providing a bit more information ? >> > >> > Release of Kafka you use >> > >> > Any difference between data1_log and the other, normal topic ? >> > >> > Probably check the broker log where data1_log is hosted - see if there >> is >> > some clue. >> > >> > Thanks >> > >> > On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin >> > wrote: >> > >> > > I'm trying to use the kafka-consumer-groups.sh tool in order to >> rewind a >> > > consumer group's offset, however it seems to be returning the latest >> > offset >> > > regarding of the requested offset. >> > > >> > > You can see in the below example that two consecutive commands to >> reset >> > the >> > > offset to a specific point in time return different (increasing) >> offsets, >> > > which are actually the latest offsets for the topic. >> > > >> > > - The consumer group ("test_consumer") is a console consumer that was >> > > started with --from-beginning and terminated after a few seconds, just >> > > enough for it to commit its offsets. >> > > - The topic data1_log is very busy with thousands of incoming messages >> > per >> > > second >> > > - The datetime value provided is approx. 5 hours earlier than the >> current >> > > UTC time >> > > >> > > [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh >> > > --bootstrap-server localhost:9092 --reset-offsets --group >> test_con
Re: Failure to reset consumer offsets for specific topics
I think you are just missing the —execute flag. -hans > On Oct 25, 2017, at 1:24 PM, Ted Yu wrote: > > I wonder if you have hit KAFKA-5600. > > Is it possible that you try out 0.11.0.1 ? > > Thanks > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin wrote: >> >> I am using 0.11.0.0. >> >> There is no difference configuration-wise - both have 10 partitions and 2 >> replicas. There are no errors in the logs, but looking in the data folder >> it seems like Kafka is not updating the timeindex file for data1_log - >> notice how the timeindex file for the current log segment is not being >> updated. >> >> bash-4.2$ pwd >> /kafka/data/data1_log-1 >> bash-4.2$ ls -ltr | tail >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 000337554984.log >> -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 000337554984.index >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 000339816017.log >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 >> 000341934289.timeindex >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 >> 000341934289.snapshot >> -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 >> 000339816017.timeindex >> -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 000339816017.index >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 leader-epoch-checkpoint >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 000341934289.index >> -rw-rw-r-- 1 ibiuser it 461590419 Oct 25 23:04 000341934289.log >> >> For comparison, the beats topic: >> >> bash-4.2$ cd ../beats-1 >> bash-4.2$ ls -ltr >> total 3212088 >> -rw-rw-r-- 1 ibiuser it 17 Oct 25 00:23 leader-epoch-checkpoint >> -rw-rw-r-- 1 ibiuser it 10 Oct 25 20:04 >> 000188672034.snapshot >> -rw-rw-r-- 1 ibiuser it2773008 Oct 25 20:04 >> 000185224087.timeindex >> -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04 000185224087.log >> -rw-rw-r-- 1 ibiuser it1967440 Oct 25 20:04 000185224087.index >> -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 000188672034.index >> -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 23:04 >> 000188672034.timeindex >> -rw-rw-r-- 1 ibiuser it 50166645 Oct 25 23:04 000188672034.log >> >> >> To give some context to why I'm even trying to reset the offsets, we had >> encountered a strange situation earlier today: >> >> 1) One of the brokers had a hardware failure, and had to be rebuilt from >> scratch (data partition was gone) >> 2) When it went down, we noticed a spike in lag in one particular consumer >> group - it seems to have reset its offset to an earlier point in time (but >> not the earliest offset of the topic); I have read other messages on this >> mailing list of users who experienced the same behavior with 0.11.0.0 >> 3) The broker was reinstalled and rejoined the cluster with the same >> broker.id (but with no data on it) - it rebalanced and eventually all >> replicas became synced and the cluster was functioning normally. >> 4) I then decided to bounce the same broker again to see if I can reproduce >> the issue I saw in #2 - and as soon as the broker was restarted, the exact >> same consumer group had its offset reset again and was lagging with >> millions of records behind the current offset. >> 5) I then tried to manually reset the consumer group's offset to a few >> minutes before I restarted the broker, only to discover this strange >> behavior where no matter which datetime value I provided, it kept resetting >> to the latest offset. >> >> >>> On 25 October 2017 at 22:48, Ted Yu wrote: >>> >>> Do you mind providing a bit more information ? >>> >>> Release of Kafka you use >>> >>> Any difference between data1_log and the other, normal topic ? >>> >>> Probably check the broker log where data1_log is hosted - see if there is >>> some clue. >>> >>> Thanks >>> >>> On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin >>> wrote: >>> I'm trying to use the kafka-consumer-groups.sh tool in order to rewind >> a consumer group's offset, however it seems to be returning the latest >>> offset regarding of the requested offset. You can see in the below example that two consecutive commands to reset >>> the offset to a specific point in time return different (increasing) >> offsets, which are actually the latest offsets for the topic. - The consumer group ("test_consumer") is a console consumer that was started with --from-beginning and terminated after a few seconds, just enough for it to commit its offsets. - The topic data1_log is very busy with thousands of incoming messages >>> per second - The datetime value provided is approx. 5 hours earlier than the >> current UTC time [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --reset-offsets --group test_consumer --topic data1_log --to-datetime '2017-10-25T13:40:00.000' Note: This will only sho
Re: Failure to reset consumer offsets for specific topics
I wonder if you have hit KAFKA-5600. Is it possible that you try out 0.11.0.1 ? Thanks On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin wrote: > I am using 0.11.0.0. > > There is no difference configuration-wise - both have 10 partitions and 2 > replicas. There are no errors in the logs, but looking in the data folder > it seems like Kafka is not updating the timeindex file for data1_log - > notice how the timeindex file for the current log segment is not being > updated. > > bash-4.2$ pwd > /kafka/data/data1_log-1 > bash-4.2$ ls -ltr | tail > -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 000337554984.log > -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 000337554984.index > -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 000339816017.log > -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 > 000341934289.timeindex > -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 > 000341934289.snapshot > -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 > 000339816017.timeindex > -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 000339816017.index > -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 leader-epoch-checkpoint > -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 000341934289.index > -rw-rw-r-- 1 ibiuser it 461590419 Oct 25 23:04 000341934289.log > > For comparison, the beats topic: > > bash-4.2$ cd ../beats-1 > bash-4.2$ ls -ltr > total 3212088 > -rw-rw-r-- 1 ibiuser it 17 Oct 25 00:23 leader-epoch-checkpoint > -rw-rw-r-- 1 ibiuser it 10 Oct 25 20:04 > 000188672034.snapshot > -rw-rw-r-- 1 ibiuser it2773008 Oct 25 20:04 > 000185224087.timeindex > -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04 000185224087.log > -rw-rw-r-- 1 ibiuser it1967440 Oct 25 20:04 000185224087.index > -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 000188672034.index > -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 23:04 > 000188672034.timeindex > -rw-rw-r-- 1 ibiuser it 50166645 Oct 25 23:04 000188672034.log > > > To give some context to why I'm even trying to reset the offsets, we had > encountered a strange situation earlier today: > > 1) One of the brokers had a hardware failure, and had to be rebuilt from > scratch (data partition was gone) > 2) When it went down, we noticed a spike in lag in one particular consumer > group - it seems to have reset its offset to an earlier point in time (but > not the earliest offset of the topic); I have read other messages on this > mailing list of users who experienced the same behavior with 0.11.0.0 > 3) The broker was reinstalled and rejoined the cluster with the same > broker.id (but with no data on it) - it rebalanced and eventually all > replicas became synced and the cluster was functioning normally. > 4) I then decided to bounce the same broker again to see if I can reproduce > the issue I saw in #2 - and as soon as the broker was restarted, the exact > same consumer group had its offset reset again and was lagging with > millions of records behind the current offset. > 5) I then tried to manually reset the consumer group's offset to a few > minutes before I restarted the broker, only to discover this strange > behavior where no matter which datetime value I provided, it kept resetting > to the latest offset. > > > On 25 October 2017 at 22:48, Ted Yu wrote: > > > Do you mind providing a bit more information ? > > > > Release of Kafka you use > > > > Any difference between data1_log and the other, normal topic ? > > > > Probably check the broker log where data1_log is hosted - see if there is > > some clue. > > > > Thanks > > > > On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin > > wrote: > > > > > I'm trying to use the kafka-consumer-groups.sh tool in order to rewind > a > > > consumer group's offset, however it seems to be returning the latest > > offset > > > regarding of the requested offset. > > > > > > You can see in the below example that two consecutive commands to reset > > the > > > offset to a specific point in time return different (increasing) > offsets, > > > which are actually the latest offsets for the topic. > > > > > > - The consumer group ("test_consumer") is a console consumer that was > > > started with --from-beginning and terminated after a few seconds, just > > > enough for it to commit its offsets. > > > - The topic data1_log is very busy with thousands of incoming messages > > per > > > second > > > - The datetime value provided is approx. 5 hours earlier than the > current > > > UTC time > > > > > > [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh > > > --bootstrap-server localhost:9092 --reset-offsets --group test_consumer > > > --topic data1_log --to-datetime '2017-10-25T13:40:00.000' > > > Note: This will only show information about consumers that use the Java > > > consumer API (non-ZooKeeper-based consumers). > > > > > > > > > TOPIC PARTITION NEW-OFFSET > > > data1_log
Re: Failure to reset consumer offsets for specific topics
I am using 0.11.0.0. There is no difference configuration-wise - both have 10 partitions and 2 replicas. There are no errors in the logs, but looking in the data folder it seems like Kafka is not updating the timeindex file for data1_log - notice how the timeindex file for the current log segment is not being updated. bash-4.2$ pwd /kafka/data/data1_log-1 bash-4.2$ ls -ltr | tail -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21 000337554984.log -rw-rw-r-- 1 ibiuser it 943616 Oct 25 01:21 000337554984.index -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38 000339816017.log -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 13:38 000341934289.timeindex -rw-rw-r-- 1 ibiuser it 10 Oct 25 13:38 000341934289.snapshot -rw-rw-r-- 1 ibiuser it 0 Oct 25 13:38 000339816017.timeindex -rw-rw-r-- 1 ibiuser it 566712 Oct 25 13:38 000339816017.index -rw-rw-r-- 1 ibiuser it 17 Oct 25 20:23 leader-epoch-checkpoint -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 000341934289.index -rw-rw-r-- 1 ibiuser it 461590419 Oct 25 23:04 000341934289.log For comparison, the beats topic: bash-4.2$ cd ../beats-1 bash-4.2$ ls -ltr total 3212088 -rw-rw-r-- 1 ibiuser it 17 Oct 25 00:23 leader-epoch-checkpoint -rw-rw-r-- 1 ibiuser it 10 Oct 25 20:04 000188672034.snapshot -rw-rw-r-- 1 ibiuser it2773008 Oct 25 20:04 000185224087.timeindex -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04 000185224087.log -rw-rw-r-- 1 ibiuser it1967440 Oct 25 20:04 000185224087.index -rw-rw-r-- 1 ibiuser it 10485760 Oct 25 23:03 000188672034.index -rw-rw-r-- 1 ibiuser it 10485756 Oct 25 23:04 000188672034.timeindex -rw-rw-r-- 1 ibiuser it 50166645 Oct 25 23:04 000188672034.log To give some context to why I'm even trying to reset the offsets, we had encountered a strange situation earlier today: 1) One of the brokers had a hardware failure, and had to be rebuilt from scratch (data partition was gone) 2) When it went down, we noticed a spike in lag in one particular consumer group - it seems to have reset its offset to an earlier point in time (but not the earliest offset of the topic); I have read other messages on this mailing list of users who experienced the same behavior with 0.11.0.0 3) The broker was reinstalled and rejoined the cluster with the same broker.id (but with no data on it) - it rebalanced and eventually all replicas became synced and the cluster was functioning normally. 4) I then decided to bounce the same broker again to see if I can reproduce the issue I saw in #2 - and as soon as the broker was restarted, the exact same consumer group had its offset reset again and was lagging with millions of records behind the current offset. 5) I then tried to manually reset the consumer group's offset to a few minutes before I restarted the broker, only to discover this strange behavior where no matter which datetime value I provided, it kept resetting to the latest offset. On 25 October 2017 at 22:48, Ted Yu wrote: > Do you mind providing a bit more information ? > > Release of Kafka you use > > Any difference between data1_log and the other, normal topic ? > > Probably check the broker log where data1_log is hosted - see if there is > some clue. > > Thanks > > On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin > wrote: > > > I'm trying to use the kafka-consumer-groups.sh tool in order to rewind a > > consumer group's offset, however it seems to be returning the latest > offset > > regarding of the requested offset. > > > > You can see in the below example that two consecutive commands to reset > the > > offset to a specific point in time return different (increasing) offsets, > > which are actually the latest offsets for the topic. > > > > - The consumer group ("test_consumer") is a console consumer that was > > started with --from-beginning and terminated after a few seconds, just > > enough for it to commit its offsets. > > - The topic data1_log is very busy with thousands of incoming messages > per > > second > > - The datetime value provided is approx. 5 hours earlier than the current > > UTC time > > > > [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh > > --bootstrap-server localhost:9092 --reset-offsets --group test_consumer > > --topic data1_log --to-datetime '2017-10-25T13:40:00.000' > > Note: This will only show information about consumers that use the Java > > consumer API (non-ZooKeeper-based consumers). > > > > > > TOPIC PARTITION NEW-OFFSET > > data1_log 8 301485420 > > data1_log 1 342788637 > > data1_log 7 287621428 > > data1_log 3 268612266 > > data1_log 0 201860717 > > data1_log 9 202749553 > > data1_log 4
Re: Failure to reset consumer offsets for specific topics
Do you mind providing a bit more information ? Release of Kafka you use Any difference between data1_log and the other, normal topic ? Probably check the broker log where data1_log is hosted - see if there is some clue. Thanks On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin wrote: > I'm trying to use the kafka-consumer-groups.sh tool in order to rewind a > consumer group's offset, however it seems to be returning the latest offset > regarding of the requested offset. > > You can see in the below example that two consecutive commands to reset the > offset to a specific point in time return different (increasing) offsets, > which are actually the latest offsets for the topic. > > - The consumer group ("test_consumer") is a console consumer that was > started with --from-beginning and terminated after a few seconds, just > enough for it to commit its offsets. > - The topic data1_log is very busy with thousands of incoming messages per > second > - The datetime value provided is approx. 5 hours earlier than the current > UTC time > > [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh > --bootstrap-server localhost:9092 --reset-offsets --group test_consumer > --topic data1_log --to-datetime '2017-10-25T13:40:00.000' > Note: This will only show information about consumers that use the Java > consumer API (non-ZooKeeper-based consumers). > > > TOPIC PARTITION NEW-OFFSET > data1_log 8 301485420 > data1_log 1 342788637 > data1_log 7 287621428 > data1_log 3 268612266 > data1_log 0 201860717 > data1_log 9 202749553 > data1_log 4 188974032 > data1_log 6 234308481 > data1_log 2 263507741 > data1_log 5 232707238 > > [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh > --bootstrap-server localhost:9092 --reset-offsets --group test_consumer > --topic data1_log --to-datetime '2017-10-25T13:40:00.000' > Note: This will only show information about consumers that use the Java > consumer API (non-ZooKeeper-based consumers). > > > TOPIC PARTITION NEW-OFFSET > data1_log 8 301485491 > data1_log 1 342788779 > data1_log 7 287621534 > data1_log 3 268612364 > data1_log 0 201860796 > data1_log 9 202749620 > data1_log 4 188974068 > data1_log 6 234308564 > data1_log 2 263507823 > data1_log 5 232707293 > > This issue seems to be topic-specific - there is a different topic (also > very active) where the same command consistently returns the correct > offsets fixed in the time for the requested datetime. > > What could be the issue here? > > Thanks, > Dan >
Failure to reset consumer offsets for specific topics
I'm trying to use the kafka-consumer-groups.sh tool in order to rewind a consumer group's offset, however it seems to be returning the latest offset regarding of the requested offset. You can see in the below example that two consecutive commands to reset the offset to a specific point in time return different (increasing) offsets, which are actually the latest offsets for the topic. - The consumer group ("test_consumer") is a console consumer that was started with --from-beginning and terminated after a few seconds, just enough for it to commit its offsets. - The topic data1_log is very busy with thousands of incoming messages per second - The datetime value provided is approx. 5 hours earlier than the current UTC time [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --reset-offsets --group test_consumer --topic data1_log --to-datetime '2017-10-25T13:40:00.000' Note: This will only show information about consumers that use the Java consumer API (non-ZooKeeper-based consumers). TOPIC PARTITION NEW-OFFSET data1_log 8 301485420 data1_log 1 342788637 data1_log 7 287621428 data1_log 3 268612266 data1_log 0 201860717 data1_log 9 202749553 data1_log 4 188974032 data1_log 6 234308481 data1_log 2 263507741 data1_log 5 232707238 [admin@broker01] ~> /kafka/latest/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --reset-offsets --group test_consumer --topic data1_log --to-datetime '2017-10-25T13:40:00.000' Note: This will only show information about consumers that use the Java consumer API (non-ZooKeeper-based consumers). TOPIC PARTITION NEW-OFFSET data1_log 8 301485491 data1_log 1 342788779 data1_log 7 287621534 data1_log 3 268612364 data1_log 0 201860796 data1_log 9 202749620 data1_log 4 188974068 data1_log 6 234308564 data1_log 2 263507823 data1_log 5 232707293 This issue seems to be topic-specific - there is a different topic (also very active) where the same command consistently returns the correct offsets fixed in the time for the requested datetime. What could be the issue here? Thanks, Dan