Re: New Kafka Consumer : unknown member id
Hello Luke, I have built a new Kafka environment with Kafka 2.8.0. A new consumer set up against this environment is throwing the error below, while the old consumers for the same applications on the same 2.8.0 environment are working fine. Could you please advise?

2021-11-02 12:25:24 DEBUG AbstractCoordinator:557 - [Consumer clientId=, groupId=] Attempt to join group failed due to unknown member id.

On Fri, Oct 29, 2021 at 7:36 AM Luke Chen wrote:

> Hi,
> Which version of the Kafka client are you using?
> I can't find this error message in the source code.
> When googling this error message, it showed the error is from Kafka v0.9.
>
> Could you try v3.0.0 and see if that issue still exists?
>
> Thank you.
> Luke
>
> On Thu, Oct 28, 2021 at 11:15 PM Kafka Life wrote:
>
> > Dear Kafka Experts,
> >
> > We have set up a consumer group.id = YYY.
> > But when we try to connect to the Kafka instance, we get the error message
> > below. I am sure this consumer group id does not exist in Kafka. We use the
> > PLAINTEXT protocol to connect to Kafka 2.8.0. Please suggest how to resolve
> > this issue.
> >
> > DEBUG AbstractCoordinator:557 - [Consumer clientId=X, groupId=YYY]
> > Attempt to join group failed due to unknown member id.
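For reference, a minimal sketch (Java) of a new consumer joining a fresh group over a PLAINTEXT listener. The broker address, client id, and topic name are placeholders for illustration, not values from this thread:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class NewGroupConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // placeholder; a plain host:port listener implies PLAINTEXT
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "YYY");                  // the new consumer group from the thread
            props.put(ConsumerConfig.CLIENT_ID_CONFIG, "X");                   // placeholder client id
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");    // a brand-new group has no committed offsets yet

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("some-topic")); // placeholder topic name
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n", record.offset(), record.key(), record.value());
                }
            }
        }
    }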
Re: Stream to KTable internals
The log clearly indicates that you hit enforced processing. We record the metric and log; cf.
https://github.com/apache/kafka/blob/3.0.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/PartitionGroup.java#L194-L200

Not sure why the metric does not report it...

Hence, the solution would be to increase `max.task.idle.ms` further to give Kafka Streams time to fetch the data. It might help to use DEBUG logging to see for which partitions the consumer sends fetch requests and which partitions return data, to better understand the underlying behavior.

-Matthias

On 11/5/21 6:58 AM, Chad Preisler wrote:

It seems like I have 2 options to work around this issue.
- Keep the KTable and have another process running that puts the missed join message back on the event topic.
- Switch to GlobalKTable.

Any other solutions/workarounds are welcome.

Thanks,
Chad

On Thu, Nov 4, 2021 at 11:43 AM Chad Preisler wrote:

enforced-processing-total is zero for all missed join occurrences. I logged all the metrics out at the time my stream processed the missed join, so let me know if there are any other metrics that would help.

On Wed, Nov 3, 2021 at 9:21 PM Chad Preisler wrote:

I'm not sure. When I ran with trace logging turned on I saw a bunch of messages like the ones below. Do those messages indicate "enforced-processing"? It gets logged right after the call to enforcedProcessingSensor.record.

Continuing to process although some partitions are empty on the broker. There may be out-of-order processing for this task as a result. Partitions with local data: [status-5]. Partitions we gave up waiting for, with their corresponding deadlines: {event-5=1635881287722}. Configured max.task.idle.ms: 2000. Current wall-clock time: 1635881287750.

Continuing to process although some partitions are empty on the broker. There may be out-of-order processing for this task as a result. Partitions with local data: [event-5]. Partitions we gave up waiting for, with their corresponding deadlines: {status-5=1635881272754}. Configured max.task.idle.ms: 2000. Current wall-clock time: 1635881277998.

On Wed, Nov 3, 2021 at 6:11 PM Matthias J. Sax wrote:

Can you check if the program ever does "enforced processing", i.e., `max.task.idle.ms` passed and we process despite an empty input buffer?

Cf https://kafka.apache.org/documentation/#kafka_streams_task_monitoring

As long as there is input data, we should never do "enforced processing" and the metric should stay at zero.

-Matthias

On 11/3/21 2:41 PM, Chad Preisler wrote:

Just a quick update. Setting max.task.idle.ms to 10000 (10 seconds) had no effect on this issue.

On Tue, Nov 2, 2021 at 6:55 PM Chad Preisler wrote:

No, unfortunately it is not the case. The table record is written about 20 seconds before the stream record. I'll crank up the time tomorrow and see what happens.

On Tue, Nov 2, 2021 at 6:24 PM Matthias J. Sax wrote:

Hard to tell, but as it seems that you can reproduce the issue, it might be worth a try to increase the idle time further.

I guess one corner case for the stream-table join that is not resolved yet is when the stream and table record have the same timestamp... For this case, the table record might not be processed first.

Could you be hitting this case?

-Matthias

On 11/2/21 3:13 PM, Chad Preisler wrote:

Thank you for the information. We are using the Kafka 3.0 client library. We are able to reliably reproduce this issue in our test environment now. I removed my timestamp extractor, and I set max.task.idle.ms to 2000.
I also turned on trace logging for the package org.apache.kafka.streams.processor.internals.

To create the issue we stopped the application and ran enough data to create a lag of 400 messages. We saw 5 missed joins.

From the stream-thread log messages we saw the event message, our stream missed the join, and then several milliseconds later we saw the stream-thread print out the status message. The stream-thread printed out our status message a total of 5 times.

Given that only a few milliseconds passed between missing the join and the stream-thread printing the status message, would increasing max.task.idle.ms help?

Thanks,
Chad

On Mon, Nov 1, 2021 at 10:03 PM Matthias J. Sax wrote:

Timestamp synchronization is not perfect, and as a matter of fact, we fixed a few gaps in the 3.0.0 release. We actually hope that we closed the last gaps in 3.0.0... *fingers-crossed* :)

> We are using a timestamp extractor that returns 0.

You can do this, and it effectively "disables" timestamp synchronization, as records on the KTable side don't have a timeline any longer. As a side effect it also allows you to "bootstrap" the table, as records with timestamp zero will always be processed first (as they are smaller). Of course, you also don't have time synchronization for "future" data, and your program becomes non-deterministic if you reprocess old data.

> This seemed to be
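To make the two knobs discussed in this thread concrete, below is a rough sketch (Java) of a Streams configuration that raises `max.task.idle.ms` and registers a timestamp extractor that always returns 0. The application id, broker address, and the choice to register the extractor as the default (rather than only on the table topic via `Consumed#withTimestampExtractor`) are illustrative assumptions, not settings taken from the thread:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.processor.TimestampExtractor;

    public class JoinTuningSketch {

        // Returning a constant 0 effectively disables timestamp synchronization
        // for the topics this extractor is applied to, as described above.
        public static class ZeroTimestampExtractor implements TimestampExtractor {
            @Override
            public long extract(ConsumerRecord<Object, Object> record, long partitionTime) {
                return 0L;
            }
        }

        public static Properties streamsProps() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "stream-table-join-app"); // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");        // placeholder
            // Give each task longer to wait for data on all of its input partitions
            // before it falls back to enforced processing.
            props.put(StreamsConfig.MAX_TASK_IDLE_MS_CONFIG, 10_000L);
            // Illustrative: applied globally here; it can also be set per topic via
            // Consumed.with(keySerde, valueSerde).withTimestampExtractor(new ZeroTimestampExtractor()).
            props.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG,
                      ZeroTimestampExtractor.class);
            return props;
        }
    }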
Re: Stream to KTable internals
It seems like I have 2 options to work around this issue.
- Keep the KTable and have another process running that puts the missed join message back on the event topic.
- Switch to GlobalKTable.

Any other solutions/workarounds are welcome.

Thanks,
Chad

On Thu, Nov 4, 2021 at 11:43 AM Chad Preisler wrote:

> enforced-processing-total is zero for all missed join occurrences. I
> logged all the metrics out at the time my stream processed the missed
> join, so let me know if there are any other metrics that would help.
>
> On Wed, Nov 3, 2021 at 9:21 PM Chad Preisler wrote:
>
>> I'm not sure. When I ran with trace logging turned on I saw a bunch of
>> messages like the ones below. Do those messages indicate
>> "enforced-processing"? It gets logged right after the call to
>> enforcedProcessingSensor.record.
>>
>> Continuing to process although some partitions are empty on the broker.
>> There may be out-of-order processing for this task as a result. Partitions
>> with local data: [status-5]. Partitions we gave up waiting for, with their
>> corresponding deadlines: {event-5=1635881287722}. Configured
>> max.task.idle.ms: 2000. Current wall-clock time: 1635881287750.
>>
>> Continuing to process although some partitions are empty on the broker.
>> There may be out-of-order processing for this task as a result. Partitions
>> with local data: [event-5]. Partitions we gave up waiting for, with their
>> corresponding deadlines: {status-5=1635881272754}. Configured
>> max.task.idle.ms: 2000. Current wall-clock time: 1635881277998.
>>
>> On Wed, Nov 3, 2021 at 6:11 PM Matthias J. Sax wrote:
>>
>>> Can you check if the program ever does "enforced processing", i.e.,
>>> `max.task.idle.ms` passed and we process despite an empty input buffer?
>>>
>>> Cf https://kafka.apache.org/documentation/#kafka_streams_task_monitoring
>>>
>>> As long as there is input data, we should never do "enforced processing"
>>> and the metric should stay at zero.
>>>
>>> -Matthias
>>>
>>> On 11/3/21 2:41 PM, Chad Preisler wrote:
>>>> Just a quick update. Setting max.task.idle.ms to 10000 (10 seconds)
>>>> had no effect on this issue.
>>>>
>>>> On Tue, Nov 2, 2021 at 6:55 PM Chad Preisler wrote:
>>>>
>>>>> No, unfortunately it is not the case. The table record is written
>>>>> about 20 seconds before the stream record. I'll crank up the time
>>>>> tomorrow and see what happens.
>>>>>
>>>>> On Tue, Nov 2, 2021 at 6:24 PM Matthias J. Sax wrote:
>>>>>
>>>>>> Hard to tell, but as it seems that you can reproduce the issue, it
>>>>>> might be worth a try to increase the idle time further.
>>>>>>
>>>>>> I guess one corner case for the stream-table join that is not resolved
>>>>>> yet is when the stream and table record have the same timestamp... For
>>>>>> this case, the table record might not be processed first.
>>>>>>
>>>>>> Could you be hitting this case?
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>> On 11/2/21 3:13 PM, Chad Preisler wrote:
>>>>>>> Thank you for the information. We are using the Kafka 3.0 client
>>>>>>> library. We are able to reliably reproduce this issue in our test
>>>>>>> environment now. I removed my timestamp extractor, and I set
>>>>>>> max.task.idle.ms to 2000. I also turned on trace logging for package
>>>>>>> org.apache.kafka.streams.processor.internals.
>>>>>>>
>>>>>>> To create the issue we stopped the application and ran enough data to
>>>>>>> create a lag of 400 messages. We saw 5 missed joins.
>>>>>>>
>>>>>>> From the stream-thread log messages we saw the event message, our
>>>>>>> stream missed the join, and then several milliseconds later we saw the
>>>>>>> stream-thread print out the status message. The stream-thread printed
>>>>>>> out our status message a total of 5 times.
>>>>>>>
>>>>>>> Given that only a few milliseconds passed between missing the join and
>>>>>>> the stream-thread printing the status message, would increasing
>>>>>>> max.task.idle.ms help?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Chad
>>>>>>>
>>>>>>> On Mon, Nov 1, 2021 at 10:03 PM Matthias J. Sax wrote:
>>>>>>>
>>>>>>>> Timestamp synchronization is not perfect, and as a matter of fact, we
>>>>>>>> fixed a few gaps in the 3.0.0 release. We actually hope that we closed
>>>>>>>> the last gaps in 3.0.0... *fingers-crossed* :)
>>>>>>>>
>>>>>>>>> We are using a timestamp extractor that returns 0.
>>>>>>>>
>>>>>>>> You can do this, and it effectively "disables" timestamp synchronization
>>>>>>>> as records on the KTable side don't have a timeline any longer. As a
>>>>>>>> side effect it also allows you to "bootstrap" the table, as records with
>>>>>>>> timestamp zero will always be processed first (as they are smaller). Of
>>>>>>>> course, you also don't have time synchronization for "future" data and
>>>>>>>> your program becomes non-deterministic if you reprocess old data.
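For the GlobalKTable workaround mentioned in this thread, a minimal sketch (Java) of what the rewired topology could look like. The topic names ("event", "status", "enriched-status-events") and the String serdes are assumptions based on the partition names in the quoted logs, not the actual application:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.GlobalKTable;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Produced;

    public class GlobalTableJoinSketch {
        public static void build(StreamsBuilder builder) {
            KStream<String, String> events =
                    builder.stream("event", Consumed.with(Serdes.String(), Serdes.String()));

            // A GlobalKTable is bootstrapped before processing starts and is maintained
            // by a separate global thread, so the join does not depend on per-partition
            // timestamp synchronization the way a KTable join does.
            GlobalKTable<String, String> statusTable =
                    builder.globalTable("status", Consumed.with(Serdes.String(), Serdes.String()));

            events.join(statusTable,
                        (eventKey, eventValue) -> eventKey,                         // map each event to its lookup key
                        (eventValue, statusValue) -> eventValue + "|" + statusValue) // combine the joined values
                  .to("enriched-status-events", Produced.with(Serdes.String(), Serdes.String()));
        }
    }

The trade-off is that a global table is replicated in full to every application instance, and the lookup key must be derivable from each stream record.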