Re: New Kafka Consumer : unknown member id

2021-11-05 Thread Kafka Life
Hello Luke

I have built a new Kafka environment with Kafka 2.8.0.

A new consumer set up against this environment is throwing the error
below. The old consumers for the same applications, on the same 2.8.0
environment, are working fine.

Could you please advise?

2021-11-02 12:25:24 DEBUG AbstractCoordinator:557 - [Consumer
clientId=, groupId=] Attempt to join group failed due to unknown
member id.
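
For context, a rough sketch of the kind of consumer setup described in
this thread (the broker address, topic name, and deserializers below are
placeholders, not the actual configuration):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker address; group id "YYY" as described further down.
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-host:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "YYY");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Plain text protocol, as in the setup described in this thread.
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "PLAINTEXT");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("some-topic")); // placeholder topic
            consumer.poll(Duration.ofSeconds(5));
        }
    }
}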

On Fri, Oct 29, 2021 at 7:36 AM Luke Chen  wrote:

> Hi,
> Which version of kafka client are you using?
> I can't find this error message in the source code.
> When I googled this error message, the results pointed to Kafka v0.9.
>
> Could you try v3.0.0 and see if that issue still exists?
>
> Thank you.
> Luke
>
> On Thu, Oct 28, 2021 at 11:15 PM Kafka Life 
> wrote:
>
> > Dear Kafka Experts
> >
> > We have set up a consumer group.id = YYY.
> > But when we try to connect to the Kafka instance, we get the error
> > message below. I am sure this consumer group id does not already exist
> > in Kafka. We use the plain text protocol to connect to Kafka 2.8.0.
> > Please suggest how to resolve this issue.
> >
> > DEBUG AbstractCoordinator:557 - [Consumer clientId=X,
> groupId=YYY]
> > Attempt to join group failed due to unknown member id.
> >
>


Re: Stream to KTable internals

2021-11-05 Thread Matthias J. Sax
The log clearly indicates that you hit enforced processing. We record 
the metric and log:


Cf 
https://github.com/apache/kafka/blob/3.0.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/PartitionGroup.java#L194-L200


Not sure why the metric does not report it...

Hence, the solution would be to increase `max.task.idle.ms` further to 
give Kafka Streams time to fetch the data.
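
For example, a minimal sketch of where that setting goes, assuming a plain
Java Streams app with placeholder application id, broker address, topic
names, and a placeholder pass-through topology:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class IdleTimeSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder application id and broker address.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "stream-table-join-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Let a task wait up to 10 seconds for data on all of its input
        // partitions before it falls back to enforced processing.
        props.put(StreamsConfig.MAX_TASK_IDLE_MS_CONFIG, 10_000L);

        StreamsBuilder builder = new StreamsBuilder();
        // Placeholder pass-through topology; the real app would build its
        // stream-table join here.
        builder.stream("event", Consumed.with(Serdes.String(), Serdes.String()))
               .to("event-copy", Produced.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}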


It might help to enable DEBUG logging to see for which partitions the 
consumer sends fetch requests and which partitions return data, to better 
understand the underlying behavior.



-Matthias

On 11/5/21 6:58 AM, Chad Preisler wrote:

It seems like I have 2 options to work around this issue.


- Keep the KTable and have another process running that puts the missed
join message back on the event topic.
- Switch to GlobalKTable.

Any other solutions/workarounds are welcome.

Thanks,
Chad

On Thu, Nov 4, 2021 at 11:43 AM Chad Preisler 
wrote:


enforced-processing-total is zero for all missed join occurrences. I
logged all the metrics out at the time my stream processed the missed join,
so let me know if there are any other metrics that would help.

On Wed, Nov 3, 2021 at 9:21 PM Chad Preisler 
wrote:


I'm not sure. When I ran with trace logging turned on I saw a bunch of
messages like the ones below. Do those messages indicate
"enforced-processing"? It gets logged right after the call
to enforcedProcessingSensor.record.

Continuing to process although some partitions are empty on the broker.
There may be out-of-order processing for this task as a result. Partitions
with local data: [status-5]. Partitions we gave up waiting for, with their
corresponding deadlines: {event-5=1635881287722}. Configured
max.task.idle.ms: 2000. Current wall-clock time: 1635881287750.

Continuing to process although some partitions are empty on the broker.
There may be out-of-order processing for this task as a result. Partitions
with local data: [event-5]. Partitions we gave up waiting for, with their
corresponding deadlines: {status-5=1635881272754}. Configured
max.task.idle.ms: 2000. Current wall-clock time: 1635881277998.

On Wed, Nov 3, 2021 at 6:11 PM Matthias J. Sax  wrote:


Can you check if the program ever does "enforced processing", ie,
`max.task.idle.ms` passed, and we process despite an empty input buffer.

Cf https://kafka.apache.org/documentation/#kafka_streams_task_monitoring

As long as there is input data, we should never do "enforced processing"
and the metric should stay at zero.


-Matthias

On 11/3/21 2:41 PM, Chad Preisler wrote:

Just a quick update. Setting max.task.idle.ms to 10000 (10 seconds) had no
effect on this issue.

On Tue, Nov 2, 2021 at 6:55 PM Chad Preisler 
wrote:

No unfortunately it is not the case. The table record is written about 20
seconds before the stream record. I’ll crank up the time tomorrow and see
what happens.

On Tue, Nov 2, 2021 at 6:24 PM Matthias J. Sax 
wrote:

Hard to tell, but as it seems that you can reproduce the issue, it might
be worth a try to increase the idle time further.

I guess one corner case for stream-table join that is not resolved yet
is when stream and table record have the same timestamp... For this
case, the table record might not be processed first.

Could you hit this case?


-Matthias

On 11/2/21 3:13 PM, Chad Preisler wrote:

Thank you for the information. We are using the Kafka 3.0 client library.

We are able to reliably reproduce this issue in our test environment now.
I removed my timestamp extractor, and I set the max.task.idle.ms to 2000.
I also turned on trace logging for package
org.apache.kafka.streams.processor.internals.

To create the issue we stopped the application and ran enough data to
create a lag of 400 messages. We saw 5 missed joins.

From the stream-thread log messages we saw the event message, our stream
missed the join, and then several milliseconds later we saw the
stream-thread print out the status message. The stream-thread printed out
our status message a total of 5 times.

Given that only a few milliseconds passed between missing the join and the
stream-thread printing the status message, would increasing the
max.task.idle.ms help?

Thanks,
Chad

On Mon, Nov 1, 2021 at 10:03 PM Matthias J. Sax 
wrote:

Timestamp synchronization is not perfect, and as a matter of fact, we
fixed a few gaps in 3.0.0 release. We actually hope, that we closed the
last gaps in 3.0.0... *fingers-crossed* :)

> We are using a timestamp extractor that returns 0.

You can do this, and it effectively "disables" timestamp synchronization
as records on the KTable side don't have a timeline any longer. As a
side effect it also allows you to "bootstrap" the table, as records with
timestamp zero will always be processed first (as they are smaller). Of
course, you also don't have time synchronization for "future" data and
your program becomes non-deterministic if you reprocess old data.


his seemed to be 

Re: Stream to KTable internals

2021-11-05 Thread Chad Preisler
It seems like I have 2 options to work around this issue.


   - Keep the KTable and have another process running that puts the missed
   join message back on the event topic.
   - Switch to GlobalKTable.

Any other solutions/workarounds are welcome.
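
For reference, a rough sketch of what option 2 could look like, assuming
String serdes and the topic names "event", "status", and "joined-output"
as placeholders:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class GlobalTableJoinSketch {
    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical topics: "event" carries the stream side, "status" the table side.
        KStream<String, String> events =
            builder.stream("event", Consumed.with(Serdes.String(), Serdes.String()));
        GlobalKTable<String, String> statusTable =
            builder.globalTable("status", Consumed.with(Serdes.String(), Serdes.String()));

        // A GlobalKTable is fully materialized on each instance before processing
        // starts, so the join no longer depends on timestamp synchronization.
        events.join(
                   statusTable,
                   (eventKey, eventValue) -> eventKey,          // map the stream record to the table key
                   (eventValue, statusValue) -> eventValue + "|" + statusValue)
              .to("joined-output", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }
}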

Thanks,
Chad

On Thu, Nov 4, 2021 at 11:43 AM Chad Preisler 
wrote:

> enforced-processing-total is zero for all missed join occurrences. I
> logged all the metrics out at the time my stream processed the missed join,
> so let me know if there are any other metrics that would help.
>
> On Wed, Nov 3, 2021 at 9:21 PM Chad Preisler 
> wrote:
>
>> I'm not sure. When I ran with trace logging turned on I saw a bunch of
>> messages like the ones below. Do those messages indicate
>> "enforced-processing"? It gets logged right after the call
>> to enforcedProcessingSensor.record.
>>
>> Continuing to process although some partitions are empty on the broker.
>> There may be out-of-order processing for this task as a result. Partitions
>> with local data: [status-5]. Partitions we gave up waiting for, with their
>> corresponding deadlines: {event-5=1635881287722}. Configured
>> max.task.idle.ms: 2000. Current wall-clock time: 1635881287750.
>>
>> Continuing to process although some partitions are empty on the broker.
>> There may be out-of-order processing for this task as a result. Partitions
>> with local data: [event-5]. Partitions we gave up waiting for, with their
>> corresponding deadlines: {status-5=1635881272754}. Configured
>> max.task.idle.ms: 2000. Current wall-clock time: 1635881277998.
>>
>> On Wed, Nov 3, 2021 at 6:11 PM Matthias J. Sax  wrote:
>>
>>> Can you check if the program ever does "enforced processing", ie,
>>> `max.task.idle.ms` passed, and we process despite an empty input buffer.
>>>
>>> Cf https://kafka.apache.org/documentation/#kafka_streams_task_monitoring
>>>
>>> As long as there is input data, we should never do "enforced processing"
>>> and the metric should stay at zero.
>>>
>>>
>>> -Matthias
>>>
>>> On 11/3/21 2:41 PM, Chad Preisler wrote:
>>> > Just a quick update. Setting max.task.idle.ms to 10000 (10 seconds)
>>> > had no effect on this issue.
>>> >
>>> > On Tue, Nov 2, 2021 at 6:55 PM Chad Preisler 
>>> > wrote:
>>> >
>>> >> No unfortunately it is not the case. The table record is written
>>> about 20
>>> >> seconds before the stream record. I’ll crank up the time tomorrow and
>>> see
>>> >> what happens.
>>> >>
>>> >> On Tue, Nov 2, 2021 at 6:24 PM Matthias J. Sax 
>>> wrote:
>>> >>
>>> >>> Hard to tell, but as it seems that you can reproduce the issue, it
>>> might
>>> >>> be worth a try to increase the idle time further.
>>> >>>
>>> >>> I guess one corner case for stream-table join that is not resolved
>>> yet
>>> >>> is when stream and table record have the same timestamp... For this
>>> >>> case, the table record might not be processed first.
>>> >>>
>>> >>> Could you hit this case?
>>> >>>
>>> >>>
>>> >>> -Matthias
>>> >>>
>>> >>> On 11/2/21 3:13 PM, Chad Preisler wrote:
>>>  Thank you for the information. We are using the Kafka 3.0 client
>>> >>> library.
>>>  We are able to reliably reproduce this issue in our test environment
>>> >>> now. I
>>>  removed my timestamp extractor, and I set the max.task.idle.ms to
>>> >>> 2000. I
>>>  also turned on trace logging for package
>>>  org.apache.kafka.streams.processor.internals.
>>> 
>>>  To create the issue we stopped the application and ran enough data
>>> to
>>>  create a lag of 400 messages. We saw 5 missed joins.
>>> 
>>>    From the stream-thread log messages we saw the event message, our
>>> >>> stream
>>>  missed the join, and then several milliseconds later we saw the
>>>  stream-thread print out the status message. The stream-thread
>>> printed
>>> >>> out
>>>  our status message a total of 5 times.
>>> 
>>>  Given that only a few milliseconds passed between missing the join
>>> and
>>> >>> the
>>>  stream-thread printing the status message, would increasing the
>>>  max.task.idle.ms help?
>>> 
>>>  Thanks,
>>>  Chad
>>> 
>>>  On Mon, Nov 1, 2021 at 10:03 PM Matthias J. Sax 
>>> >>> wrote:
>>> 
>>> > Timestamp synchronization is not perfect, and as a matter of fact,
>>> we
>>> > fixed a few gaps in 3.0.0 release. We actually hope, that we
>>> closed the
>>> > last gaps in 3.0.0... *fingers-crossed* :)
>>> >
>>> >> We are using a timestamp extractor that returns 0.
>>> >
>>> > You can do this, and it effectively "disables" timestamp
>>> >>> synchronization
>>> > as records on the KTable side don't have a timeline any longer. As
>>> a
>>> > side effect it also allows you to "bootstrap" the table, as records
>>> >>> with
>>> > timestamp zero will always be processed first (as they are
>>> smaller). Of
>>> > course, you also don't have time synchronization for "future" data
>>> and
>>> > your program becomes non-deterministic if you reprocess old data.