[ 
https://issues.apache.org/jira/browse/KAFKA-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17585595#comment-17585595
 ] 

Nicholas Telford commented on KAFKA-10635:
------------------------------------------

Hi [~guozhang], there's no useful trace on the client side, because the 
{{OutOfOrderSequenceException}} is thrown on the broker and propagated to the 
client via the network protocol. By the time it surfaces on the client, the 
client is just deserializing the error returned by the broker, so there is no 
client-side stacktrace for it.
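
To illustrate why a deserialized error carries no trace, here is a minimal, 
self-contained sketch. It is a simplified model, not the Kafka client's code: 
{{RemoteErrorSketch}}, {{RemoteErrorException}} and {{fromCode}} are 
hypothetical names, and the numeric code 45 for this error is stated from 
memory of the protocol's error table, so treat it as an assumption.

```java
import java.util.Map;

public class RemoteErrorSketch {
    // Hypothetical stand-in for an exception rebuilt from a wire error code.
    static class RemoteErrorException extends RuntimeException {
        RemoteErrorException(String message) {
            // writableStackTrace=false: no client-side frames are captured,
            // since the failure did not happen in client code.
            super(message, null, false, false);
        }
    }

    // Assumed code-to-message table; only the one entry relevant here.
    static final Map<Short, String> MESSAGES = Map.of(
        (short) 45, "The broker received an out of order sequence number.");

    // Rebuild an exception from the numeric code found in a response.
    public static RuntimeException fromCode(short code) {
        String message = MESSAGES.getOrDefault(
            code, "Unknown server error (code " + code + ")");
        return new RemoteErrorException(message);
    }

    public static void main(String[] args) {
        RuntimeException e = fromCode((short) 45);
        System.out.println(e.getMessage());
        System.out.println("frames: " + e.getStackTrace().length);
    }
}
```

All the client has is a code and a message, which matches the frameless 
{{Caused by}} entry at the end of the trace below.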

There is a stacktrace for the {{TaskMigratedException}} that wraps the 
{{OutOfOrderSequenceException}}, although I don't think it's particularly 
useful:

{{
org.apache.kafka.streams.errors.TaskMigratedException: Error encountered sending record to topic foo-bar-repartition for task 17_17 due to:
org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
Written offsets would not be recorded and no more records would be sent since the producer is fenced, indicating the task may be migrated out; it means all tasks belonging to this thread should be migrated.
        at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:215)
        at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.lambda$send$0(RecordCollectorImpl.java:196)
        at org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1418)
        at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:273)
        at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:234)
        at org.apache.kafka.clients.producer.internals.ProducerBatch.completeExceptionally(ProducerBatch.java:198)
        at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:758)
        at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:743)
        at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:695)
        at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:634)
        at org.apache.kafka.clients.producer.internals.Sender.lambda$null$1(Sender.java:575)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$2(Sender.java:562)
        at java.base/java.lang.Iterable.forEach(Iterable.java:75)
        at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:562)
        at org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$5(Sender.java:836)
        at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
        at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:583)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:575)
        at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:328)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:243)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.}}

This is why I was looking for a trace on the brokers, which I sadly have not 
yet been able to produce. I think I've fixed my broker logging now, so I'll try 
to re-create the issue and capture a stacktrace on the broker side when I have 
some time.
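
For context on what the broker side is rejecting: the broker log in the 
original report below shows incoming sequence 29 against a current end 
sequence of -1. A minimal sketch of an idempotent-producer sequence check, 
assuming the broker accepts a batch only when its first sequence number 
directly follows the last one it has recorded (with wrap-around at the int 
maximum). This is a simplified model, not the broker's actual code; 
{{SequenceCheckSketch}}, {{inSequence}} and {{NO_SEQUENCE}} are illustrative 
names.

```java
public class SequenceCheckSketch {
    // Assumed sentinel meaning "no sequence recorded for this producer".
    static final int NO_SEQUENCE = -1;

    // True when nextSeq directly follows lastSeq, allowing wrap-around to 0.
    public static boolean inSequence(int lastSeq, int nextSeq) {
        return nextSeq == lastSeq + 1L
            || (nextSeq == 0 && lastSeq == Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // No recorded state: only a batch starting at 0 is in sequence.
        System.out.println(inSequence(NO_SEQUENCE, 0));   // true
        // The case from the report: incoming 29, current end -1.
        System.out.println(inSequence(NO_SEQUENCE, 29));  // false
        // With state intact, 29 would have followed 28 normally.
        System.out.println(inSequence(28, 29));           // true
    }
}
```

Under that assumption, a current end sequence of -1 would mean the broker had 
no recorded sequence for this producer on that partition when the batch 
arrived, which is the state a broker-side trace would need to explain.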

> Streams application fails with OutOfOrderSequenceException after rolling 
> restarts of brokers
> --------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-10635
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10635
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, producer 
>    Affects Versions: 2.5.1
>            Reporter: Peeraya Maetasatidsuk
>            Priority: Blocker
>
> We are upgrading our brokers to version 2.5.1 (from 2.3.1) by performing a 
> rolling restart of the brokers after installing the new version. After the 
> restarts we notice that one of our Streams apps (client version 2.4.1) fails 
> with OutOfOrderSequenceException:
>  
> {code:java}
> ERROR [2020-10-13 22:52:21,400] [com.aaa.bbb.ExceptionHandler] Unexpected error. Record: a_record, destination topic: topic-name-Aggregation-repartition
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
> ERROR [2020-10-13 22:52:21,413] [org.apache.kafka.streams.processor.internals.AssignedTasks] stream-thread [topic-name-StreamThread-1] Failed to commit stream task 1_39 due to the following error: org.apache.kafka.streams.errors.StreamsException: task [1_39] Abort sending since an error caught with a previous record (timestamp 1602654659000) to topic topic-name-Aggregation-repartition due to org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
>         at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:144)
>         at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
>         at org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:204)
>         at org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1348)
>         at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:230)
>         at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:196)
>         at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:730)
>         at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:716)
>         at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:674)
>         at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:596)
>         at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
>         at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:798)
>         at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
>         at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:569)
>         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:561)
>         at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:335)
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:244)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
> {code}
> We see a corresponding error on the broker side:
> {code:java}
> [2020-10-13 22:52:21,398] ERROR [ReplicaManager broker=137636348] Error processing append operation on partition topic-name-Aggregation-repartition-52 (kafka.server.ReplicaManager)
> org.apache.kafka.common.errors.OutOfOrderSequenceException: Out of order sequence number for producerId 2819098 at offset 1156041 in partition topic-name-Aggregation-repartition-52: 29 (incoming seq. number), -1 (current end sequence number)
> {code}
> We are able to reproduce this many times and it happens regardless of whether 
> the broker shutdown (at restart) is clean or unclean. However, when we roll 
> back the broker version to 2.3.1 from 2.5.1 and perform similar rolling 
> restarts, we don't see this error on the streams application at all. This is 
> blocking us from upgrading our broker version. 
>  



