[ 
https://issues.apache.org/jira/browse/ATLAS-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4155:
-----------------------------------
    Attachment: ATLAS-4155-Kafka-commit-supplied-offset.patch

> NotificationHookConsumer: Large Compressed Message Processing Problem
> ---------------------------------------------------------------------
>
>                 Key: ATLAS-4155
>                 URL: https://issues.apache.org/jira/browse/ATLAS-4155
>             Project: Atlas
>          Issue Type: Bug
>            Reporter: Ashutosh Mestry
>            Assignee: Ashutosh Mestry
>            Priority: Major
>         Attachments: ATLAS-4155-Kafka-commit-supplied-offset.patch
>
>
> *Background*
> Notification messages can be large. To work around Kafka's limit on 
> message size, Atlas compresses and splits messages. If a message exceeds the 
> stipulated size threshold, it is compressed. If the compressed message still 
> exceeds the threshold, it is split into multiple messages.
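
The compress-then-split rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name MessageChunker, the use of gzip, and the single maxBytes threshold are hypothetical and are not taken from the Atlas code base, which also attaches metadata to each message that this sketch omits.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; names and the gzip choice are assumptions for illustration.
class MessageChunker {
    // Compresses the payload with gzip.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Cuts data into consecutive parts of at most maxBytes each.
    static List<byte[]> split(byte[] data, int maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        List<byte[]> parts = new ArrayList<>();
        for (int start = 0; start < data.length; start += maxBytes) {
            int end = Math.min(data.length, start + maxBytes);
            parts.add(Arrays.copyOfRange(data, start, end));
        }
        return parts;
    }

    // Small messages pass through, larger ones are compressed,
    // and only if still too large are they split.
    static List<byte[]> prepare(byte[] raw, int maxBytes) {
        List<byte[]> single = new ArrayList<>();
        if (raw.length <= maxBytes) {
            single.add(raw);
            return single;
        }
        byte[] compressed = gzip(raw);
        if (compressed.length <= maxBytes) {
            single.add(compressed);
            return single;
        }
        return split(compressed, maxBytes);
    }
}
```

A real consumer would also need a flag telling it whether a part is compressed and where it sits in the sequence; that bookkeeping is left out here.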
> *Situation*
> Consider a message so large that uncompressing it takes longer than 
> Kafka's timeout for a message. As a result, the offset of the large message 
> is not committed in time, and Kafka presents the same message again.
> Message Description:
> Number of splits: 8
> Compressed message size: 7,452,640
> Uncompressed message size: 520,803,946
> Time taken to uncompress and stitch messages: > 90 seconds
>  
> Sequence:
> 1. 2021-02-10 14:57:24,221: first message received
> 2. 2021-02-10 14:58:36,052: all splits combined – 72 seconds
> 3. 2021-02-10 15:01:06,971: message processing completed – 90 seconds
> 4. 2021-02-10 15:01:17,158: Kafka commit failed. Elapsed time since first 
>    message: 197 seconds
> 5. 2021-02-10 15:01:19,857: attempt #2: first message received
> 6. 2021-02-10 15:03:01,993: attempt #2: all splits combined – 102 seconds
> 7. 2021-02-10 15:04:44,896: attempt #2: Kafka commit failed. Elapsed time since 
>    first message: 205 seconds
> 8. Back to #5
> *Solution*
> Track the last offset received. If the same offset is presented again, 
> commit that offset and move on to the next message.
>  
>  
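
The proposed solution could look something like the sketch below. It is not the attached patch: the class RedeliveryGuard and the process and commit callbacks are hypothetical stand-ins for the consumer's real processing and commit logic. The only Kafka convention assumed is that the committed value is the offset of the next message to read, hence offset + 1.

```java
import java.util.function.LongConsumer;

// Hypothetical helper illustrating the proposal; not taken from the attached patch.
class RedeliveryGuard {
    private long lastOffset = -1L; // no offset processed yet

    /**
     * Handles one record. Returns true if the record was processed,
     * false if it was recognised as a redelivery and only committed.
     */
    boolean handle(long offset, Runnable process, LongConsumer commit) {
        if (offset == lastOffset) {
            // Same offset presented again: the earlier commit presumably
            // failed after processing finished, so commit and move on.
            commit.accept(offset + 1);
            return false;
        }
        process.run();
        lastOffset = offset; // remember it even if the commit below fails
        commit.accept(offset + 1);
        return true;
    }
}
```

Recording the offset only after processing succeeds means a message whose processing failed is still retried on redelivery, while a message whose commit timed out is not processed a second time.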



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
