[ 
https://issues.apache.org/jira/browse/KAFKA-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762138#comment-17762138
 ] 

Kamal Chandraprakash commented on KAFKA-15414:
----------------------------------------------

[~fvisconte] 

Could you please take the latest trunk and try it out? Reopen the ticket if it 
doesn't work. Thanks!

> remote logs get deleted after partition reassignment
> ----------------------------------------------------
>
>                 Key: KAFKA-15414
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15414
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Luke Chen
>            Assignee: Kamal Chandraprakash
>            Priority: Blocker
>             Fix For: 3.6.0
>
>         Attachments: image-2023-08-29-11-12-58-875.png
>
>
> It seems I'm reaching that code path when running reassignments on my 
> cluster, and segments are deleted from the remote store despite a huge 
> retention (topic created a few hours ago with 1000h retention).
> It seems to happen consistently on some partitions when reassigning but not 
> all partitions.
> My test:
> I have a test topic with 30 partitions, configured with 1000h global 
> retention and 2 minutes local retention.
> I have a load tester producing to all partitions evenly.
> I have a consumer load tester consuming that topic.
> I regularly reset offsets to earliest on my consumer to test backfilling from 
> tiered storage.
> My consumer was catching up consuming the backlog and I wanted to upscale my 
> cluster to speed up recovery: I upscaled my cluster from 3 to 12 brokers and 
> reassigned my test topic to all available brokers to have an even 
> leader/follower count per broker.
> When I triggered the reassignment, the consumer lag dropped on some of my 
> topic partitions:
> !image-2023-08-29-11-12-58-875.png|width=800,height=79! Screenshot 2023-08-28 
> at 20 57 09
> Later I tried to reassign my topic back to 3 brokers and the issue happened 
> again.
> Both times in my logs, I've seen a bunch of logs like:
> [RemoteLogManager=10005 partition=uR3O_hk3QRqsn4mPXGFoOw:loadtest11-17] 
> Deleted remote log segment RemoteLogSegmentId
> {topicIdPartition=uR3O_hk3QRqsn4mPXGFoOw:loadtest11-17, 
> id=Mk0chBQrTyKETTawIulQog}
> due to leader epoch cache truncation. Current earliest epoch: 
> EpochEntry(epoch=14, startOffset=46776780), segmentEndOffset: 46437796 and 
> segmentEpochs: [10]
> Looking at my S3 bucket, the segments prior to my reassignment have indeed 
> been deleted.
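
For readers trying to interpret the quoted log line: one plausible reading is that a remote segment becomes eligible for deletion when every leader epoch it contains is older than the earliest epoch still present in the leader epoch cache. The sketch below only illustrates that reading. The class and method names are hypothetical, the condition is an assumption drawn from the log message, and this is not the RemoteLogManager code.

```java
import java.util.List;

// Illustrative sketch only; not the RemoteLogManager implementation.
class EpochTruncationSketch {

    // Hypothetical helper: true when every leader epoch recorded for the
    // segment is older than the earliest epoch still in the leader epoch cache.
    static boolean allEpochsBelowEarliest(int earliestEpoch, List<Integer> segmentEpochs) {
        return segmentEpochs.stream().allMatch(epoch -> epoch < earliestEpoch);
    }

    public static void main(String[] args) {
        // Values taken from the log line in the report:
        // earliest epoch 14, segmentEpochs [10].
        int earliestEpoch = 14;
        List<Integer> segmentEpochs = List.of(10);
        System.out.println(allEpochsBelowEarliest(earliestEpoch, segmentEpochs));
    }
}
```

With the reported values the check passes. Under this reading, that would be consistent with the segment being removed even though the topic retention is 1000h.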



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
