[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention
    [ https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864128#comment-15864128 ]

Andrew Olson commented on KAFKA-1379:
-------------------------------------

[~hachikuji] Jason, could you confirm if this bug has been fixed?

> Partition reassignment resets clock for time-based retention
> ------------------------------------------------------------
>
>                 Key: KAFKA-1379
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1379
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>            Reporter: Joel Koshy
>
> Since retention is driven off mod-times reassigned partitions will result in
> data that has been on a leader to be retained for another full retention
> cycle. E.g., if retention is seven days and you reassign partitions on the
> sixth day then those partitions will remain on the replicas for another
> seven days.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
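The arithmetic in the issue description above can be sketched in a few lines. This is only an illustration of retention measured from a file's modification time, not Kafka's actual implementation; the function name `deletion_time` and the calendar dates are made up for the example.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)  # retention period from the issue's example


def deletion_time(last_modified: datetime, retention: timedelta = RETENTION) -> datetime:
    """Time at which a segment becomes eligible for deletion when
    retention is measured from the file's modification time."""
    return last_modified + retention


# Hypothetical timeline: the data is written on the leader at day 0.
written = datetime(2014, 4, 1)

# On the original replica the segment's mtime is roughly the write time.
original_expiry = deletion_time(written)

# Reassigning on the sixth day copies the segment to a new replica, where
# the file gets a fresh mtime, so the retention clock starts over.
reassigned = written + timedelta(days=6)
new_replica_expiry = deletion_time(reassigned)

print((original_expiry - written).days)     # 7
print((new_replica_expiry - written).days)  # 13
```

Under these assumptions the data lives for 13 days on the new replica instead of 7, which is the extra retention cycle the report describes.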
[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention
    [ https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832155#comment-15832155 ]

Andrew Olson commented on KAFKA-1379:
-------------------------------------

[~jjkoshy] / [~becket_qin] Should this Jira now be closed as a duplicate of KAFKA-3163?

https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-Enforcetimebasedlogretention
[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention
    [ https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310068#comment-15310068 ]

Luca Toscano commented on KAFKA-1379:
-------------------------------------

Hi Moritz,

thanks a lot for pointing us to this Jira in users@. At the moment we use a similar trick, temporarily modifying the per-topic retention.ms, when disk partitions fill up:

https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Kafka/Administration#Temporarily_Modify_Per_Topic_Retention_Settings

I also opened a Phabricator task to track this problem:

https://phabricator.wikimedia.org/T136690

retention.bytes is definitely worth trying, but is there anything else that can mitigate this issue?
[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention
    [ https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301936#comment-15301936 ]

Moritz Siuts commented on KAFKA-1379:
-------------------------------------

From the user mailing list:

{quote}
We’ve recently upgraded to 0.9. In 0.8, when we restarted a broker, data log file mtimes were not changed. In 0.9, any data log file that was on disk before the broker has its mtime modified to the time of the broker restart.
{quote}

A workaround is to set {{retention.bytes}} at the topic level, like this:

{noformat}
./bin/kafka-topics.sh --zookeeper X.X.X.X:2181/kafka --alter --config retention.bytes=500 --topic my_topic
{noformat}

The setting controls the maximum size in bytes of a partition of the specified topic, so you can find a good value by checking the size of a partition with {{du -b}} and using that.
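As a rough sketch of the sizing step in the comment above, the following sums the file sizes in a partition directory to get a candidate value for {{retention.bytes}}. It is an approximation of {{du -b}}, which also counts the directory entries themselves. The helper name `partition_size_bytes` is hypothetical, and the demo runs against a throwaway directory with fake segment files rather than a real broker log directory.

```python
import os
import tempfile


def partition_size_bytes(partition_dir: str) -> int:
    """Sum the apparent sizes of all regular files under partition_dir.

    Similar in spirit to `du -b`, except that the sizes of the directory
    entries themselves are not included.
    """
    total = 0
    for root, _dirs, files in os.walk(partition_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


# Demo: build a fake partition directory (topic "my_topic", partition 0)
# holding a 300-byte log segment and a 200-byte index file.
with tempfile.TemporaryDirectory() as log_dir:
    partition_dir = os.path.join(log_dir, "my_topic-0")
    os.mkdir(partition_dir)
    fake_segments = [
        ("00000000000000000000.log", 300),
        ("00000000000000000000.index", 200),
    ]
    for name, size in fake_segments:
        with open(os.path.join(partition_dir, name), "wb") as f:
            f.write(b"\0" * size)

    size = partition_size_bytes(partition_dir)
    print(size)  # 500
```

On a real broker you would point the helper at one partition directory under the configured log directory and measure it while the partition is at its normal steady-state size, since a partition that has already grown past its usual size would give an inflated value.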
[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention
    [ https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903212#comment-14903212 ]

Xavier Léauté commented on KAFKA-1379:
--------------------------------------

This is a huge issue for us as well, since it requires that we keep double the disk capacity on hand in case one of our brokers or disks fails, which happens relatively often at our scale.

Alternatively, we have to go in and remove expired segments by hand, by comparing replicated segments with the partition leader, before disks run out of space.
[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention
    [ https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334253#comment-14334253 ]

Joel Koshy commented on KAFKA-1379:
-----------------------------------

We have been thinking through various alternatives and this is included in a proposal here:

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Enriched+Message+Metadata
[jira] [Commented] (KAFKA-1379) Partition reassignment resets clock for time-based retention
    [ https://issues.apache.org/jira/browse/KAFKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1438#comment-1438 ]

Moritz Siuts commented on KAFKA-1379:
-------------------------------------

This also happens when a broker dies and loses its data. When the broker comes back without any data, it will use more and more disk space, up to double the normal usage, until retention kicks in and usage drops back to normal.

IMHO this is pretty bad for disaster scenarios, so I would like to see a higher priority on this.