[jira] [Commented] (KAFKA-2303) Fix for KAFKA-2235 LogCleaner offset map overflow causes another compaction failures

2015-07-27 Thread Alexander Demidko (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643136#comment-14643136
 ] 

Alexander Demidko commented on KAFKA-2303:
--

I think in our case we had too many unique keys per partition, so making 
compactions happen more frequently will not ultimately solve the issue.
Increasing the number of partitions should help, but it requires more careful 
planning around the compacted topic's overall data volume.
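The arithmetic behind "increase the partition count" can be sketched as follows. This is not a Kafka API, just back-of-envelope sizing; the total key count is a hypothetical number, and the per-map slot limit is taken from the error message quoted below:

```python
import math

# Rough sizing sketch (hypothetical numbers, not Kafka code): if a
# compacted topic holds more unique keys than one cleaner offset map
# can index, the keys must be spread over more partitions so that
# each partition's key set fits in a single map.

SLOTS_PER_MAP = 80_530_612       # usable offset-map slots, from the error message
total_unique_keys = 500_000_000  # assumed total unique keys in the topic

min_partitions = math.ceil(total_unique_keys / SLOTS_PER_MAP)
print(min_partitions)  # → 7
```

In practice you would also leave headroom for key-count growth, since repartitioning a compacted topic later is disruptive.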

 Fix for KAFKA-2235 LogCleaner offset map overflow causes another compaction 
 failures
 

 Key: KAFKA-2303
 URL: https://issues.apache.org/jira/browse/KAFKA-2303
 Project: Kafka
  Issue Type: Bug
  Components: core, log
Affects Versions: 0.8.2.1
Reporter: Alexander Demidko
Assignee: Jay Kreps
 Fix For: 0.8.3


 We have rolled out the patch for KAFKA-2235 to our kafka cluster, and 
 recently instead of 
 {code}
 kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
 java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
 entry to a full offset map. 
 {code}
 we started to see 
 {code}
 kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
 java.lang.IllegalArgumentException: requirement failed: 131390902 messages in 
 segment topic-name-cgstate-8/79840768.log but offset map can 
 fit only 80530612. You can increase log.cleaner.dedupe.buffer.size or 
 decrease log.cleaner.threads
 {code}
 So, we had to roll it back to avoid disk depletion, although I'm not sure 
 whether it needs to be rolled back in trunk. This patch applies stricter 
 checks than were in place before: even if there is only one unique key in a 
 segment, cleaning will fail if that segment is too big. 
 Does it make sense to eliminate the limit on the offset map slot count, for 
 example by using an offset map backed by a memory-mapped file?
 The limit of 80530612 slots comes from memory / bytesPerEntry, where memory 
 is Int.MaxValue (we use only one cleaner thread) and bytesPerEntry is 8 + 
 digest hash size. I might be wrong, but it seems that if the overall number 
 of unique keys in a partition exceeds the ~80M slots in an OffsetMap, 
 compaction will always fail and the cleaner thread will die. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2303) Fix for KAFKA-2235 LogCleaner offset map overflow causes another compaction failures

2015-07-10 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622767#comment-14622767
 ] 

Jiangjie Qin commented on KAFKA-2303:
-

Would the following help solve the issue?
1. Make log.cleaner.min.cleanable.ratio smaller
2. Make log.cleaner.backoff.ms smaller
This should make each cleaning pass scan fewer records but run more 
frequently. That should help a little bit.
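As a sketch, the two suggestions above would look like this in the broker configuration (the values are illustrative, not recommendations; tune them for your own cluster):

```properties
# Illustrative values only -- tune for your cluster.
# Trigger compaction at a lower dirty ratio (default is 0.5):
log.cleaner.min.cleanable.ratio=0.2
# Re-check for cleanable logs sooner when none are found (default is 15000):
log.cleaner.backoff.ms=5000
```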
