Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 7, 2014, 8:39 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description (updated) --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Merge remote-tracking branch 'origin/trunk' into KAFKA-1670 Diffs (updated) - core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 7, 2014, 12:42 a.m., Jay Kreps wrote: core/src/main/scala/kafka/log/Log.scala, line 502 https://reviews.apache.org/r/26346/diff/3/?file=714142#file714142line502 It is a bit subtle that you are checking for overflow this way. What we mean to check is just that there is sufficient room in the segment for this message, which I think we can do by checking: segment.size config.segmentSize - messagesSize Thanks Jay and Jun for the review and suggesstions. Please check the latest patch. - Sriharsha --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55619 --- On Oct. 7, 2014, 8:39 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 7, 2014, 8:39 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Merge remote-tracking branch 'origin/trunk' into KAFKA-1670 Diffs - core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 7, 2014, 8:49 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description (updated) --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs (updated) - core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55720 --- core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala https://reviews.apache.org/r/26346/#comment96114 We need to add this exception to ErrorMapping and Errors. We also need to add this class to org.apache.kafka.common.errors in the client. core/src/test/scala/unit/kafka/log/LogTest.scala https://reviews.apache.org/r/26346/#comment96104 By increasing the segment size to 100, does the log still roll on every message as indicated by the comment? - Jun Rao On Oct. 7, 2014, 8:49 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 7, 2014, 8:49 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 7, 2014, 11:03 p.m., Jun Rao wrote: core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala, line 20 https://reviews.apache.org/r/26346/diff/5/?file=714846#file714846line20 We need to add this exception to ErrorMapping and Errors. We also need to add this class to org.apache.kafka.common.errors in the client. sorry I missed it On Oct. 7, 2014, 11:03 p.m., Jun Rao wrote: core/src/test/scala/unit/kafka/log/LogTest.scala, line 242 https://reviews.apache.org/r/26346/diff/5/?file=714849#file714849line242 By increasing the segment size to 100, does the log still roll on every message as indicated by the comment? yes it rolls on every messageset. I can add a assert to test that if it required. - Sriharsha --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55720 --- On Oct. 7, 2014, 8:49 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 7, 2014, 8:49 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 7, 2014, 11:03 p.m., Jun Rao wrote: core/src/test/scala/unit/kafka/log/LogTest.scala, line 242 https://reviews.apache.org/r/26346/diff/5/?file=714849#file714849line242 By increasing the segment size to 100, does the log still roll on every message as indicated by the comment? Sriharsha Chintalapani wrote: yes it rolls on every messageset. I can add a assert to test that if it required. Will it? In each append, we add 2 messages with a total of 10 bytes. If we add a 10 byte per message overhead, with compression, it seems both message sets can fit in the same log segment of 100 bytes? - Jun --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55720 --- On Oct. 7, 2014, 8:49 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 7, 2014, 8:49 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 7, 2014, 11:03 p.m., Jun Rao wrote: core/src/test/scala/unit/kafka/log/LogTest.scala, line 242 https://reviews.apache.org/r/26346/diff/5/?file=714849#file714849line242 By increasing the segment size to 100, does the log still roll on every message as indicated by the comment? Sriharsha Chintalapani wrote: yes it rolls on every messageset. I can add a assert to test that if it required. Jun Rao wrote: Will it? In each append, we add 2 messages with a total of 10 bytes. If we add a 10 byte per message overhead, with compression, it seems both message sets can fit in the same log segment of 100 bytes? I am using validMessages.sizeInBytes which is showing the size of ByteBufferedMessageSet new ByteBufferMessageSet(DefaultCompressionCodec, new Message(hello.getBytes), new Message(there.getBytes)) as 83 - Sriharsha --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55720 --- On Oct. 7, 2014, 8:49 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 7, 2014, 8:49 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 7, 2014, 11:03 p.m., Jun Rao wrote: core/src/test/scala/unit/kafka/log/LogTest.scala, line 242 https://reviews.apache.org/r/26346/diff/5/?file=714849#file714849line242 By increasing the segment size to 100, does the log still roll on every message as indicated by the comment? Sriharsha Chintalapani wrote: yes it rolls on every messageset. I can add a assert to test that if it required. Jun Rao wrote: Will it? In each append, we add 2 messages with a total of 10 bytes. If we add a 10 byte per message overhead, with compression, it seems both message sets can fit in the same log segment of 100 bytes? Sriharsha Chintalapani wrote: I am using validMessages.sizeInBytes which is showing the size of ByteBufferedMessageSet new ByteBufferMessageSet(DefaultCompressionCodec, new Message(hello.getBytes), new Message(there.getBytes)) as 83 Great. Then, this is fine. - Jun --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55720 --- On Oct. 7, 2014, 8:49 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 7, 2014, 8:49 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 8, 2014, 1:39 a.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description (updated) --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs (updated) - clients/src/main/java/org/apache/kafka/common/errors/RecordBatchTooLargeException.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/protocol/Errors.java d434f420ad63406ee2a2fde9435762ae027d85f3 core/src/main/scala/kafka/common/ErrorMapping.scala 3fae7910e4ce17bc8325887a046f383e0c151d44 core/src/main/scala/kafka/common/MessageSetSizeTooLargeException.scala PRE-CREATION core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogManagerTest.scala 59bd8a981b3fb8595dd6e790a30071092978a88d core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 9556ed92c61ffee5423be962bcdbe64c71e1f2fa Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 5, 2014, 11:35 p.m., Jun Rao wrote: core/src/main/scala/kafka/log/Log.scala, line 499 https://reviews.apache.org/r/26346/diff/2/?file=713902#file713902line499 Would it be simpler to just do the following? if (segment.size + messageSize (long) config.segmentSize) that would cause it roll everytime a message batch is appended . Lets say the segment.size is 10 and messagesSize is 10 and config.segmentSize is 2147483647 the above check will pass and roll the current segment without reaching the config.segmentSize. - Sriharsha --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55479 --- On Oct. 5, 2014, 3:17 a.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 5, 2014, 3:17 a.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 6, 2014, 4:48 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs (updated) - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 5, 2014, 11:35 p.m., Jun Rao wrote: core/src/test/scala/unit/kafka/log/LogTest.scala, lines 113-129 https://reviews.apache.org/r/26346/diff/2/?file=713903#file713903line113 Appending 2GB of data in a unit test is probably too long. We can probably just manually validate the fix and skip the unit test. The concern I have with manual validation only is that over time with enough of these bugs, we don't any coverage for regression testing. Any chance this can at least be a system test? - Neha --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55479 --- On Oct. 6, 2014, 4:48 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 6, 2014, 4:48 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55526 --- core/src/main/scala/kafka/log/Log.scala https://reviews.apache.org/r/26346/#comment95904 We probably should enforce that a segment is never larger than config.segmentSize. So, if messageSize is larger than config.segmentSize, we should just throw an exception. Once we do that, it seems that we can just use the following check to cover both conditions. if (segment.size + messageSize (long) config.segmentSize We likely need to adjust the comment above accordingly. - Jun Rao On Oct. 6, 2014, 4:48 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 6, 2014, 4:48 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 6, 2014, 5:24 p.m., Jun Rao wrote: core/src/main/scala/kafka/log/Log.scala, line 502 https://reviews.apache.org/r/26346/diff/3/?file=714142#file714142line502 We probably should enforce that a segment is never larger than config.segmentSize. So, if messageSize is larger than config.segmentSize, we should just throw an exception. Once we do that, it seems that we can just use the following check to cover both conditions. if (segment.size + messageSize (long) config.segmentSize We likely need to adjust the comment above accordingly. Sorry I am not able follow . can you please elaborate on this So, if messageSize is larger than config.segmentSize,. Here the issue is not the messageSize is larger than the config.segmentSize. Currently we only roll when segment.size is greater than the config.segmentSize and the edge case here is if the config.segmentSize is Int.MaxValue and the current segment.size is Int.MaxValue - 1 we still wouldn't roll the segment and append the current batch to the same segment and next time we check segment.size is overflown will be negative and still fail to pass the check segment.size config.segmentSize and we keep appending to the same LogSegment. if (segment.size + messageSize (long) config.segmentSize This condition wouldn't work since segment.size is Int and if its value is anywhere closer to Int.MaxValue adding the current messages size will cause it overflown. we can change the above condition to if (segment.size.toLong + messageSize config.segmentSize) and changing the comment to LogSegment will be rolled if segment.size + messagesBatch.size is greater than config.segmentSize. Please let me know if these changes looks good. Thanks. - Sriharsha --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55526 --- On Oct. 6, 2014, 4:48 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 6, 2014, 4:48 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 6, 2014, 5:24 p.m., Jun Rao wrote: core/src/main/scala/kafka/log/Log.scala, line 502 https://reviews.apache.org/r/26346/diff/3/?file=714142#file714142line502 We probably should enforce that a segment is never larger than config.segmentSize. So, if messageSize is larger than config.segmentSize, we should just throw an exception. Once we do that, it seems that we can just use the following check to cover both conditions. if (segment.size + messageSize (long) config.segmentSize We likely need to adjust the comment above accordingly. Sriharsha Chintalapani wrote: Sorry I am not able follow . can you please elaborate on this So, if messageSize is larger than config.segmentSize,. Here the issue is not the messageSize is larger than the config.segmentSize. Currently we only roll when segment.size is greater than the config.segmentSize and the edge case here is if the config.segmentSize is Int.MaxValue and the current segment.size is Int.MaxValue - 1 we still wouldn't roll the segment and append the current batch to the same segment and next time we check segment.size is overflown will be negative and still fail to pass the check segment.size config.segmentSize and we keep appending to the same LogSegment. if (segment.size + messageSize (long) config.segmentSize This condition wouldn't work since segment.size is Int and if its value is anywhere closer to Int.MaxValue adding the current messages size will cause it overflown. we can change the above condition to if (segment.size.toLong + messageSize config.segmentSize) and changing the comment to LogSegment will be rolled if segment.size + messagesBatch.size is greater than config.segmentSize. Please let me know if these changes looks good. Thanks. Yes, my point is that the current implementation of config.segmentSize is a bit misleading. It actually allows a log segment to be larger than config.segmentSize. It's easier to understand if we disallow that. Also, if we never allow a segment to go beyond config.segmentSize, it will never overflow since config.segmentSize is capped at 2G. The following check that you suggested will suffice. if (segment.size.toLong + messageSize config.segmentSize) Then the issue is what happens when the appended message set itself is larger than config.segmentSize. If you just have the above check, you will still allow a segment to have a size larger than config.segmentSize. Another issue is that if the active segment is empty, we will end up rolling a new segment with the same name, which will confuse the broker. In this case, it will be easier to reject the message set by throwing an exception. - Jun --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55526 --- On Oct. 6, 2014, 4:48 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 6, 2014, 4:48 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55619 --- core/src/main/scala/kafka/log/Log.scala https://reviews.apache.org/r/26346/#comment95982 I don't know if this comment is very intuitive. Let's just document that messageSize is the size of the message about to be appended and that this method will roll the log if either (1) the log is full, (2) the max time has elapsed, or (3) the index is full. core/src/main/scala/kafka/log/Log.scala https://reviews.apache.org/r/26346/#comment95983 It is a bit subtle that you are checking for overflow this way. What we mean to check is just that there is sufficient room in the segment for this message, which I think we can do by checking: segment.size config.segmentSize - messagesSize - Jay Kreps On Oct. 6, 2014, 4:48 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 6, 2014, 4:48 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55479 --- Thanks for the patch. Some comments below. core/src/main/scala/kafka/log/Log.scala https://reviews.apache.org/r/26346/#comment95851 This probably should be moved to before line 285 since validMessages may change before we do the append. core/src/main/scala/kafka/log/Log.scala https://reviews.apache.org/r/26346/#comment95850 Would it be simpler to just do the following? if (segment.size + messageSize (long) config.segmentSize) core/src/test/scala/unit/kafka/log/LogTest.scala https://reviews.apache.org/r/26346/#comment95849 Appending 2GB of data in a unit test is probably too long. We can probably just manually validate the fix and skip the unit test. - Jun Rao On Oct. 5, 2014, 3:17 a.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 5, 2014, 3:17 a.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55455 --- Can you try to add a unit test for this? core/src/main/scala/kafka/log/Log.scala https://reviews.apache.org/r/26346/#comment95822 Could you please change the API docs to reflect this change? - Neha Narkhede On Oct. 4, 2014, 10:28 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 4, 2014, 10:28 p.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 5, 2014, 3:17 a.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs (updated) - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani
Re: Review Request 26346: Patch for KAFKA-1670
On Oct. 4, 2014, 11:28 p.m., Neha Narkhede wrote: Can you try to add a unit test for this? I added unit test under LogTest. This test very basic its just adds messages to file until the file reaches Int.Max. This will result in couple of issues first being atleast having 2G+ tmp space and also adds atleast 1 min to test execution latency. - Sriharsha --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/#review55455 --- On Oct. 5, 2014, 3:17 a.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26346/ --- (Updated Oct. 5, 2014, 3:17 a.m.) Review request for kafka. Bugs: KAFKA-1670 https://issues.apache.org/jira/browse/KAFKA-1670 Repository: kafka Description --- KAFKA-1670. Corrupt log files for segment.bytes values close to Int.MaxInt. Diffs - core/src/main/scala/kafka/log/Log.scala 0ddf97bd30311b6039e19abade41d2fbbad2f59b core/src/test/scala/unit/kafka/log/LogTest.scala 577d102fc2eb6bb1a72326141ecd431db6d66f04 Diff: https://reviews.apache.org/r/26346/diff/ Testing --- Thanks, Sriharsha Chintalapani