wj1918 opened a new pull request #8727: URL: https://github.com/apache/kafka/pull/8727
Bug fix only, [JIRA]( https://issues.apache.org/jira/browse/KAFKA-8120) 1. The Outer loop Issue, at Original line 134 of FileStreamSourceTask.java while (readerCopy.ready()) { Since the buffer is used cross multiple poll. This condition missed a case that file has reached EOF, but buffer has un-parsed data. To fix it, change the while loop to if statement, and move the file reading logic to extractLine if (offset < buffer.length && reader.ready()) { Test case: FileStreamSourceTaskTest.testSmallFile 2. The Expanding Buffer Issue, at Original line 140 of FileStreamSourceTask.java if (offset == buffer.length) { char[] newbuf = new char[buffer.length * 2]; System.arraycopy(buffer, 0, newbuf, 0, buffer.length); buffer = newbuf; } For large file, this condition will be always true even expanding buffer is not needed, so every read will trigger expand buffer which will causes Java heap error or NegativeArraySizeException if buffer.length * 2 overflows. To fix it, move the expanding buffer logic to the end of extractLine if (offset == buffer.length && buffer.length < maxBufferSize) { needExtendBuffer = true; } Test case: FileStreamSourceTaskTest.testLargeFile *Summary of testing strategy (including rationale) for the feature or bug fix. Unit and/or integration tests are expected for any behaviour change and system tests should be considered for larger changes.* ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org