[ https://issues.apache.org/jira/browse/KAFKA-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504871#comment-13504871 ]
Swapnil Ghike commented on KAFKA-521:
-------------------------------------

1. I see, thanks for the clarification. If there are multiple compression codecs in the same set, would it make sense to have a precedence order among them to decide which compression codec is used for compressing all the messages together? Right now it seems that the codec of the last compressed message will win.

2. If you are using IntelliJ, you can right-click on the file name in the project structure and click on "Optimize Imports". The unused imports that I see are:

Log:
import kafka.api.OffsetRequest
import java.util.{Comparator, Collections, ArrayList}
import scala.math._
import kafka.server.BrokerTopicStat

LogManager:
import kafka.log.Log._

3. Sure, we can talk.

4. Yes, that was a good catch. It's also less prone to introducing new bugs this way.

I am not super confident about my understanding of the non-Log* part of this patch, so it would be good if someone else could also review that part.

> Refactor Log subsystem
> ----------------------
>
> Key: KAFKA-521
> URL: https://issues.apache.org/jira/browse/KAFKA-521
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jay Kreps
> Attachments: KAFKA-521-v1.patch, KAFKA-521-v2.patch,
> KAFKA-521-v3.patch
>
>
> There are a number of items it would be nice to clean up in the log subsystem:
> 1. Misc. funky APIs in Log and LogManager
> 2. Much of the functionality in Log should move into LogSegment along with
> corresponding tests
> 3. We should remove SegmentList and instead use a ConcurrentSkipListMap
> The general idea of the refactoring falls into two categories. First, improve
> and thoroughly document the public APIs. Second, have a clear delineation of
> responsibility between the various layers:
> 1. LogManager is responsible for the creation and deletion of logs as well as
> the retention of data in log segments. LogManager is the only layer aware of
> partitions and topics.
> LogManager consists of a bunch of individual Log
> instances and interacts with them only through their public API (mostly true
> today).
> 2. Log represents a totally ordered log. Log is responsible for reading,
> appending, and truncating the log. A log consists of a bunch of LogSegments.
> Currently much of the functionality in Log should move into LogSegment with
> Log interacting only through the Log interface. Currently we reach around
> this a lot to call into FileMessageSet and OffsetIndex.
> 3. A LogSegment consists of an OffsetIndex and a FileMessageSet. It supports
> largely the same APIs as Log, but now localized to a single segment.
> This cleanup will simplify testing and debugging because it will make the
> responsibilities and guarantees at each layer more clear.
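
To make the ConcurrentSkipListMap suggestion in the quoted description concrete, here is a minimal sketch of how segments keyed by base offset might be looked up. It is not taken from the patch: `Segment`, `SegmentIndex`, `add`, and `segmentFor` are hypothetical names used only for illustration, and the sketch assumes each segment is identified by the first offset it contains.

```scala
import java.util.concurrent.ConcurrentSkipListMap

// Hypothetical stand-in for a log segment; only the base offset matters here.
final case class Segment(baseOffset: Long)

// Hypothetical wrapper for illustration, not Kafka's actual Log class.
class SegmentIndex {
  // Segments keyed by base offset. The skip list keeps keys sorted and
  // supports concurrent reads and writes without external locking.
  private val segments = new ConcurrentSkipListMap[java.lang.Long, Segment]()

  // Register a segment under its base offset.
  def add(segment: Segment): Unit = {
    segments.put(java.lang.Long.valueOf(segment.baseOffset), segment)
  }

  // The segment that would hold `offset` is the one with the greatest
  // base offset less than or equal to it; floorEntry returns null if none.
  def segmentFor(offset: Long): Option[Segment] =
    Option(segments.floorEntry(java.lang.Long.valueOf(offset))).map(_.getValue)
}
```

With segments at base offsets 0 and 100, `segmentFor(50)` would return the segment based at 0, and an offset below every base offset yields `None`. The sorted-map lookup stands in for the kind of search a hand-rolled segment list would otherwise need.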