Yes, after KAFKA-506 <https://issues.apache.org/jira/browse/KAFKA-506> I would like to do a little additional work on Log and LogManager. This could be post-0.8. The todo list is to support multiple data directories, get rid of the custom copy-on-write SegmentList and replace it with an off-the-shelf ConcurrentSkipListMap, and refactor a few APIs to be nicer.
Filed KAFKA-521 (https://issues.apache.org/jira/browse/KAFKA-521) to cover that last one. -Jay On Tue, Sep 18, 2012 at 10:22 AM, Neha Narkhede <neha.narkh...@gmail.com>wrote: > +1 > > On Tue, Sep 18, 2012 at 10:17 AM, Jun Rao <jun...@gmail.com> wrote: > > Makes sense. Could you file a jira to track this? > > > > Thanks, > > > > Jun > > > > On Tue, Sep 18, 2012 at 9:22 AM, Jay Kreps <jay.kr...@gmail.com> wrote: > > > >> Hey Guys, > >> > >> Just a friendly (and hopefully not too annoying) reminder about code > >> layering. We have a couple of different systems that need to remain > >> independent and not know about each other. Some examples of this are the > >> network layer, the log system, the replication system, the controller > code, > >> etc. Various pieces of API-specific logic sit on top of these layers and > >> integrate them into the functionality the system provides. The goal is > to > >> make these lower layers have the cleanest, most logical, and most > general > >> possible APIs that wrap up all the complicated stuff that they do. > >> > >> An easy way to tell if we are doing the right thing is to think about > how > >> well the lowest levels of code makes sense if picked up and used on its > own > >> without the rest of kafka. > >> > >> Here is an example where we ended up breaking that layering: > >> Log.scala: > >> def getOffsetsBefore(request: OffsetRequest): Array[Long] > >> > >> We clearly didn't get the layering right here either. The method takes a > >> OffsetRequest object, but the Log shouldn't know anything about the API > >> layer. The implementation of the method is a bunch of logic very > specific > >> to some of the oddities of this API. > >> > >> The right way to implement this would be to add a more general API to > get > >> LogSegments from the log. Then in the API layer add all the specific > types > >> of filtering and munging needed to fullfill the contract of the API. > What I > >> have seen is that when we don't focus on this we end up with lots and > lots > >> of odd APIs and it tends to make the code really hard to evolve or to > >> reason about what it is that the subsystem does. > >> > >> This is not a major problem, but being a little more rigorous at these > >> layers helps keep the code beautiful. :-) > >> > >> Cheers, > >> > >> -Jay > >> >