Hey Guys,

Just a friendly (and hopefully not too annoying) reminder about code
layering. We have a couple of different systems that need to remain
independent and not know about each other. Some examples of this are the
network layer, the log system, the replication system, the controller code,
etc. Various pieces of API-specific logic sit on top of these layers and
integrate them into the functionality the system provides. The goal is to
make these lower layers have the cleanest, most logical, and most general
possible APIs that wrap up all the complicated stuff that they do.

An easy way to tell if we are doing the right thing is to think about how
well the lowest levels of code makes sense if picked up and used on its own
without the rest of kafka.

Here is an example where we ended up breaking that layering:
  Log.scala:
    def getOffsetsBefore(request: OffsetRequest): Array[Long]

We clearly didn't get the layering right here either. The method takes a
OffsetRequest object, but the Log shouldn't know anything about the API
layer. The implementation of the method is a bunch of logic very specific
to some of the oddities of this API.

The right way to implement this would be to add a more general API to get
LogSegments from the log. Then in the API layer add all the specific types
of filtering and munging needed to fullfill the contract of the API. What I
have seen is that when we don't focus on this we end up with lots and lots
of odd APIs and it tends to make the code really hard to evolve or to
reason about what it is that the subsystem does.

This is not a major problem, but being a little more rigorous at these
layers helps keep the code beautiful. :-)

Cheers,

-Jay

Reply via email to