[
https://issues.apache.org/jira/browse/KAFKA-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333387#comment-14333387
]
Joel Koshy commented on KAFKA-1755:
-----------------------------------
I thought a bit more about this and here is a patch that summarizes my thoughts.
This patch does message validation on arrival, and drops unkeyed messages
during log compaction.

I actually think it is better to reject invalid messages (unkeyed and, for now,
compressed) up front, as opposed to accepting them and only dropping/warning
during compaction. This way the producer gets an early indication, via a
client-side error, that it is doing something wrong, which is better than just
a broker-side warning or invalid-message metric. We still need to deal with
unkeyed messages that may already be in the log, but I think that is
orthogonal. This includes the case where a non-compacted topic is changed to
be compacted. That is perhaps an invalid operation (ideally you should delete
the topic before doing that), but in any event this patch handles that case by
deleting invalid messages during log compaction.
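
To make the two behaviours concrete, here is a minimal Python sketch. It is not the attached patch and not Kafka's actual code; `Message`, `InvalidMessageError`, `validate_on_arrival` and `compact` are hypothetical names used only for illustration. It assumes a compacted topic rejects unkeyed or compressed messages on arrival, and that compaction keeps the latest message per key while silently dropping any unkeyed messages already in the log.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Message:
    # Hypothetical in-memory stand-in for a log message.
    key: Optional[bytes]
    value: Optional[bytes]
    compressed: bool = False


class InvalidMessageError(Exception):
    """Hypothetical error surfaced to the producer on rejection."""


def validate_on_arrival(messages: List[Message], compacted: bool) -> None:
    """Reject the whole batch up front if the topic is compacted and any
    message is unkeyed or compressed. Non-compacted topics accept anything."""
    if not compacted:
        return
    for i, msg in enumerate(messages):
        if msg.key is None:
            raise InvalidMessageError(f"message {i}: compacted topic requires a key")
        if msg.compressed:
            raise InvalidMessageError(f"message {i}: compressed messages are rejected for now")


def compact(log: List[Message]) -> Tuple[List[Message], int]:
    """Keep only the latest message per key. Unkeyed messages that are
    already in the log are dropped and counted instead of raising."""
    latest: Dict[bytes, int] = {}
    dropped = 0
    for offset, msg in enumerate(log):
        if msg.key is None:
            dropped += 1
        else:
            latest[msg.key] = offset
    retained = [
        msg for offset, msg in enumerate(log)
        if msg.key is not None and latest[msg.key] == offset
    ]
    return retained, dropped
```

The count returned by `compact` stands in for the broker-side metric mentioned above; the exception raised by `validate_on_arrival` stands in for the client-side error.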

Case in point: at LinkedIn we use Kafka-based offset management for some of our
consumers. We recently discovered compressed messages in the offsets topic,
which caused the log cleaner to quit. We saw this issue in the past with Samza
checkpoint topics and suspected that Samza was doing something wrong. However,
after seeing it in the __consumer_offsets topic, it is more likely to be an
actual bug in the broker: either in the log cleaner itself or at the
lower-level byte-buffer message set API. We currently do not know. If we at
least reject invalid messages on arrival, we can rule out clients as the
source of the issue.
> Improve error handling in log cleaner
> -------------------------------------
>
> Key: KAFKA-1755
> URL: https://issues.apache.org/jira/browse/KAFKA-1755
> Project: Kafka
> Issue Type: Bug
> Reporter: Joel Koshy
> Assignee: Joel Koshy
> Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1755.patch
>
>
> The log cleaner is a critical process when using compacted topics.
> However, if there is any error in any topic (notably if a key is missing)
> then the cleaner exits and all other compacted topics will also be adversely
> affected - i.e., compaction stops across the board.
> This can be improved by aborting compaction only for the affected topic on any
> error and keeping the thread from exiting.
> Another improvement would be to reject messages without keys that are sent to
> compacted topics, although this is not enough by itself.
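
The per-topic isolation described in the issue could look roughly like the following Python sketch. This is an illustration under stated assumptions, not the log cleaner's real structure: `clean_all` and the `clean_one` callback are hypothetical names, and the real cleaner works on log segments rather than topic names.

```python
from typing import Callable, Dict, Iterable


def clean_all(topics: Iterable[str],
              clean_one: Callable[[str], None]) -> Dict[str, Exception]:
    """Run compaction for each topic in turn. A failure aborts compaction
    for that topic only; the loop (i.e. the cleaner thread) keeps going."""
    failed: Dict[str, Exception] = {}
    for topic in topics:
        try:
            clean_one(topic)
        except Exception as exc:
            # Record the error for this topic and move on to the next one
            # instead of letting the exception end the whole cleaner.
            failed[topic] = exc
    return failed
```

The returned mapping gives the caller something to log or report, so a topic that fails to compact is still visible even though the thread survives.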