Thanks.
I read up a bit on whether we can read the topic config with the
KafkaConsumer, but I could not find any method that allows that. Also, we
would have to check the global defaults first and then the per-topic
overrides (as KafkaIO can subscribe to multiple topics).
I did a quick scan of the Kafka command line code [1] and it looks like
all of this is done via Zookeeper. As far as I can see, Zookeeper isn't
currently supported as a config source for KafkaIO, so it's probably
not worth implementing something around it.
[1]
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ConfigCommand.scala#L101
On 06/28/2017 9:54 PM, Raghu Angadi wrote:
Fixing it in https://github.com/apache/beam/pull/3461.
Thanks for reporting the issue.
On Wed, Jun 28, 2017 at 8:37 AM, Raghu Angadi <[email protected]> wrote:
Hi Elmar,
You are right. We should not log this at all when the gaps are expected, as
you pointed out. I don't think the client can check whether compaction is
enabled for a topic through the Consumer API.
I think we should remove the log. The user can't really act on it other
than reporting it. I will send a PR.
As a temporary workaround you can disable logging for a particular class
on the worker with the --workerLogLevelOverrides
<https://cloud.google.com/dataflow/pipelines/logging> option. But this
would also suppress the rest of the logging from the reader.
Raghu
On Wed, Jun 28, 2017 at 4:12 AM, Elmar Weber <[email protected]> wrote:
Hello,
I'm testing the KafkaIO with Google Cloud dataflow and getting warnings
when working with compacted logs. In the code there is a relevant check:
https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L1158
    // sanity check
    if (offset != expected) {
      LOG.warn("{}: gap in offsets for {} at {}. {} records missing.",
          this, pState.topicPartition, expected, offset - expected);
    }
From what I understand, this can happen when log compaction is enabled,
because an entry can get cleaned up by Kafka once it is superseded by a
newer one. In this case, shouldn't this be an info log, and/or a warning
only when log compaction is disabled for the topic?
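
To make the suggestion concrete, here is a minimal sketch of what I have in
mind. It is not the KafkaIO code: the class OffsetGapCheck and the
compactionEnabled flag are made up for illustration, the flag is assumed to be
known by the caller somehow, and it uses java.util.logging levels instead of
the slf4j logger so that it runs on its own.

```java
import java.util.logging.Level;

// Hypothetical helper for illustration only; not part of KafkaIO.
class OffsetGapCheck {

  // Number of records skipped between the expected and the actual offset.
  static long missingRecords(long expected, long offset) {
    return offset - expected;
  }

  // Level at which a gap would be reported. compactionEnabled is an assumed
  // input: how the reader would learn it is left open here.
  static Level levelFor(long expected, long offset, boolean compactionEnabled) {
    if (offset == expected) {
      return Level.OFF; // no gap, nothing to report
    }
    // Gaps are expected on compacted topics, so only warn when compaction is off.
    return compactionEnabled ? Level.INFO : Level.WARNING;
  }

  public static void main(String[] args) {
    long expected = 10;
    long offset = 13;
    System.out.println(levelFor(expected, offset, true)
        + ": " + missingRecords(expected, offset) + " records missing");
  }
}
```

With a check like this, a gap on a compacted topic would show up at info
level, and the warning would be kept for topics where no gap is expected.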
I'm still debugging some stuff because the pipeline also stops reading on
compacted logs. I'm not sure if this is related or could also be an issue
with my Kafka test installation, but as far as I understand the gaps are
expected behaviour with log compaction enabled.
Thanks,
Elmar