Did you enable debug logging? You can disable debug logging. What it implies is that this KafkaIO reader hasn't read any records yet (because your topics are idle) and does not know what to return for watermark. In this case Flink runner is asking the reader for watermark.
We want to improve KafkaIO to handle default watermark better when it is caught up. In that case this logging would not be triggered. On Thu, Sep 8, 2016 at 3:31 PM, amir bahmanyari <[email protected]> wrote: > This is where its happening in KafkaIO() class in beam.sdk source code . > @Override > public Instant getWatermark() { > if (*curRecord == **nul*l) { > LOG.debug("{}: getWatermark() : *no records have been read yet."**, name* > ); > return initialWatermark; > } > return source.watermarkFn.isPresent() > ? source.watermarkFn.get().apply(curRecord) : curTimestamp; > } > > Its checking curRecord == null...Does this mean Kafka is shooting blank? A > neglected hiccup checking in KafkaIO & Kafka sender? > Can someone respond pls? > I am afraid I am losing tuples & thats critical to my whole work I am > doing on bench-marking Beam in a Flink Cluster.. > Thanks+regards > Amir- > ------------------------------ > *From:* amir bahmanyari <[email protected]> > *To:* "[email protected]" <[email protected]> > *Sent:* Thursday, September 8, 2016 2:56 PM > *Subject:* KafkaIO() "no records have been read yet." Warning > > Hi Colleagues, > I am running a Beam app on a 4-nodes Flink Cluster while receiving data > from a single Kafka server. > In all nodes flink-abahman-taskmanager-0-beam2.log file, I see > continuesly see this Warning message as data is processed: > 2016-09-08 21:52:26,523 WARN org.apache.beam.sdk.io.kafka.KafkaIO > - Reader-19: getWatermark() : no records have been read > yet. > > What is this? Does it mean maybe losing tuples? > How can I enhance the KafkaIO() call to not seeing this Warning? > Thanks+regards > Amir- > > >
