Thanks Raghu. Could triggering this message be because I send data to Kafka with random delays in between? In other words, if I send a burst (no delays in between sending), then this message should not come up, right? I'd rather keep debug on just in case I notice something unusual like this and can ask the forum what's going on.
On another note, all nodes report this message in their *.log files, implying that all servers in the 4-node cluster are processing the incoming Kafka data. But I get System.out logs (*.out) on only one server, while *.log files (JM, TMs) are generated on all nodes. It's just the *.out that gets generated on only one node. Is this the default behavior? Is there anywhere I can configure this so I get corresponding *.out logs on all nodes rather than just one? I understand this may be a Flink cluster question, but I thought I'd give it a shot :))
Thanks very much Raghu.
Cheers,
Amir
From: Raghu Angadi <[email protected]>
To: [email protected]; amir bahmanyari <[email protected]>
Sent: Thursday, September 8, 2016 3:46 PM
Subject: Re: KafkaIO() "no records have been read yet." Warning
Did you enable debug logging? You can disable debug logging.
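If you want to keep debug logging on generally but silence just this message, one way (a sketch, assuming a log4j.properties-based setup like Flink's default, and using the logger name visible in the log output) is to raise the level for the KafkaIO logger only:

```properties
# Raise only the KafkaIO logger above DEBUG so this particular
# "no records have been read yet" message is suppressed.
# Logger name taken from the log output in this thread.
log4j.logger.org.apache.beam.sdk.io.kafka.KafkaIO=INFO
```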
What it implies is that this KafkaIO reader hasn't read any records yet
(because your topics are idle) and does not know what to return for watermark.
In this case Flink runner is asking the reader for watermark.
We want to improve KafkaIO to handle the default watermark better when the reader is caught up. In that case this logging would not be triggered.
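To illustrate the behavior described above, here is a simplified, self-contained sketch (not the actual Beam class; the name SimplifiedKafkaReader and its methods are made up for illustration) of why a reader on an idle topic can only report its initial watermark until the first record arrives:

```java
import java.time.Instant;

// Simplified stand-in for KafkaIO's reader, showing only the
// watermark fallback logic discussed in this thread.
class SimplifiedKafkaReader {
    private final Instant initialWatermark;
    private Instant curTimestamp;       // timestamp of the last record read
    private boolean hasReadRecord = false;

    SimplifiedKafkaReader(Instant initialWatermark) {
        this.initialWatermark = initialWatermark;
    }

    // Called when a record is read from the (no longer idle) topic.
    void receiveRecord(Instant recordTimestamp) {
        this.curTimestamp = recordTimestamp;
        this.hasReadRecord = true;
    }

    Instant getWatermark() {
        if (!hasReadRecord) {
            // This is the branch that produces the
            // "no records have been read yet" debug message:
            // with no record seen, the reader has nothing better
            // to report than its initial watermark.
            return initialWatermark;
        }
        return curTimestamp;
    }
}
```

So the message is informational: the runner polled for a watermark before any record had been read, and the reader fell back to its initial value.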
On Thu, Sep 8, 2016 at 3:31 PM, amir bahmanyari <[email protected]> wrote:
This is where it's happening, in the KafkaIO class in the Beam SDK source code:
  @Override
  public Instant getWatermark() {
    if (curRecord == null) {
      LOG.debug("{}: getWatermark() : no records have been read yet.", name);
      return initialWatermark;
    }

    return source.watermarkFn.isPresent()
        ? source.watermarkFn.get().apply(curRecord) : curTimestamp;
  }
It's checking curRecord == null... Does this mean Kafka is shooting blanks? A neglected hiccup in the checking between KafkaIO and the Kafka sender? Can someone respond please? I am afraid I am losing tuples, and that's critical to my whole work benchmarking Beam on a Flink cluster.
Thanks + regards,
Amir

From: amir bahmanyari <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Thursday, September 8, 2016 2:56 PM
Subject: KafkaIO() "no records have been read yet." Warning
Hi Colleagues,
I am running a Beam app on a 4-node Flink cluster while receiving data from a single Kafka server. In the flink-abahman-taskmanager-0-beam2.log file on all nodes, I continuously see this warning message as data is processed:

2016-09-08 21:52:26,523 WARN  org.apache.beam.sdk.io.kafka.KafkaIO - Reader-19: getWatermark() : no records have been read yet.

What is this? Does it mean I may be losing tuples? How can I change the KafkaIO() call so that I don't see this warning?
Thanks + regards,
Amir