I have a Kafka client written in Java running in Kubernetes, and Kafka running 
in Kubernetes.

When the client is running but no Kafka nodes are running, it appears from the 
exception below that the DNS lookup fails, then something catches the 
exception, logs it, and retries, apparently without returning or throwing from 
poll().

This would all be fair enough ... except that the retry happens every few 
milliseconds, causing a large stack trace to be logged every few milliseconds, 
which, if this keeps going for a few days, eats up an awful lot of space in the 
cloud logging system. And it *can* keep happening for days or weeks in a 
development environment, because a developer working on another part of the 
system may not care, or even know, that this part is broken.

What can I do to reduce the volume of logging data? Some combination of the 
following interventions might be helpful:

  *   Retry less quickly than every few milliseconds
  *   Retry a finite number of times before giving up altogether
  *   Cause poll() to throw rather than retry
  *   Not include the stack trace in the log messages

The general approach to K8s applications seems to be that if a dependency 
doesn't exist the client application should simply crash out, so that 
Kubernetes' backoff and retry mechanism will do what's wanted. In that case, 
might some way of getting poll() to throw rather than swallow this exception 
be the answer?
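
For illustration, here is a minimal sketch of the first intervention (retrying 
less quickly), assuming the retry loop in question honours the documented 
consumer settings reconnect.backoff.ms and reconnect.backoff.max.ms. I have not 
verified that the DNS failure path in kafka-clients 2.0.0 does so. The class 
name ConsumerBackoffConfig and the chosen values are hypothetical.

```java
import java.util.Properties;

public class ConsumerBackoffConfig {

    /**
     * Builds consumer properties with a slower reconnect backoff.
     * The values are illustrative assumptions, not recommendations.
     */
    public static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Base wait before reconnecting to a broker that failed to connect.
        props.setProperty("reconnect.backoff.ms", "1000");
        // Upper bound for the exponentially growing reconnect backoff.
        props.setProperty("reconnect.backoff.max.ms", "60000");
        // Wait before retrying a failed request, such as a metadata fetch.
        props.setProperty("retry.backoff.ms", "1000");
        return props;
    }
}
```

The resulting Properties would be passed to the KafkaConsumer constructor 
together with the usual deserializer settings. This addresses only the retry 
rate; it does not make poll() throw or remove the stack trace from the log.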

Error connecting to node confluent-0.confluent.mynamespace.svc.cluster.local:9091 (id: 0 rack: null)
java.io.IOException: Can't resolve address: confluent-0.confluent.mynamespace.svc.cluster.local:9091
              at org.apache.kafka.common.network.Selector.doConnect(Selector.java:235) ~[kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.common.network.Selector.connect(Selector.java:214) ~[kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:864) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient.access$700(NetworkClient.java:64) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:1035) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:920) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:508) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:271) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:243) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:314) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1218) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1181) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115) [kafka-clients-2.0.0.jar:?]
              at com.origamienergy.etpu.nodes.md.MetrologyWriteWorker.run(MetrologyWriteWorker.java:51) [tiger-v2.5-0-g0fd0b8f.jar:?]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
              at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: java.nio.channels.UnresolvedAddressException
              at sun.nio.ch.Net.checkAddress(Net.java:101) ~[?:1.8.0_212]
              at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622) ~[?:1.8.0_212]
              at org.apache.kafka.common.network.Selector.doConnect(Selector.java:233) ~[kafka-clients-2.0.0.jar:?]
              ... 19 more
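
Since the root cause in the trace is an UnresolvedAddressException, one 
possible way to get the "crash out and let Kubernetes back off" behaviour is 
for the application itself to check DNS and give up after a fixed number of 
failures. The sketch below is only that, a sketch: BrokerDnsGuard, its Resolver 
hook, and the failure threshold are all hypothetical names of my own, not part 
of the Kafka client.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class BrokerDnsGuard {

    /** Hypothetical hook so the DNS lookup can be swapped out in tests. */
    public interface Resolver {
        void resolve(String host) throws UnknownHostException;
    }

    private final Resolver resolver;
    private final int maxFailures;
    private int consecutiveFailures = 0;

    public BrokerDnsGuard(Resolver resolver, int maxFailures) {
        this.resolver = resolver;
        this.maxFailures = maxFailures;
    }

    /** Convenience factory that resolves names through the JVM's normal DNS lookup. */
    public static BrokerDnsGuard usingSystemDns(int maxFailures) {
        return new BrokerDnsGuard(host -> InetAddress.getByName(host), maxFailures);
    }

    /**
     * Tries to resolve the host. A success resets the failure count; once
     * maxFailures consecutive lookups have failed, throws so the caller can
     * exit and leave the restart backoff to Kubernetes.
     */
    public void check(String host) {
        try {
            resolver.resolve(host);
            consecutiveFailures = 0;
        } catch (UnknownHostException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= maxFailures) {
                throw new IllegalStateException(
                        "Cannot resolve " + host + " after "
                                + consecutiveFailures + " attempts", e);
            }
        }
    }

    public int consecutiveFailures() {
        return consecutiveFailures;
    }
}
```

The worker loop would call check() with the broker host name before each call 
to poll() and let the IllegalStateException terminate the process. This only 
helps if poll() actually returns between checks, which I have not confirmed 
for the code path in the trace above.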

Tim Ward
