I can tell from the terminology you use that you are familiar with traditional message queue products. Kafka is very different. That's what makes it so interesting and revolutionary, in my opinion.
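As a rough illustration of that difference, here is a toy in-memory sketch. It is not Kafka's API, and every class and method name in it is made up for the example. It contrasts a traditional queue, where receiving a message removes it, with an append-only log, where each consumer only advances its own read offset and the record stays put.

```python
from collections import deque


class ToyQueue:
    """Traditional queue: consuming a message removes it."""

    def __init__(self):
        self._items = deque()

    def send(self, msg):
        self._items.append(msg)

    def receive(self):
        # Once one consumer takes the message, it is gone for everyone.
        return self._items.popleft() if self._items else None


class ToyLog:
    """Append-only log: consuming copies a record and advances an offset."""

    def __init__(self):
        self._records = []  # reads never remove records
        self._offsets = {}  # per-consumer read position (hypothetical bookkeeping)

    def append(self, msg):
        self._records.append(msg)
        return len(self._records) - 1  # offset of the new record

    def poll(self, consumer):
        pos = self._offsets.get(consumer, 0)
        if pos >= len(self._records):
            return None  # this consumer has read everything so far
        self._offsets[consumer] = pos + 1
        return self._records[pos]

    def seek(self, consumer, offset):
        # Move the read position so records can be read again.
        self._offsets[consumer] = offset


queue = ToyQueue()
queue.send("order-1")
first_queue_read = queue.receive()
second_queue_read = queue.receive()  # None: the message was removed

log = ToyLog()
log.append("order-1")
read_by_a = log.poll("consumer-a")
read_by_b = log.poll("consumer-b")  # a second consumer still sees the record
log.seek("consumer-a", 0)
reread_by_a = log.poll("consumer-a")  # the first consumer can read it again
```

In this sketch the log itself never records who read what; only the per-consumer offsets change, which is why "where did this message go" has no single answer in a log-based system.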
Clients do not connect to topics as such, because Kafka is a distributed, clustered system: topics are sharded into pieces called partitions, and the topic partitions are spread out across all the Kafka brokers in the cluster (and also replicated several more times across the cluster for fault tolerance). When a client logically connects to a topic, it's actually making many connections to many nodes in the Kafka cluster, which enables both parallel processing and fault tolerance.

Also, when a client consumes a message, the message is not removed from a queue; it remains in Kafka for many days (sometimes months or years). It is not “taken off the queue”; it is rather “copied from the commit log”. It can be consumed again and again if needed because it is an immutable record of an event that happened.

Now, getting back to your question of how to see where messages get consumed (copied): the reality is that they go many places and can be consumed many times. This makes tracing and tracking message delivery more difficult but not impossible. There are many tools, both open source and commercial, that can track data from producer to Kafka (with replication) to multiple consumers. They typically take telemetry from both the clients (producers and consumers) and the brokers (all of them, as they act as a cluster) and aggregate all the data to show the full flow of messages in the system. That's why the logs may seem overwhelming, and why you need to look at the logs of all the brokers (and perhaps all the clients as well) to get the full picture.

-hans

> On Mar 28, 2020, at 4:50 PM, Colin Ross <rossi...@gmail.com> wrote:
>
> Hi All - just started to use Kafka. Just one thing driving me nuts. I want
> to get logs of each time a publisher or subscriber connects. I am trying to
> just get the IP that they connected from and the topic to which they
> connected. I have managed to do this through enabling debug in the
> kafka-authorizer, however, the number of logs are overwhelming as is the
> update rate (looks like 2 per second per client).
>
> What I am actually trying to achieve is to understand where messages go, so
> I would be more than happy to just see notifications when messages are
> actually sent and actually taken off the queue.
>
> Is there a more efficient way of achieving my goal than turning on debug?
>
> Cheers
> Rossi