I can tell from the terminology you use that you are familiar with traditional 
message queue products. Kafka is very different. That's what makes it so 
interesting and revolutionary, in my opinion.

Clients do not connect to topics, because Kafka is a distributed, clustered 
system in which topics are sharded into pieces called partitions. The topic 
partitions are spread out across all the Kafka brokers in the cluster (and are 
also replicated several more times across the cluster for fault tolerance). 
When a client logically connects to a topic, it's actually making many 
connections to many nodes in the Kafka cluster, which enables both parallel 
processing and fault tolerance.
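
To make the partition layout concrete, here is a toy sketch in Python. It is 
not Kafka's real placement algorithm or client API; the function names, the 
broker names, and the simple round-robin placement are all assumptions made 
for illustration. It only shows why a client of a single topic ends up talking 
to several brokers.

```python
# Toy model only: NOT Kafka's actual replica placement or client API.
# assign_replicas, brokers_to_contact and the broker names are hypothetical.

def assign_replicas(num_partitions, brokers, replication_factor):
    """Place each partition's replicas round-robin; the first one is the leader."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    layout = {}
    for partition in range(num_partitions):
        layout[partition] = [
            brokers[(partition + r) % len(brokers)]
            for r in range(replication_factor)
        ]
    return layout


def brokers_to_contact(layout):
    """Leaders a client would need to reach to cover every partition."""
    return {replicas[0] for replicas in layout.values()}


layout = assign_replicas(6, ["broker-1", "broker-2", "broker-3"], 2)
for partition, replicas in layout.items():
    print(f"partition {partition}: leader={replicas[0]} followers={replicas[1:]}")
print("client connects to:", sorted(brokers_to_contact(layout)))
```

Even in this small model, one logical "connection to a topic" fans out to all 
three brokers.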

Also, when a client consumes a message, the message is not removed from a 
queue; it remains in Kafka for many days (sometimes months or years). It is not 
“taken off the queue”; rather, it is “copied from the commit log”. It can be 
consumed again and again if needed, because it is an immutable record of an 
event that happened.
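
The difference between a queue and a commit log can be sketched in a few lines 
of Python. This is a simplified in-memory model, not the Kafka client API; the 
PartitionLog class, the poll helper, and the group names are hypothetical. The 
point is that reading only advances a per-consumer offset and never deletes 
anything.

```python
# Toy model only: an in-memory append-only log, NOT the Kafka client API.
# PartitionLog, poll and the group names are hypothetical.

class PartitionLog:
    """Append-only list of records; reads never remove anything."""

    def __init__(self):
        self._records = []

    def append(self, value):
        self._records.append(value)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset, max_records=10):
        return self._records[offset:offset + max_records]

    def __len__(self):
        return len(self._records)


log = PartitionLog()
for event in ("order-created", "order-paid", "order-shipped"):
    log.append(event)

# Each consumer group tracks its own position in the log.
offsets = {"billing": 0, "audit": 0}


def poll(group):
    records = log.read(offsets[group])
    offsets[group] += len(records)
    return records


first = poll("billing")   # billing reads all three records
second = poll("audit")    # audit independently reads the same three
offsets["billing"] = 0    # rewind billing to the start
replay = poll("billing")  # and it reads them all over again
print(first, second, replay, len(log))
```

After all three reads the log still holds every record, which is why a single 
message can legitimately show up at many consumers, many times.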

Now, getting back to your question of how to see where messages get consumed 
(copied): the reality is that they go many places and can be consumed many 
times. This makes tracing and tracking message delivery more difficult, but not 
impossible. There are many tools, both open source and commercial, that can 
track data from producer to Kafka (with replication) to multiple consumers. 
They typically involve taking telemetry from both clients (producers and 
consumers) and brokers (all of them, as they act as a cluster) and aggregating 
all the data to see the full flow of messages in the system. That's why the 
logs may seem overwhelming, and why you need to look at the logs of all the 
brokers (and perhaps all the clients as well) to get the full picture.
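
As a rough illustration of that aggregation step, here is a small Python 
sketch. The event records, their field names, and the build_traces function 
are invented for this example and do not match the log format of any real 
tool or of Kafka itself. It assumes every component can report the topic, 
partition, and offset of the message it handled, and it joins the events on 
that key.

```python
# Toy model only: the event format and build_traces are hypothetical and do
# not correspond to any real Kafka log format or tracing tool.
from collections import defaultdict

producer_events = [
    {"ts": 1, "source": "producer-A", "action": "sent",
     "topic": "orders", "partition": 0, "offset": 42},
]
broker_events = [
    {"ts": 2, "source": "broker-1", "action": "appended",
     "topic": "orders", "partition": 0, "offset": 42},
    {"ts": 3, "source": "broker-2", "action": "replicated",
     "topic": "orders", "partition": 0, "offset": 42},
]
consumer_events = [
    {"ts": 5, "source": "consumer-billing", "action": "consumed",
     "topic": "orders", "partition": 0, "offset": 42},
    {"ts": 4, "source": "consumer-audit", "action": "consumed",
     "topic": "orders", "partition": 0, "offset": 42},
]


def build_traces(*event_streams):
    """Group events by (topic, partition, offset) and order them by time."""
    traces = defaultdict(list)
    for stream in event_streams:
        for event in stream:
            key = (event["topic"], event["partition"], event["offset"])
            traces[key].append(event)
    for events in traces.values():
        events.sort(key=lambda e: e["ts"])
    return dict(traces)


traces = build_traces(producer_events, broker_events, consumer_events)
path = [(e["source"], e["action"]) for e in traces[("orders", 0, 42)]]
for source, action in path:
    print(f"{source}: {action}")
```

No single log contains the whole path; it only appears once the events from 
the producer, every broker, and every consumer are merged.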

-hans 

> On Mar 28, 2020, at 4:50 PM, Colin Ross <rossi...@gmail.com> wrote:
> 
> Hi All - just started to use Kafka. Just one thing driving me nuts. I want
> to get logs of each time a publisher or subscriber connects. I am trying to
> just get the IP that they connected from and the topic to which they
> connected. I have managed to do this through enabling debug in the
> kafka-authorizer, however, the number of logs are overwhelming as is the
> update rate (looks like 2 per second per client).
> 
> What I am actually trying to achieve is to understand where messages go, so
> I would be more than happy to just see notifications when messages are
> actually sent and actually taken off the queue.
> 
> Is there a more efficient way of achieving my goal than turning on debug?
> 
> Cheers
> Rossi
