Publish/subscribe systems can be divided into two categories: 1) the topic-based model and 2) the content-based model. The content-based model gives more precise results than the topic-based one, because subscribers express interest in the content of a message rather than subscribing to a topic and receiving every message published to it.
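To make the difference concrete, here is a minimal in-memory sketch in plain Python. No Kafka is involved, and the message fields (`topic`, `body`) and function names are made up purely for illustration.

```python
# Toy messages; the "topic" and "body" fields are illustrative assumptions.
messages = [
    {"topic": "sports", "body": "Cricket final postponed"},
    {"topic": "sports", "body": "Football transfer news"},
    {"topic": "sports", "body": "Tennis open starts today"},
    {"topic": "news", "body": "Rain delays the cricket final"},
]


def topic_based(msgs, topic):
    """Topic-based model: deliver every message published to the topic."""
    return [m for m in msgs if m["topic"] == topic]


def content_based(msgs, predicate):
    """Content-based model: deliver only messages whose content matches."""
    return [m for m in msgs if predicate(m)]


# A subscriber who only cares about cricket.
by_topic = topic_based(messages, "sports")
by_content = content_based(messages, lambda m: "cricket" in m["body"].lower())

print(len(by_topic), "messages via topic subscription")
print(len(by_content), "messages via content subscription")
```

The topic subscriber also receives the football and tennis messages, while the content subscriber receives only the cricket ones, including the one published under a different topic.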
Kafka uses the topic-based subscription model, so I have been thinking about extending the Kafka framework to support the content-based model. I have come up with two ideas for this enhancement.

The first is to leave Kafka unmodified and add a separate layer between the Kafka broker and the subscriber. This layer would filter the messages coming from the producers according to interests that the subscribers specify externally (for example, by a string match/search).

The second is to extract the keywords from each message on the producer side and attach the keyword list to the message as a header before sending it. Any POS tagger could be used for the extraction. Subscribers would then enter their interests externally, and those interests would be checked against the message header only, without analyzing the entire message. If there is a match, the message is delivered to the subscriber; otherwise it is rejected.

The second method will usually take more time than the first.

Are there any other ideas for implementing this content-based enhancement efficiently? Or are there any optimizations that could be applied to the architectures proposed above?

Thanks.

Regards,
Janagan S.
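P.S. In case it helps the discussion, here is a rough sketch of the two ideas in plain Python, with no Kafka client involved. Everything in it is an assumption for illustration: the record layout, the function names, and above all `extract_keywords`, which is a simple stopword filter standing in for a real POS tagger.

```python
import re

# Tiny illustrative stopword list; a real POS tagger would replace this.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "is", "and", "for", "on"}


def extract_keywords(text):
    """Stand-in for a POS tagger: lowercase words minus stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def filter_by_body(body, interests):
    """Idea 1: the filter layer scans the whole message body."""
    lowered = body.lower()
    return any(term in lowered for term in interests)


def attach_keywords(body):
    """Idea 2, producer side: attach the keyword set as a header."""
    return {"headers": {"keywords": extract_keywords(body)}, "body": body}


def filter_by_header(record, interests):
    """Idea 2, subscriber side: match interests against the header only."""
    return not record["headers"]["keywords"].isdisjoint(interests)


# Interests are assumed to be lowercase single words.
interests = {"cricket", "kafka"}
bodies = [
    "Rain delays the cricket final",
    "Football transfer news",
    "Kafka release notes for the broker",
]

via_body = [b for b in bodies if filter_by_body(b, interests)]
records = [attach_keywords(b) for b in bodies]
via_header = [r["body"] for r in records if filter_by_header(r, interests)]

print(via_body)
print(via_header)
```

In this sketch the header check is a set intersection on whole words, while the body check is a substring search, so the two can disagree on inputs such as "cricketer". Kafka records have supported headers since version 0.11, so the keyword list could in principle travel as a record header, but how to encode it there is left open.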