Ok what you are describing is different from accidental duplicate message pruning which is what the idempotent publish feature does.
You are describing a situation were multiple independent messages just happen to have the same contents (both key and value). Removing those messages is an application specific function as you can imaging applications which would not want independent but identical messages to be removed (for example temperature sensor readings, heartbeat messages, or other telemetry data that has repeat but independent values). Your best bet is to write a simple intermediate processor that implements your message pruning algorithm of choice and republishes (or not) to another topic that your consumers read from. Its a stateful app because it needs to remember 1 or more past messages but that can be done using the Kafka Streams processor API and the embedded rocksdb state store that comes with Kafka Streams (or as a UDF in KSQL). You can alternatively write your consuming apps to implement similar message pruning functionality themselves and avoid one extra component in the end to end architecture -hans > On Apr 2, 2019, at 7:28 PM, jim.me...@concept-solutions.com > <jim.me...@concept-solutions.com> wrote: > > > >> On 2019/04/02 22:43:31, jim.me...@concept-solutions.com >> <jim.me...@concept-solutions.com> wrote: >> >> >>> On 2019/04/02 22:25:16, jim.me...@concept-solutions.com >>> <jim.me...@concept-solutions.com> wrote: >>> >>> >>>> On 2019/04/02 21:59:21, Hans Jespersen <h...@confluent.io> wrote: >>>> yes. Idempotent publish uses a unique messageID to discard potential >>>> duplicate messages caused by failure conditions when publishing. >>>> >>>> -hans >>>> >>>>> On Apr 1, 2019, at 9:49 PM, jim.me...@concept-solutions.com >>>>> <jim.me...@concept-solutions.com> wrote: >>>>> >>>>> Does Kafka have something that behaves like a unique key so a producer >>>>> can’t write the same value to a topic twice? >>> >>> Hi Hans, >>> >>> Is there some documentation or an example with source code where I can >>> learn more about this feature and how it is implemented? >>> >>> Thanks, >>> Jim >> >> By the way I tried this... >> echo "key1:value1" | ~/kafka/bin/kafka-console-producer.sh --broker-list >> localhost:9092 --topic TestTopic --property "parse.key=true" --property >> "key.separator=:" --property "enable.idempotence=true" > /dev/null >> >> And... that didn't seem to do the trick - after running that command >> multiple times I did receive key1 value1 for as many times as I had run the >> prior command. >> >> Maybe it is the way I am setting the flags... >> Recently I saw that someone did this... >> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test >> --producer-property enable.idempotence=true --request-required-acks -1 > > Also... the reason for my question is that we are going to have two JMS > topics with nearly redundant data in them have the UNION written to Kafka for > further processing. >