Thank you, Steve.  The group ID on all the consumers is set to the same value.  I've since upgraded from 1.16.3 to 1.18 and have not yet re-run the test on our 3-node cluster to see whether we still get duplicate messages.  With 1.16.3, I saw duplicate messages when running the consumers on all nodes.  After switching to just the primary node, I no longer saw duplicates.

-joe

On 11/21/2022 4:18 AM, stephen.hindmarch.bt.com via users wrote:

Joe,

Are you using a consumer group ID? A consumer group lets all consumers for the same application know about each other’s activity (which topics, partitions and offsets have most recently been consumed by which consumer) and so avoid conflicts and duplication. If you are not using a consumer group, then having every node in a cluster consume the same data is in fact correct behaviour from a Kafka viewpoint.
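
The effect of a shared group ID can be illustrated with a toy model in Python. This is only a sketch: the function `assign_partitions`, the node names and the group names are hypothetical, and the round-robin split is a simplification of what Kafka's real partition assignors do.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers_by_group):
    """Toy model: within each group, split partitions round-robin among members.

    Every group sees every partition; inside a group, each partition goes to
    exactly one member. Kafka's real assignors are more elaborate.
    """
    assignment = defaultdict(list)
    for group, members in consumers_by_group.items():
        for i, partition in enumerate(sorted(partitions)):
            member = members[i % len(members)]
            assignment[member].append(partition)
    return dict(assignment)


partitions = [0, 1, 2]

# Three nodes sharing one group ID: each partition is read by exactly one node.
shared = assign_partitions(partitions, {"nifi-app": ["node1", "node2", "node3"]})

# Three nodes, each with its own group ID: every node reads every partition.
separate = assign_partitions(
    partitions, {"g1": ["node1"], "g2": ["node2"], "g3": ["node3"]}
)

print(shared)    # {'node1': [0], 'node2': [1], 'node3': [2]}
print(separate)  # {'node1': [0, 1, 2], 'node2': [0, 1, 2], 'node3': [0, 1, 2]}
```

In this model, a topic with a single partition and one shared group ID would leave only one node with anything to read.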

*Steve Hindmarch*

*From:*Joe Obernberger <joseph.obernber...@gmail.com>
*Sent:* 19 November 2022 18:33
*To:* users@nifi.apache.org; Aian Cantabrana <acantabr...@zylk.net>; Joe Witt <joe.w...@gmail.com>
*Subject:* Re: Exacly once from NiFi to Kafka

Are you by chance using a clustered NiFi?  I'm seeing duplicate messages if I run the consumer on multiple NiFi nodes, so I've started running the consumer only on the primary node.  This seems to correct the issue, but leads to other problems.  I'd love a solution.

-Joe

On 11/16/2022 3:50 AM, Aian Cantabrana wrote:

    Hi Joe,

    Thanks for the reply. The actual flow sends data from the
    ConsumeAMQP processor to two different PublishKafka processors,
    one with idempotence enabled and the other with the default
    config. Each one sends the same data to its own topic, and I check
    for duplicates by comparing the two topics. It seems to be random:
    sometimes they appear in the "normal" processor's topic and other
    times in the "idempotence" one. I have not found any pattern.
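
A check like the topic comparison described above could be sketched as follows in Python. The message payloads and the helper `find_duplicates` are made up for illustration, and the sketch assumes every payload is unique at the source, so any repeat counts as a duplicate.

```python
from collections import Counter


def find_duplicates(messages):
    """Return {message: count} for every message that appears more than once."""
    counts = Counter(messages)
    return {msg: n for msg, n in counts.items() if n > 1}


# Hypothetical contents of the two topics, already read into lists of strings.
normal_topic = ['{"id": 1}', '{"id": 2}', '{"id": 2}', '{"id": 3}']
idempotent_topic = ['{"id": 1}', '{"id": 2}', '{"id": 3}', '{"id": 3}']

print(find_duplicates(normal_topic))      # {'{"id": 2}': 2}
print(find_duplicates(idempotent_topic))  # {'{"id": 3}': 2}
```

If the source can legitimately emit identical payloads, the comparison would need a unique key per message instead of the raw text.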

    I will upgrade to NiFi 1.18.0 and try again.

    In any case, the messages are JSON (one JSON document per
    flowfile), but since I am sending and storing them in Kafka as
    plain text I am using the /non-record-oriented/ Kafka publisher.
    Is PublishKafkaRecord more reliable? Would it be better to use it?

    Thanks,

    Aian

    ------------------------------------------------------------------------

    *From: *"Joe Witt" <joe.w...@gmail.com>
    *To: *"users" <users@nifi.apache.org>
    *Sent: *Tuesday, 15 November 2022 17:31:54
    *Subject: *Re: Exacly once from NiFi to Kafka

    Aian,

    How can you tell there are duplicates in Kafka and are you certain
    that no duplicates exist in the source topic?

    Given NiFi's data provenance capabilities you should be able to
    pinpoint a given duplicate and figure out whether it happened at
    the source, in NiFi, or elsewhere.

    Note much has changed/improved since the 1.12.x line of NiFi so we
    have more Kafka components and record oriented mechanisms.  But
    still pretty sure even in your version we should not be
    duplicating data unless the flow is configured such that it would
    happen.

    Thanks

    On Tue, Nov 15, 2022 at 9:25 AM Aian Cantabrana
    <acantabr...@zylk.net> wrote:

        Hi,

        I am having some difficulty getting /*exactly-once*/
        semantics while preserving data order from NiFi to Kafka. I
        have read the Kafka documentation, and it should be quite
        straightforward using an idempotent producer from NiFi and a
        Kafka topic with a single partition, but I am still getting
        some duplicated messages in Kafka.
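
For context, a toy Python model of what an idempotent producer does and does not deduplicate. It assumes the broker tracks a sequence number per producer and drops anything it has already appended. The class `ToyPartitionLog` and its fields are hypothetical; this is a simplification, not Kafka's actual implementation.

```python
class ToyPartitionLog:
    """Toy broker-side log for one partition with per-producer sequence tracking."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer_id -> highest sequence number appended

    def append(self, producer_id, seq, value):
        """Append value unless this (producer_id, seq) was already written."""
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # retry of an already-appended record: dropped
        self.last_seq[producer_id] = seq
        self.records.append(value)
        return True


log = ToyPartitionLog()
first = log.append("p1", 0, "a")   # original send
retry = log.append("p1", 0, "a")   # client retry with the same sequence: dropped
resend = log.append("p1", 1, "a")  # application sends "a" again as a new record

print(first, retry, resend)  # True False True
print(log.records)           # ['a', 'a']
```

In this model only retries of the same send are dropped. If the sending application hands the same payload to the producer a second time (for example after a failure upstream), it gets a new sequence number and is stored again, which may be one way duplicates appear even with idempotence enabled.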

        NiFi version: 1.12.1

        Kafka version: 2.7.0

        NiFi flow:

        (Both queues shown use the FIFO prioritizer)

        PublishKafka_2_6 configuration:

        As I said, the target Kafka topic has just one partition to
        ensure data order.

        Incoming flowfiles are small, 60-byte messages.

        I have been working on this for a while, so any suggestion is
        very welcome.

        Thanks in advance,

        Aian
