Glad to hear it

On Mon, Aug 19, 2024 at 12:24 PM Chirthani, Deepak Reddy <[email protected]> wrote:
> This issue has been resolved. The Kafka consumer was working just fine.
> NiFi was using an updated MongoURI to load messages, while the frontend
> tool used by customers to query MongoDB was not. Thank you, Joe, for
> helping me. Appreciate it.
>
> *From:* Joe Witt <[email protected]>
> *Sent:* Thursday, August 15, 2024 11:55 AM
> *To:* [email protected]
> *Subject:* Re: [EXTERNAL] Re: ConsumeKafka_2_6 Processor issue
>
> CAUTION: The e-mail below is from an external source. Please exercise
> caution before opening attachments, clicking links, or following guidance.
>
> I suspect error conditions hit in one case and not the other.
>
> There are a ton of variables to consider. But one you have total control
> over is the flow design and the evaluation method.
>
> Thanks
>
> On Thu, Aug 15, 2024 at 9:49 AM Chirthani, Deepak Reddy
> <[email protected]> wrote:
>
> Joe,
>
> Ok, understood. Since you are confident that the messages are being 100%
> consumed in both cases (original flow and duplicate flow), how can
> messages end up not reaching the PutMongoRecord processor in the original
> dataflow if the flow logic is the same in both dataflows (same processors
> and same connections)?
>
> *From:* Joe Witt <[email protected]>
> *Sent:* Thursday, August 15, 2024 11:29 AM
> *To:* [email protected]
> *Subject:* Re: [EXTERNAL] Re: ConsumeKafka_2_6 Processor issue
>
> Got it - yep, focusing on the flow design itself is where I'd go for now.
> Consider every config that could, in the case of errors/unexpected data,
> allow things to leave the flow. Using provenance data can be valuable in
> helping review this. Add things into the flow that extract metadata that
> provenance can index, and then when you find a missing event you could
> likely search for it.
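[Editor's note] Once each flowfile carries an indexed attribute such as an event id, as suggested above, hunting a missing event largely reduces to a set difference between the ids known to have been published and the ids that actually landed in the target collection. A minimal, dependency-free sketch with hypothetical id lists (the ids and sources are made up for illustration):

```python
def find_missing_events(published_ids, landed_ids):
    """Return event ids that were published but never reached the target.

    published_ids: ids observed on the Kafka topic (e.g. gathered from
    provenance RECEIVE events or a topic dump); landed_ids: ids queried
    back from the Mongo collection. Both inputs are hypothetical here.
    """
    return sorted(set(published_ids) - set(landed_ids))

# Hypothetical data: three events published, one never landed.
published = ["evt-100", "evt-101", "evt-102"]
landed = ["evt-100", "evt-102"]

print(find_missing_events(published, landed))  # -> ['evt-101']
```

With the missing ids in hand, a provenance search on the indexed attribute should show the last processor each missing flowfile visited, which is usually where it was routed away from the success path.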
> There are a lot of techniques to help hunt such things down, but in the
> vast majority of cases it is a relationship routing to terminate, or
> something other than ensuring it ends up going to the destination.
>
> Thanks
>
> On Thu, Aug 15, 2024 at 9:20 AM Chirthani, Deepak Reddy
> <[email protected]> wrote:
>
> Hi Joe,
>
> Both the original and the duplicate flows are the same except for two
> changes: the Kafka group id and the target Mongo collection name. Each
> processor in the dataflow has all the relationships other than success
> connected to a funnel. Therefore, I should see data if any processor is
> routing to a relationship other than success, which I don't see in either
> flow.
>
> I agree with you that it's generally unlikely there is data loss between
> Kafka and NiFi, but in this case I can clearly see it by querying both
> collections with the same eventid/transactionid.
>
> *From:* Joe Witt <[email protected]>
> *Sent:* Thursday, August 15, 2024 11:11 AM
> *To:* [email protected]
> *Subject:* [EXTERNAL] Re: ConsumeKafka_2_6 Processor issue
>
> Hello
>
> The most likely scenario at play here is that the configuration of the
> flow results in certain messages/events/flowfiles being routed to a
> failure path or some path that does not end up in Mongo. It is highly
> unlikely there is loss between Kafka and NiFi or between NiFi and Mongo.
> The more likely scenario is a configuration within the flow in NiFi which
> directs certain data, in certain conditions, to be thrown out.
>
> Have you reviewed every possible relationship and how it is handled in
> the flow?
> Thanks
>
> On Thu, Aug 15, 2024 at 8:56 AM Chirthani, Deepak Reddy
> <[email protected]> wrote:
>
> Hi guys,
>
> I have a dataflow in a NiFi 3-node clustered environment reading from a
> Kafka topic and writing to a MongoDB collection *target1*. We are not
> filtering any messages in the dataflow. The group id for this consumer is
> *test1*, and this consumer has been active for *two years*.
>
> For at least a month now, the business customers have been reporting to
> us that they are missing data in the target which they are sure they
> published. The Kafka team even helped us find the message(s) on the Kafka
> brokers that the customers were sure they had published, so it is evident
> that the consumer is not picking them up.
>
> I then set up a new dataflow in NiFi duplicating the original dataflow,
> with two differences: a new group id *test2* for reading messages and a
> new target collection *target2* for writing the data. This duplicate
> dataflow is consuming the expected number of messages.
>
> Next, I changed the group id in the original dataflow from *test1* to
> *newgrouped*, and the rest of the ConsumeKafka processor configuration
> remains the same, including the offset reset, which is latest. Both the
> original and duplicate dataflows have been running for quite some time,
> but the issue still exists with the original dataflow. The duplicate
> dataflow keeps doing well, consuming the expected number of messages,
> parsing, and loading the processed data to the target.
>
> Please advise what the issue could be and how to resolve it.
>
> Additional notes:
>
> NiFi version: 1.21.0
> Number of concurrent threads on the ConsumeKafka processor: 2
> Number of Kafka partitions on the topic: 5
>
> Thanks
>
> The contents of this e-mail message and any attachments are intended
> solely for the addressee(s) and may contain confidential and/or legally
> privileged information.
> If you are not the intended recipient of this message or if this message
> has been addressed to you in error, please immediately alert the sender
> by reply e-mail and then delete this message and any attachments. If you
> are not the intended recipient, you are notified that any use,
> dissemination, distribution, copying, or storage of this message or any
> attachment is strictly prohibited.
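[Editor's note] The root cause reported at the top of the thread was two components pointing at different MongoURIs: NiFi loading data via an updated URI while the customer-facing query tool still used the old one. A quick, dependency-free way to catch that class of mismatch is to parse and compare the endpoints the two components are configured with. The URIs below are made up for illustration, and the parser is a sketch: real connection strings (mongodb+srv://, replica-set host lists, options) should go through the driver's own parser.

```python
from urllib.parse import urlsplit

def mongo_endpoint(uri):
    """Extract (host:port, database) from a simple mongodb:// URI."""
    parts = urlsplit(uri)
    return parts.netloc, parts.path.lstrip("/")

# Hypothetical URIs: what NiFi writes to vs. what the frontend reads from.
writer_uri = "mongodb://mongo-new.example.com:27017/events"
reader_uri = "mongodb://mongo-old.example.com:27017/events"

if mongo_endpoint(writer_uri) != mongo_endpoint(reader_uri):
    # Writer and reader disagree on host; this reproduces the mismatch
    # the thread eventually diagnosed.
    print("endpoint mismatch:",
          mongo_endpoint(writer_uri), "vs", mongo_endpoint(reader_uri))
```

Comparing configured endpoints like this early on would have ruled the consumer in or out before any Kafka-side investigation.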
