Mark,

Yes, it was the FlowFile repository.

Of all your points, large attributes are most likely our issue.  One of
our folks was caching the FlowFile content (which can occasionally be
large) in an attribute ahead of a DB lookup (which overwrites the
content), then reinstating the content after merging in the DB lookup
results.

The attribute was not being removed after the merge. We added a couple of
steps this morning to remove it, but even its brief presence may be enough
to cause the spikes.
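
For anyone following the thread, the interim fix is nothing fancier than
an UpdateAttribute right after the merge step. The "Delete Attributes
Expression" property is the real one; the attribute name below is just a
stand-in for ours:

    UpdateAttribute (immediately after the merge)
      Delete Attributes Expression: cached\.content

Since that property takes a regular expression, the dot in the attribute
name has to be escaped.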

I have since attached a very large disk and I can see the
occasional spikes:

[image: disk usage graph showing the occasional spikes]
At 22% of a 512G disk, that is over 110G.  What isn't clear is why it
does not spike consistently.

We have made some changes to how long the attribute lives and will
monitor over the next couple of days, but we will likely need to cache the
contents somewhere else and retrieve them later unless someone knows of a
better solution here.
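
In case it helps anyone else hitting this, the shape we are considering
is the stock cache processors rather than anything custom. The processor
and property names here are real, but the ${uuid} key and the wiring are
just a sketch of the idea, not a tested flow:

    PutDistributedMapCache
      Cache Entry Identifier: ${uuid}
      Distributed Cache Service: <a DistributedMapCacheClientService>
            |
            v
    <DB lookup that overwrites the FlowFile content>
            |
            v
    FetchDistributedMapCache
      Cache Entry Identifier: ${uuid}
      Put Cache Value In Attribute: <left empty, so the cached value is
          written back to the FlowFile content rather than an attribute>

Keying on ${uuid} assumes the same FlowFile survives the lookup; if the
lookup step emits a new FlowFile, we would stash the key in a small
attribute first.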

Thanks for the guidance.

Dave


On Fri, Mar 8, 2024 at 7:05 AM Mark Payne <marka...@hotmail.com> wrote:

> Dave,
>
> When you say that the journal files are huge, I presume you mean the
> FlowFile repository?
>
> There are generally 4 things that can cause this:
> - OutOfMemoryError causing the FlowFile repo not to properly checkpoint
> - Out of Disk Space causing the FlowFile repo not to properly checkpoint
> - Out of open file handles causing the FlowFile repo not to properly
> checkpoint
> - Creating a lot of huge attributes on your FlowFiles.
>
> The first 3 situations can be identified by looking for errors in the logs.
> For the fourth one, you need to understand whether or not you’re creating
> huge FlowFile attributes. Generally, attributes should be very small -
> 100-200 characters or less, ideally. It’s possible that you have a flow
> that creates huge attributes but the flow is only running on the Primary
> Node, and Node 2 is your Primary Node, which would cause this to occur only
> on this node.
>
> Thanks
> -Mark
>
>
> > On Mar 7, 2024, at 9:24 PM, David Early via users <users@nifi.apache.org>
> wrote:
> >
> > I have a massive issue: I have a 2-node cluster (using 5 external
> > ZooKeepers on other boxes), and for some reason on node 2 I have MASSIVE
> > journal files.
> >
> > I am round-robining data between the nodes, but for some reason node 2
> > just fills up.  This is the second time this has happened this week.
> >
> > What should I do?  nifi.properties is the same on both systems (except
> > for local host names).
> >
> > Any ideas of what might be causing one node to overload?
> >
> > Dave

-- 
David Early, Ph.D.
david.ea...@grokstream.com
720-470-7460 Cell
