Re: NiFi cluster goes 100% CPU in no time

Shanker Sneh Mon, 10 Jun 2019 05:45:20 -0700

Thanks Joe for reading through and helping me. :)


   - NiFi hasn't been upgraded. It's 1.8.0 (the community version of
   Hortonworks DataFlow).
   - OS/kernel is the same. I have only added more disk capacity (with
   better IO).
   - JVM continues to be the same: Java 8.
   - When CPU is at 100%, top shows just the NiFi java process. When I
   provided more cores (as high as 16), NiFi used all 16 cores and peaked
   at 1600%.
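
Since top points only at the NiFi java process, a common next step is to find
which threads inside that JVM are burning CPU. The sketch below is a minimal
illustration, not a NiFi tool: it assumes you have taken a thread ID from
`top -H -p <pid>` and saved a thread dump from `jstack <pid>`. The helper
names and the sample dump (including its thread names) are hypothetical. It
relies on jstack printing each thread's native ID as a hexadecimal `nid`.

```python
import re
from typing import Optional


def tid_to_nid(tid: int) -> str:
    """Convert a decimal Linux thread ID (as shown by `top -H`)
    to the hexadecimal form jstack prints after `nid=`."""
    return hex(tid)


def find_thread(jstack_text: str, tid: int) -> Optional[str]:
    """Return the header line of the thread whose nid matches tid,
    or None if the dump contains no such thread."""
    pattern = re.compile(r"\bnid=" + re.escape(tid_to_nid(tid)) + r"\b")
    for line in jstack_text.splitlines():
        # Thread header lines in a jstack dump start with the quoted name.
        if line.startswith('"') and pattern.search(line):
            return line
    return None


# Hypothetical excerpt of a Java 8 thread dump, for illustration only.
SAMPLE_DUMP = '''
"Timer-Driven Process Thread-3" #52 daemon prio=5 os_prio=0 tid=0x00007f10 nid=0x1a2b runnable [0x00007f20]
   java.lang.Thread.State: RUNNABLE
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f11 nid=0x1a01 runnable
'''

if __name__ == "__main__":
    # 6699 stands in for a thread ID copied from `top -H`.
    print(find_thread(SAMPLE_DUMP, 6699))
```

If the hottest threads turn out to be GC threads rather than processor
threads, that would point at heap pressure rather than at a specific
processor in the flow.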


Meanwhile, I am trying to clear up all FlowFiles from disk and start the
flows afresh.


On Mon, Jun 10, 2019 at 5:42 PM Joe Witt <joe.w...@gmail.com> wrote:

> Sneh
>
> It was stable for months but now is high...
>
> has nifi been upgraded?  what version before vs now?
>
> has the os/kernel been changed?
>
> has the jvm been updated?
>
> when cpu is 100 what does top show?
>
> thanks
>
> On Mon, Jun 10, 2019, 7:59 AM Shanker Sneh <shanker.s...@zoomcar.com>
> wrote:
>
>> Thanks for the suggestions Joe.
>> Actually the issue persists even after reverting to the
>> 'older-regular-incremental-load' of the data flow* (which had worked
>> fine for months on similarly configured hardware, until a few days ago,
>> utilising just ~50% of resources)*.
>>
>> These days, one node of the 2-node cluster drops out of the NiFi cluster
>> every now and then as the CPU peaks at 100% on that machine. Subsequently
>> the other node reaches 100% CPU too.
>> When I restart NiFi on a particular node, CPU drops to 0 and then spikes
>> to 100% within a few minutes, even though the data flowing through the
>> pipeline is *far too little* to saturate my CPU.
>>
>> The machine config and NiFi config remain untouched, which has left me
>> confused about where the problem might be. Something which had been
>> running smoothly for months has become a challenge now.
>>
>> On Fri, Jun 7, 2019 at 8:16 PM Joe Witt <joe.w...@gmail.com> wrote:
>>
>>> Shanker
>>>
>>> It sounds like you've gone through some changes in general and have
>>> worked through those.  Now you have a flow running with a high volume of
>>> data (history load) and want to know which parts of the flow are most
>>> expensive/consuming the CPU.
>>>
>>> You should be able to look at the statistics provided on the processors
>>> to see where the majority of CPU time is spent.  You can usually reason
>>> about this quite easily, for example if it is doing
>>> compression/encryption/etc., and determine whether to give it more
>>> threads, fewer threads, batch data together better, etc.
>>>
>>> The configuration of the VMs, the NiFi instance itself, the flow, and
>>> the nature of the data are all important to see/understand to be of much
>>> help here.
>>>
>>> Thanks
>>>
>>> On Fri, Jun 7, 2019 at 7:07 AM Shanker Sneh <shanker.s...@zoomcar.com>
>>> wrote:
>>>
>>>> Hello all,
>>>>
>>>> I am facing a strange issue with NiFi 1.8.0 (2 nodes).
>>>> My flows had been running fine for months.
>>>>
>>>> Yesterday I had to do a history load, which filled up both my disks
>>>> (I have the FlowFile repository on a separate disk).
>>>>
>>>> I increased the size of both the root and FlowFile disks, then grew the
>>>> disk partition and extended the file system (it's an EC2 Linux instance).
>>>> But since then my CPU has been spiking to a complete 100%, even at
>>>> regular load (earlier it used to be somewhere around 50%).
>>>> Also, I made no changes to the config values, thread count, etc.
>>>>
>>>> I upgraded the 2 nodes to see if that solves the problem, from a 16 GB
>>>> box (4 cores) to a 64 GB box (16 cores).
>>>> But even the larger box is maxing out the CPU at 100%.
>>>>
>>>> I tried clearing all repositories and restarting both the NiFi
>>>> application and the EC2 instance, but there was no improvement.
>>>>
>>>> Kindly point me in the right direction. I am unable to pinpoint
>>>> anything.
>>>>
>>>> --
>>>> Best,
>>>> Sneh
>>>>
>>>
>>
>> --
>> Best,
>> Sneh
>>
>

-- 
Best,
Sneh
