Brandon DeVries created NIFI-3077:
-------------------------------------

             Summary: wali partitions can grow unevenly, cause issues
                 Key: NIFI-3077
                 URL: https://issues.apache.org/jira/browse/NIFI-3077
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework
    Affects Versions: 0.7.1, 1.0.0
            Reporter: Brandon DeVries
            Priority: Minor


In the past we have observed instances where one partition in a FlowFile repo 
can grow much larger than the others.  This in itself may not be an issue, but 
whenever we have seen it, it has been while investigating a failure of some 
sort.  In any case, we have run into this scenario again while pursuing 
another issue, and now have a way to replicate the behavior in question.  

Basically, if you have a FlowFile with a large attribute and then split / clone 
that FlowFile, all of the new entries end up in the same partition [1].  So, 
for example, a 100 KB attribute (which is admittedly a questionable situation 
in itself) split / cloned 1000 times will result in a 100 MB partition.  I 
don't have a terribly scientific way of describing it, but this generally seems 
to make NiFi upset.  We should maybe look into ways of splitting records among 
multiple partitions if the records are over some size threshold...
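
To make the arithmetic concrete, here is a rough simulation of the behavior 
described above. It is only a sketch, not the wali code: it assumes each 
repository update writes its whole batch of records to a single partition 
chosen round-robin, and the function name, the 16-partition count, and the 
`threshold` parameter are all made up for illustration. The threshold branch 
is one possible reading of the "split records among partitions" idea, nothing 
more.

```python
def simulate_partition_sizes(batches, num_partitions, threshold=None):
    """Return the number of bytes that land in each partition.

    batches:        list of updates; each update is a list of record sizes (bytes).
    num_partitions: number of partitions in the (hypothetical) repository.
    threshold:      hypothetical mitigation. If a batch's total size exceeds it,
                    the batch's records are spread round-robin across partitions
                    instead of all going to one. None models the behavior
                    described in this issue.
    """
    sizes = [0] * num_partitions
    next_partition = 0
    for batch in batches:
        total = sum(batch)
        if threshold is None or total <= threshold:
            # Whole batch goes to a single partition.
            sizes[next_partition] += total
            next_partition = (next_partition + 1) % num_partitions
        else:
            # Oversized batch: one record per partition, round-robin.
            for record_size in batch:
                sizes[next_partition] += record_size
                next_partition = (next_partition + 1) % num_partitions
    return sizes


# One split / clone producing 1000 records, each carrying a 100 KB attribute.
split_batch = [100_000] * 1000

current = simulate_partition_sizes([split_batch], num_partitions=16)
spread = simulate_partition_sizes([split_batch], num_partitions=16,
                                  threshold=1_000_000)

print("largest partition, single-partition batch:", max(current))
print("largest partition, batch spread over threshold:", max(spread))
```

In this model the single-partition case puts the full 100 MB in one partition 
and leaves the rest empty, while spreading the same batch over 16 partitions 
caps the largest at roughly 6 MB.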

[1] https://github.com/apache/nifi/blob/1be08714731f01347ac1f98e18047fe7d9ab8afd/nifi-commons/nifi-write-ahead-log/src/main/java/org/wali/MinimalLockingWriteAheadLog.java#L238



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
