[ https://issues.apache.org/jira/browse/NIFI-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Moser updated NIFI-3376:
--------------------------------
    Summary: Content repository disk usage is not close to reported size in Status Bar  (was: Implement content repository ResourceClaim compaction)

> Content repository disk usage is not close to reported size in Status Bar
> -------------------------------------------------------------------------
>
>                 Key: NIFI-3376
>                 URL: https://issues.apache.org/jira/browse/NIFI-3376
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>    Affects Versions: 0.7.1, 1.1.1
>            Reporter: Michael Moser
>            Assignee: Michael Hogue
>         Attachments: NIFI-3376_Content_Repo_size_demo.xml
>
>
> On NiFi systems that deal with many files smaller than 1 MB, we often see
> that the actual disk usage of the content_repository is much greater than
> the total size of the flowfiles that NiFi reports in its queues. As an
> example, NiFi may report "50,000 / 12.5 GB" while the content_repository
> takes up 240 GB of its file system. This leads to scenarios where a 500 GB
> content_repository file system gets 100% full, but "I only had 40 GB of data
> in my NiFi!"
> When several content claims exist in a single resource claim, and most but
> not all of those content claims are terminated, the entire resource claim is
> still not eligible for deletion or archive. This could mean that only one
> 10 KB content claim out of a 1 MB resource claim is counted by NiFi as
> existing in its queues, while the full 1 MB stays on disk.
> If a particular flow has a slow egress point where flowfiles can back up
> and remain on the system longer than expected, this problem is exacerbated.
> A potential solution is to compact resource claim files on disk. A background
> thread could examine all resource claims and rewrite the file of any resource
> claim that has become "old" and whose active content claim usage has dropped
> below a threshold.
> A potential work-around is to make the FileSystemRepository
> MAX_APPENDABLE_CLAIM_LENGTH modifiable so that it can be set to a smaller
> number. This would increase the probability that the content claim reference
> count of a resource claim reaches 0, at which point the resource claim
> becomes eligible for deletion/archive. Users could then trade off performance
> for more accurate accounting of NiFi queue size against content repository
> size.
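
To make the accounting gap described above concrete, here is a minimal sketch of a toy model. It is not NiFi's FileSystemRepository code: the class ResourceClaimSketch, the method simulate, and all parameters are hypothetical names chosen for illustration. The model assumes that content claims are appended to one resource claim file until the file reaches the maximum appendable length, and that a file can only be reclaimed once none of its content claims are still active.

```java
/**
 * Toy model of the accounting gap described in NIFI-3376.
 * Hypothetical names throughout; this is not NiFi's implementation.
 */
final class ResourceClaimSketch {

    /**
     * Writes `count` content claims of `claimSize` bytes each, appending to the
     * current resource claim file until it holds at least `maxAppendable` bytes.
     * Every claim is then terminated except those whose index is a multiple of
     * `keepEvery` (these stand in for flowfiles stuck in a slow queue).
     *
     * @return { bytes reported as queued, bytes still held on disk }
     */
    static long[] simulate(int count, long claimSize, long maxAppendable, int keepEvery) {
        long reported = 0;   // what the queue accounting would show
        long disk = 0;       // bytes in files that cannot be deleted yet
        long fileBytes = 0;  // bytes written to the current resource claim
        int fileActive = 0;  // active content claims in the current resource claim

        for (int i = 0; i < count; i++) {
            if (fileBytes >= maxAppendable) {
                // Roll over to a new file; the old one stays on disk
                // as long as at least one of its content claims is active.
                if (fileActive > 0) {
                    disk += fileBytes;
                }
                fileBytes = 0;
                fileActive = 0;
            }
            fileBytes += claimSize;
            if (i % keepEvery == 0) {
                fileActive++;
                reported += claimSize;
            }
        }
        if (fileActive > 0) {
            disk += fileBytes;
        }
        return new long[] { reported, disk };
    }

    public static void main(String[] args) {
        // 1000 flowfiles of 10 KB; one in every 100 remains queued.
        long[] oneMb = simulate(1000, 10_240L, 1_048_576L, 100);
        long[] hundredKb = simulate(1000, 10_240L, 102_400L, 100);
        System.out.println("1 MB limit:   reported=" + oneMb[0] + " disk=" + oneMb[1]);
        System.out.println("100 KB limit: reported=" + hundredKb[0] + " disk=" + hundredKb[1]);
    }
}
```

In this model the reported queue size is the same for both limits, but the smaller limit leaves far fewer bytes pinned on disk, because each surviving content claim keeps a smaller file alive. The model ignores the cost of creating more files, which is the performance side of the trade-off mentioned in the description.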