Olav,

How large is your content repository?

How large is a large file?

How many transformation steps exist in your flow from receipt through
delivery of that large file?
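
For the first two questions, a rough way to gather the numbers is sketched below in Python. It assumes the content repository lives at the path given in the quoted nifi.properties (/app/nifi/common/content_repository) and that the script is run as a user who can read it. The function names are illustrative only and are not part of NiFi.

```python
import os
import shutil
from pathlib import Path


def summarize_repository(repo_dir, top_n=10):
    """Return total bytes, file count, and the top_n largest files under repo_dir."""
    sizes = []
    for root, _dirs, files in os.walk(repo_dir):
        for name in files:
            path = Path(root) / name
            try:
                sizes.append((path.stat().st_size, str(path)))
            except OSError:
                # A file may be deleted by NiFi while we are walking; skip it.
                continue
    sizes.sort(reverse=True)  # largest files first
    return {
        "total_bytes": sum(size for size, _ in sizes),
        "file_count": len(sizes),
        "largest": sizes[:top_n],
    }


def disk_usage_percent(path):
    """Percentage of the partition holding `path` that is currently used."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    # Assumption: path copied from the quoted nifi.properties.
    repo = "/app/nifi/common/content_repository"
    if os.path.isdir(repo):
        summary = summarize_repository(repo)
        print(f"files: {summary['file_count']}, bytes: {summary['total_bytes']}")
        print(f"partition used: {disk_usage_percent(repo):.1f}%")
        for size, path in summary["largest"]:
            print(f"{size:>15}  {path}")
    else:
        print(f"{repo} not found; adjust the path for your install")
```

Comparing the largest files on disk with the sizes of the FlowFiles still queued in the flow should show whether a few big files account for most of the space.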

Thanks
Joe

On Mon, Apr 24, 2017 at 9:31 PM, Olav Jordens <
olav.jord...@2degreesmobile.co.nz> wrote:

> Apologies, I forgot to mention that I am on NiFi 1.1.2 on Linux RHEL 6.5.
>
>
>
> Thanks,
>
> Olav
>
>
>
>
>
> Olav Jordens
> Senior ETL Developer
> +64 226 202 429
> +64 9 919 7000
> 2degreesmobile.co.nz <http://www.2degreesmobile.co.nz>
> Two Degrees Mobile Limited | 47-49 George Street | Newmarket | Auckland |
> New Zealand
> PO Box 8355 | Symonds Street | Auckland 1150 | New Zealand | Fax +64 9
> 919 7001
>
>
>
> *From:* Olav Jordens
> *Sent:* Tuesday, 25 April 2017 1:27 p.m.
> *To:* 'users@nifi.apache.org' <users@nifi.apache.org>
> *Subject:* Content repository filling up
>
>
>
> Hi Users,
>
>
>
> I have had this problem intermittently for some time now – the content
> repository disk fills up even though there appear to be very few flow files
> in the system.
>
> I have read the very good explanation of content claims here:
> https://community.hortonworks.com/articles/82308/understanding-how-nifis-content-repository-archivi.html
>
>
>
> My data flows include a mix of very large and very small files, so I
> suspect that small files sharing a claim with large ones are keeping the
> large ones from being released. I have followed the suggestion in the
> above link:
>
>
>
> *If you are working with data that ranges greatly from very small to very
> large, you may want to decrease the max appendable size and/or max flow
> file settings. By doing so you decrease the number of FlowFiles that make
> it into a single claim. This in turn reduces the likelihood of a single
> piece of data keeping large amounts of data still active in your content
> repository.*
>
>
>
> I have tried the most radical approach, one content claim per file, which
> I believe should mean that as soon as a large file leaves the flow, its
> claim is available for removal, since I have set archiving to false.
>
> My issue is that even with these settings, the NiFi content repository
> fills up, and when I look inside the content repository, I see multiple
> FlowFile contents contained within a single claim file, which is unexpected
> as I have set nifi.content.claim.max.flow.files=1.
>
>
>
>
>
> These are my content repository settings in nifi.properties:
>
>
>
> # Content Repository
> nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
> # Exceptionally important to get this right when having a mix of large and small files
> # We don't want a large file to be in the same claim as a small file which remains queued:
> # The claim can never be released until the small file is no longer enqueued and has been released
> # Large files, first into a claim, will take up an entire claim anyway.
> # So setting max.flow.files=1, there is no need to configure max.appendable.size
> nifi.content.claim.max.appendable.size=10 MB
> #nifi.content.claim.max.flow.files=100
> nifi.content.claim.max.flow.files=1
>
> #OPT
> #nifi.content.repository.directory.default=./content_repository
> nifi.content.repository.directory.default=/app/nifi/common/content_repository
>
> # Archiving of content is disabled - no need to keep data hanging around once the flow is complete.
> nifi.content.repository.archive.max.retention.period=12 hours
> nifi.content.repository.archive.max.usage.percentage=50%
> #nifi.content.repository.archive.enabled=true
> nifi.content.repository.archive.enabled=false
> nifi.content.repository.always.sync=false
> nifi.content.viewer.url=/nifi-content-viewer/
>
>
>
> Am I looking at this incorrectly?
>
>
>
> Thanks,
>
> Olav
>
>
>
>
>
>
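
On the settings quoted above: a small Python sketch like the following could confirm which content repository values are actually in effect in nifi.properties. It assumes the file is read the way Java's java.util.Properties reads it, where the last uncommented occurrence of a key wins. The parser is deliberately simplified (only `key=value` lines and `#`/`!` comments), and the function name is hypothetical. It also flags any non-comment line without an `=`, which would be worth a second look.

```python
def effective_properties(text):
    """Parse simple key=value lines; a later duplicate overrides an earlier one.

    Returns (effective, suspicious), where suspicious lists (line_number, line)
    for non-comment lines that contain no '='.
    """
    effective = {}
    suspicious = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue  # blank line or comment
        if "=" not in line:
            suspicious.append((lineno, line))
            continue
        key, value = line.split("=", 1)
        effective[key.strip()] = value.strip()
    return effective, suspicious


# Sample lines taken from the settings quoted in this thread.
SAMPLE = """\
nifi.content.claim.max.appendable.size=10 MB
#nifi.content.claim.max.flow.files=100
nifi.content.claim.max.flow.files=1
#nifi.content.repository.archive.enabled=true
nifi.content.repository.archive.enabled=false
"""

if __name__ == "__main__":
    values, odd_lines = effective_properties(SAMPLE)
    for key in sorted(values):
        if key.startswith("nifi.content."):
            print(f"{key} = {values[key]}")
    for lineno, line in odd_lines:
        print(f"line {lineno} has no '=': {line!r}")
```

To run it against a real install, read the file (for example `open("conf/nifi.properties").read()`, with the path adjusted as needed) and pass the text in place of `SAMPLE`.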
