Brett,

The default provenance store, PersistentProvenanceRepository, does require
I/O in proportion to flowfile events.  Flowfiles with many attributes,
especially large attributes, are a frequent contributor to provenance
overload because attribute state is tracked in provenance events.  But this
is different from flowfile content reads and writes, which use the separate
content repository.  You might consider moving the provenance repository to
a separate disk for additional I/O capacity.
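
As a rough sketch of that last suggestion: the repository location is set
in conf/nifi.properties.  The path below is a hypothetical mount point on a
second disk, not something taken from your setup; the property names are the
ones in the stock file, but please check them against your NiFi version.

```properties
# conf/nifi.properties (sketch only)
# Keep the default persistent implementation...
nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
# ...but point it at a different physical disk than the content and
# flowfile repositories.  /data2 is an assumed, hypothetical mount point.
nifi.provenance.repository.directory.default=/data2/provenance_repository
```

NiFi needs a restart to pick up the change, and events already written to
the old directory are not moved for you.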

Does this sound relevant?  Can you share some details of your flow volumes
and attribute sizes?

nifi.provenance.repository.buffer.size is only used by the
VolatileProvenanceRepository implementation, an in-memory provenance
store.  The property sets how many provenance events the in-memory store
holds.  The volatile store can avoid disk I/O issues, but at the expense of
reduced provenance functionality, since events are not persisted to disk.
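
If you want to try the volatile store, a minimal sketch of the relevant
lines in conf/nifi.properties would look like this.  The value shown is what
I believe the shipped default to be, so treat it as an assumption and size it
for your own heap.

```properties
# conf/nifi.properties (sketch only)
# Switch from the persistent store to the in-memory one.
nifi.provenance.repository.implementation=org.apache.nifi.provenance.VolatileProvenanceRepository
# Only read by the volatile implementation.  100000 is assumed to be the
# shipped default; larger values use more heap.
nifi.provenance.repository.buffer.size=100000
```

With this in place, provenance history is lost on restart and older events
are dropped once the buffer fills.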

Thanks,

James

On Thu, Sep 29, 2016 at 1:37 PM, Brett Tiplitz <
brett.m.tipl...@systolic-inc.com> wrote:

> I'm having a throughput problem when processing data with Provenance
> recording enabled.  I've pretty much disabled it, so I believe that is the
> source of my issue.  On occasion, I get a message saying the flow is
> slowing due to provenance recording.  I was running the out of the box
> configuration for provenance.
>
> I believe the issue might be related to commit writes, though it's just a
> theory.  There is a variable nifi.provenance.repository.buffer.size,
> though I don't see anything about what that does.
>
> Any suggestions?
>
> thanks,
>
> brett
>
> --
> Brett Tiplitz
> Systolic, Inc
>
