Re: NIFI-7646 - Improve performance of MergeContent

Ryan Hendrickson Thu, 22 Apr 2021 07:58:27 -0700

Thanks Mark!

On Wed, Apr 21, 2021 at 8:48 PM Mark Payne <marka...@hotmail.com> wrote:


> Ryan,
>
> It gets a bit more complex than this, because the flowfiles may not always
> be accessed/read sequentially in exactly the same order that they live on
> disk, there are concurrent threads/disk accesses to consider, etc. But in
> the best-case scenario, yes, that is accurate.
>
> Keep in mind, though, that what you are comparing there is the performance
> of the disk accesses/reads, and that is, of course, not the entire picture.
> Lots more going on under the covers, so if you see a performance
> improvement of 20x in reading the content, that won’t mean a 20x
> improvement in overall throughput.
>
> But it sure won’t hurt! :)
>
> -Mark
>
> Sent from my iPhone
>
> > On Apr 21, 2021, at 8:34 PM, Ryan Hendrickson <ryan.andrew.hendrick...@gmail.com> wrote:
> >
> > https://issues.apache.org/jira/browse/NIFI-7646 - Improve performance of
> > MergeContent / others that read content of many small FlowFiles
> >
> > Hi,
> >   In reference to the ticket above, released in 1.13, the description
> > says "if the FlowFile is small, say 200 bytes, the result is that we
> > perform 2+ disk accesses to read those 200 bytes (even though 4K - 8K is a
> > typical block size and could be read in the same amount of time as those
> > 200 bytes)."
> >
> >   To clarify, if the FlowFiles are never more than 1K, and the block size
> > is 4k, does that mean this improvement will read 4 FlowFiles with the
> > resources of 1?
> >
> >   This would be a 4:1 improvement.  Or in the 200 byte scenario, it would
> > be a 20:1 improvement?
> >
> > Thanks,
> > Ryan
>
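
For anyone following the arithmetic in this thread, here is a rough sketch of the two points discussed: the best-case number of small FlowFiles that fit in one disk block, and why a large gain in content reads does not carry over one-to-one to overall throughput. The function names and the 50% read-time fraction are made up for illustration; they are not taken from NiFi or measured on a real flow. The second function is just Amdahl's law.

```python
def best_case_flowfiles_per_block(flowfile_bytes: int, block_bytes: int = 4096) -> int:
    """Best-case count of whole FlowFiles covered by a single block read.

    Assumes FlowFiles sit contiguously on disk and are read in on-disk order,
    which is the best case described in the thread, not a guarantee.
    """
    if flowfile_bytes <= 0 or block_bytes <= 0:
        raise ValueError("sizes must be positive")
    return max(1, block_bytes // flowfile_bytes)


def overall_speedup(read_fraction: float, read_speedup: float) -> float:
    """Amdahl's law: overall gain when only the read portion gets faster.

    read_fraction is the share of total time spent reading content; it is a
    hypothetical input here, not a measured NiFi figure.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    if read_speedup <= 0:
        raise ValueError("read_speedup must be positive")
    return 1.0 / ((1.0 - read_fraction) + read_fraction / read_speedup)


print(best_case_flowfiles_per_block(1024))  # 4
print(best_case_flowfiles_per_block(200))   # 20
print(round(overall_speedup(0.5, 20), 2))   # 1.9
```

So if content reads were half of the total time (an assumed figure), a 20x faster read would give a bit under 2x overall, which matches Mark's caution above.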
