Hey Phil,

NiFi will not spread the content of a single FlowFile across multiple 
partitions. It writes the content of FlowFile 1 to content repo 1, the content 
of the next FlowFile to repo 2, and so on. So it round-robins across the repos 
per FlowFile, but never splits a single FlowFile across multiple repos.
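
To make the round-robin concrete, here is a rough Python sketch. It is only a 
toy model and not NiFi's actual code: the assign_repos function and the 
FlowFile names are made up for illustration, and the repo names are borrowed 
from your example. It just shows each FlowFile landing whole in one repo while 
successive FlowFiles rotate through the repos.

```python
from itertools import cycle


def assign_repos(flowfile_names, repo_names):
    """Map each FlowFile to exactly one repo, rotating through the repos in order.

    Hypothetical helper for illustration only; it does not model claims,
    sections, or anything else NiFi does internally.
    """
    rotation = cycle(repo_names)
    # Each FlowFile takes the next repo in the rotation; none is split.
    return {name: next(rotation) for name in flowfile_names}


repos = ["cont1", "cont2", "cont3"]
flowfiles = ["flowfile-1", "flowfile-2", "flowfile-3", "flowfile-4"]

placement = assign_repos(flowfiles, repos)
for name, repo in placement.items():
    print(f"{name} -> {repo}")
```

With three repos, the fourth FlowFile wraps back around to cont1, so reading 
any one FlowFile back only touches a single volume.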

Thanks
-Mark


> On Dec 11, 2023, at 8:45 PM, Phillip Lord <phillord0...@gmail.com> wrote:
> 
> 
> Hello NiFi comrades,
> 
> Here's my scenario...
> Let's say I have a NiFi cluster running on EC2 instances with attached EBS 
> volumes serving as the repos. The content repo has been split into three 
> content repos per node (cont1, cont2, cont3), each on a dedicated EBS 
> volume. My understanding is that the content claims for a single file 
> can potentially span more than one of these repos. (Correct me if I've 
> lost my mind over the years.)
> For instance, if you have a 1 MB file and, let's say, your 
> max.content.claim.size is 100 KB, that's roughly ten 100 KB claims, 
> potentially split up across the 3 EBS volumes. So if NiFi is trying to move 
> that file to S3, for instance, it needs to read from each of the volumes. 
> Whereas with a single EBS volume for the content repo, it would read from 
> the single volume, which I would think would be more performant? Or does 
> spreading any IO contention across volumes provide more of a benefit?
> I know there are different levels of EBS volumes, but I'm not factoring that 
> in for right now.
> 
> Appreciate any insight... trying to determine the best configuration.  
> 
> Thanks,
> Phil
> 
> 
