Hey Phil,

NiFi will not spread the content of a single file over multiple partitions. It writes the content of FlowFile 1 to content repo 1, the next FlowFile to content repo 2, and so on. So it round-robins across the repos, but it never splits a single FlowFile's content across more than one of them.
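
To make that concrete, here is a toy Python sketch of the placement behavior described above. It is not NiFi's actual code: the function name `assign_repos` and the FlowFile labels are made up for illustration, and the repo names are just the ones from your scenario. It only models the idea that each FlowFile's content lands whole in one repo, with the repos taken in rotation.

```python
from itertools import cycle


def assign_repos(flowfiles, repos):
    """Map each FlowFile to exactly one repo, rotating through the repos in order.

    Hypothetical helper for illustration only; NiFi's real selection logic
    lives in its content repository implementation and may differ in detail.
    """
    rotation = cycle(repos)
    # Each FlowFile gets a single repo; its content is never divided.
    return {name: next(rotation) for name in flowfiles}


placement = assign_repos(
    ["flowfile-1", "flowfile-2", "flowfile-3", "flowfile-4"],
    ["cont1", "cont2", "cont3"],
)
for name, repo in placement.items():
    print(f"{name} -> {repo}")
```

In this model a read of any one FlowFile touches a single volume, while a stream of many FlowFiles is spread over all three.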
Thanks
-Mark

Sent from my iPhone

> On Dec 11, 2023, at 8:45 PM, Phillip Lord <phillord0...@gmail.com> wrote:
>
> Hello NiFi comrades,
>
> Here's my scenario...
> Let's say I have a NiFi cluster running on EC2 instances with attached EBS
> volumes serving as their repos. They've split up their content-repos into
> three content-repos per node (cont1, cont2, cont3), each being a dedicated
> EBS volume. My understanding is that the content-claims for a single file
> can potentially span across more than one of these repos. (Correct me if I've
> lost my mind over the years.)
> For instance, if you have a 1 MB file, and let's say your
> max.content.claim.size is 100KB, that's 10 - 100KB claims(ish) potentially
> split up across the 3 EBS volumes. So if NiFi is trying to move that file to
> S3 or something, for instance... it needs to be read from each of the volumes.
>
> Whereas if it was a single EBS volume for the cont-repo... it would read from
> the single volume, which I would think would be more performant? Or does
> spreading out any IO contention across volumes provide more of a benefit?
> I know there are different levels of EBS volumes... but not factoring that in
> for right now.
>
> Appreciate any insight... trying to determine the best configuration.
>
> Thanks,
> Phil