Follow-up question - how do I transition to this new structure? Should I shut down NiFi and move the contents of the legacy single directories into one of the new ones? For example:
mv /usr/nifi/content_repository /nifi/repos/content-1

TIA
Phil

On Wed, 12 Sep 2018 at 06:15, Mark Payne <marka...@hotmail.com> wrote:

> Phil,
>
> For the content repository, you can configure the directory by changing the
> value of the "nifi.content.repository.directory.default" property in
> nifi.properties. The suffix here, "default", is the name of this "container".
> You can have multiple containers by adding extra properties. So, for example,
> you could set:
>
> nifi.content.repository.directory.content1=/nifi/repos/content-1
> nifi.content.repository.directory.content2=/nifi/repos/content-2
> nifi.content.repository.directory.content3=/nifi/repos/content-3
> nifi.content.repository.directory.content4=/nifi/repos/content-4
>
> Similarly, the Provenance Repository property is named
> "nifi.provenance.repository.directory.default" and can have any number of
> "containers":
>
> nifi.provenance.repository.directory.prov1=/nifi/repos/prov-1
> nifi.provenance.repository.directory.prov2=/nifi/repos/prov-2
> nifi.provenance.repository.directory.prov3=/nifi/repos/prov-3
> nifi.provenance.repository.directory.prov4=/nifi/repos/prov-4
>
> When NiFi writes to these, it round-robins across them, so if you're writing
> the content of 4 FlowFiles simultaneously on different threads, you're able
> to get the full throughput of each disk. (So if you have 4 disks for your
> content repo, each capable of writing 100 MB/sec, then your effective write
> rate to the content repo is 400 MB/sec.) The same applies to the Provenance
> Repository.
>
> Doing this will also allow you to hold a larger 'archive' of content and
> provenance data, because the archive is spanned across all of the listed
> directories as well.
>
> Thanks
> -Mark
>
>
> > On Sep 11, 2018, at 3:35 PM, Phil H <gippyp...@gmail.com> wrote:
> >
> > Thanks Mark, this is great advice.
> >
> > Disk access is certainly an issue with the current setup. I will certainly
> > shoot for NVMe disks in the build. How does NiFi get configured to span
> > its repositories across multiple physical disks?
> >
> > Thanks,
> > Phil
> >
> > On Wed, 12 Sep 2018 at 01:32, Mark Payne <marka...@hotmail.com> wrote:
> >
> >> Phil,
> >>
> >> As Sivaprasanna mentioned, your bottleneck will certainly depend on your
> >> flow. There's nothing inherent about NiFi or the JVM, AFAIK, that would
> >> limit you. I've seen NiFi run on VMs containing 4-8 cores, and I've seen
> >> it run on bare metal on servers containing 96+ cores. Most often, I see
> >> people with a lot of CPU cores but insufficient disk, so if you're
> >> running several cores, ensure that you're using SSDs / NVMe drives or
> >> enough spinning disks to accommodate the flow. NiFi does a good job of
> >> spanning the content and FlowFile repositories across multiple disks to
> >> take full advantage of the hardware, and scales the CPU vertically by
> >> way of multiple Processors and multiple concurrent tasks (threads) on a
> >> given Processor.
> >>
> >> It really comes down to what you're doing in your flow, though. If
> >> you've got 96 cores and you're trying to perform 5 dozen transformations
> >> against a large number of FlowFiles but have only a single spinning
> >> disk, then those 96 cores will likely go to waste, because your disk
> >> will bottleneck you.
> >>
> >> Likewise, if you have 10 SSDs and only 8 cores, you're likely going to
> >> waste a lot of disk because you won't have the CPU needed to reach the
> >> disks' full potential.
> >>
> >> So you'll need to strike the correct balance for your use case. Since
> >> you have the flow running right now, I would recommend looking at things
> >> like `top` and `iostat` in order to understand whether you're reaching
> >> your limit on CPU, disk, etc.
> >>
> >> As far as RAM is concerned, NiFi typically only needs 4-8 GB of RAM for
> >> the heap. However, more RAM means that your operating system can make
> >> better use of disk caching, which can certainly speed things up,
> >> especially if you're reading the content several times for each FlowFile.
> >>
> >> Does this help at all?
> >>
> >> Thanks
> >> -Mark
> >>
> >>
> >>> On Sep 10, 2018, at 6:05 AM, Phil H <gippyp...@gmail.com> wrote:
> >>>
> >>> Thanks for that. Sorry, I should have been more specific - we have a
> >>> flow running already on non-dedicated hardware. I'm looking to identify
> >>> any limitations in NiFi/the JVM that would limit how much parallelism
> >>> it can take advantage of.
> >>>
> >>> On Mon, 10 Sep 2018 at 14:32, Sivaprasanna <sivaprasanna...@gmail.com>
> >>> wrote:
> >>>
> >>>> Phil,
> >>>>
> >>>> The hardware requirements are driven by the nature of the dataflow you
> >>>> are developing. If you're looking to play around with NiFi and gain
> >>>> some hands-on experience, go for 4 cores and 8 GB of RAM, i.e. any
> >>>> modern laptop/computer would do the job. In my case, where I have 100s
> >>>> of dataflows, I have it clustered with 3 nodes, each having 16 GB RAM
> >>>> and 4 (8) cores. I went with SSDs of smaller size because my flows are
> >>>> involved in writing to object stores like Google Cloud Storage, Azure
> >>>> Blob and Amazon S3, and NoSQL DBs. Hope this helps.
> >>>>
> >>>> -
> >>>> Sivaprasanna
> >>>>
> >>>> On Mon, Sep 10, 2018 at 4:09 AM Phil H <gippyp...@gmail.com> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I've been asked to spec some hardware for a NiFi installation. Does
> >>>>> anyone have any advice? My gut feel is lots of processor cores and
> >>>>> RAM, with less emphasis on storage (small fast disks). Are there any
> >>>>> limitations on how many cores the JRE/NiFi can actually make use of,
> >>>>> or any other considerations like that I should be aware of?
> >>>>>
> >>>>> Most likely this will be pairs of servers in a cluster, but again any
> >>>>> advice to the contrary would be appreciated.
> >>>>>
> >>>>> Cheers,
> >>>>> Phil
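
For the transition question at the top of this message, a rough sketch of the move being proposed, assuming NiFi is stopped first and that the new container directories are already declared in nifi.properties. The /opt/nifi install path and the bin/nifi.sh script location are assumptions about the local layout, and whether existing content remains resolvable after the container is renamed away from "default" is not confirmed anywhere in this thread:

  # Stop NiFi before touching the repositories (assumed install path)
  /opt/nifi/bin/nifi.sh stop

  # Create the new container directories on their respective disks
  mkdir -p /nifi/repos/content-1 /nifi/repos/content-2 \
           /nifi/repos/content-3 /nifi/repos/content-4

  # Move the contents of the legacy single content repository into the
  # first new container (note: * does not pick up hidden files)
  mv /usr/nifi/content_repository/* /nifi/repos/content-1/

  # Restart once nifi.properties points at the new containers
  /opt/nifi/bin/nifi.sh start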
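Pulling the configuration advice from the thread into one place, a minimal sketch of the relevant entries on a 4-disk box. The /nifi/repos/* paths are the examples from Mark's reply; the FlowFile repository property and the bootstrap.conf heap arguments are not from this thread and are assumptions based on a stock NiFi install, so check them against your own conf/ directory:

  # conf/nifi.properties -- one container per physical disk (assumed layout)
  nifi.content.repository.directory.content1=/nifi/repos/content-1
  nifi.content.repository.directory.content2=/nifi/repos/content-2
  nifi.content.repository.directory.content3=/nifi/repos/content-3
  nifi.content.repository.directory.content4=/nifi/repos/content-4
  nifi.provenance.repository.directory.prov1=/nifi/repos/prov-1
  nifi.provenance.repository.directory.prov2=/nifi/repos/prov-2
  nifi.provenance.repository.directory.prov3=/nifi/repos/prov-3
  nifi.provenance.repository.directory.prov4=/nifi/repos/prov-4
  # FlowFile repo on its own disk is an assumption, not advice from the thread
  nifi.flowfile.repository.directory=/nifi/repos/flowfile

  # conf/bootstrap.conf -- heap at the top of Mark's 4-8 GB guidance
  # (java.arg numbering assumed from a stock install)
  java.arg.2=-Xms8g
  java.arg.3=-Xmx8g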
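Mark's suggestion to watch `top` and `iostat` on the existing flow could look roughly like this on a Linux host; iostat comes from the sysstat package, and the 5-second interval is arbitrary:

  # Per-core CPU usage -- press 1 inside top to show individual cores
  top

  # Extended per-device I/O statistics in MB, refreshed every 5 seconds
  iostat -xm 5

  # High %iowait in top, or near-100% device utilisation in iostat while the
  # cores sit mostly idle, points at the disks as the bottleneck; the reverse
  # suggests the flow is CPU-bound.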