When I first started with NiFi (I've always run it in containers), I looked at taking a similar approach with flow.xml.gz, but quickly moved away from it after reading various tales online about the file not always being very portable and about people struggling to keep it in sync. That meant either remembering to extract it from the container while it was still running, persisting it externally and injecting it again on startup, *or* using a bind mount and relying on the container always being run on the same node.
Instead, we went with a persistent volume for the flow.xml.gz directory (having changed that from its default location via the nifi.properties settings), which allows the instance to persist its Flow between restarts (running this first in Docker Swarm, then subsequently in Kubernetes).

Externalisation of this and promotion between environments (e.g. if you have Dev, Test, Prod, etc.) was first done using Templates, which can be exported directly from and imported directly into the NiFi UI as XML files (which you can, of course, store in a source control repository as desired, although I'd not look to do much in the way of Pull Request reviews using them).

Templates are now deprecated in NiFi, and the normally recommended approach is to spin up a NiFi Registry instance, keep your Flows within one or more Process Groups and version control those between the NiFi UI and NiFi Registry. The Flows can also be exported as JSON, either from NiFi itself or from Registry, stored in source control and re-imported later (I've sometimes found code reviews possible on such files, but only for relatively minor Flow changes).

Even better would be to use Git as the Flow Persistence layer within Registry, which can then send the Flow definitions straight to a Git repo. Registry can also rebuild from this if it loses its stored metadata, etc. Using Registry allows for some reviews directly in the NiFi UI, as it tells you what changes have been made to a Flow definition before they're committed to Registry.

But, of course, you've then got another container to look at spinning up and orchestrating in your stack (having spent a bit of time using the tools now, I'd say it's certainly been worth the effort).

Hope that helps.

*Chris Sampson*
IT Consultant
chris.samp...@naimuri.com

On Wed, 8 Apr 2020 at 17:36, Kevin Telford <kevin.telf...@gmail.com> wrote:

> Thank you Mike. This makes sense.
>
> How do you get the flow.xml.gz file?
> Do you have NiFi installed locally on bare metal or do you develop also in
> a container? Initially I was thinking of adding a custom volume to
> facilitate both getting and deploying the flow file.
>
> Best,
> Kevin
>
> On 2020/04/08 14:37:43, Mike Thomsen <mikerthom...@gmail.com> wrote:
> > What I've done in the past looks something like this:
> >
> > FROM apache/nifi:1.11.4
> > COPY flow.xml.gz /opt/nifi/nifi-current/conf/flow.xml.gz
> >
> > And that's it. The obvious caveat is that you need to follow good practices
> > with ensuring that your flow and the way you setup the container can
> > replicate the environment you built it on on your host.
> >
> > On Wed, Apr 8, 2020 at 10:34 AM Kevin Telford <kevin.telf...@gmail.com>
> > wrote:
> >
> > > Hi all – I have a two part question.
> > >
> > > I’d like to run NiFi inside a container in order to deploy to various
> > > environments. As far as I can tell, the flow.xml.gz file is the main
> > > “source” if you will, for a NiFi data flow.
> > >
> > > Q1) Is the flow.xml.gz file the “source” of a NiFi data flow, and if so, is
> > > it best practice to copy it to a new env in order to “deploy” a prebuilt
> > > flow? Or how best is this handled?
> > >
> > > Given that Q1 is true, my challenge then becomes somewhat Docker-specific…
> > >
> > > Situation:
> > >
> > >    - In the Dockerfile we unzip the NiFi source (L62
> > >    <https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/Dockerfile#L62>)
> > >    and then create Docker volumes (L75
> > >    <https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/Dockerfile#L75>
> > >    specifically for the conf dir). Once the container starts all the normal
> > >    NiFi startup things happen, and /opt/nifi/nifi-current/conf/flow.xml.gz
> > >    created.
> > >
> > > Complication:
> > >
> > >    - In order to persist flow.xml.gz outside of the container, I would
> > >    normally mount the /opt/nifi/nifi-current/conf directory, however in this
> > >    case I cannot mount it on initialization because that will overwrite conf
> > >    config files with whatever directory I bind it to (Docker container
> > >    isolation ensures host -> container file precedence).
> > >    - I could mount to a running container, but this is less ideal due to
> > >    the various ways a container can be deployed.
> > >    - I could copy manually from the running container, but this is less
> > >    ideal as it’s on demand, and not always persisting latest.
> > >
> > > Resolution:
> > >
> > >    - I believe instead, we would ideally create a few flow config specific
> > >    env vars and use them to update our nifi.properties (via
> > >    https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh),
> > >    i.e. NIFI_FLOW_CONFIG_FILE_LOCATION, NIFI_FLOW_CONFIG_ARCHIVE_ENABLED,
> > >    NIFI_FLOW_CONFIG_ARCHIVE_DIR and so on for all nifi.flow.configuration
> > >    props.
> > >
> > > Q2) Would the above proposal be ideal? (add a few env vars to start.sh) –
> > > if so, happy to add a PR for the code and doc change. Or have others solved
> > > this a different way?
> > >
> > > Best,
> > >
> > > Kevin
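
For reference, the env-var idea in Q2 (and the relocated flow.xml.gz directory mentioned in the reply at the top) might look roughly like the shell sketch below. It is an illustration under stated assumptions, not the image's actual start.sh: the set_prop helper, the temporary stand-in properties file and the /opt/nifi/flow-data volume path are all hypothetical, and the NIFI_FLOW_CONFIG_* names are the ones proposed in the thread rather than variables the stock image is known to read. The three nifi.flow.configuration.* keys and their starting values are what I'd expect in a stock 1.x nifi.properties.

```shell
#!/bin/sh
# Sketch only: map hypothetical NIFI_FLOW_CONFIG_* env vars onto the
# nifi.flow.configuration.* properties.
set -eu

# Stand-in for conf/nifi.properties so the sketch can run anywhere.
props_file="$(mktemp)"
cat > "$props_file" <<'EOF'
nifi.flow.configuration.file=./conf/flow.xml.gz
nifi.flow.configuration.archive.enabled=true
nifi.flow.configuration.archive.dir=./conf/archive/
EOF

# Hypothetical helper: overwrite the value of one key in $props_file.
# Values containing '|', '&' or '\' would need escaping first.
set_prop() {
    key="$1"
    value="$2"
    # Escape the dots so they match literally in the sed pattern.
    pattern="$(printf '%s' "$key" | sed 's/[.]/\\./g')"
    sed -i.bak "s|^${pattern}=.*|${key}=${value}|" "$props_file"
    rm -f "${props_file}.bak"
}

# Example values pointing at an assumed persistent volume mount;
# anything already set in the environment takes precedence.
NIFI_FLOW_CONFIG_FILE_LOCATION="${NIFI_FLOW_CONFIG_FILE_LOCATION:-/opt/nifi/flow-data/flow.xml.gz}"
NIFI_FLOW_CONFIG_ARCHIVE_ENABLED="${NIFI_FLOW_CONFIG_ARCHIVE_ENABLED:-true}"
NIFI_FLOW_CONFIG_ARCHIVE_DIR="${NIFI_FLOW_CONFIG_ARCHIVE_DIR:-/opt/nifi/flow-data/archive/}"

set_prop 'nifi.flow.configuration.file' "$NIFI_FLOW_CONFIG_FILE_LOCATION"
set_prop 'nifi.flow.configuration.archive.enabled' "$NIFI_FLOW_CONFIG_ARCHIVE_ENABLED"
set_prop 'nifi.flow.configuration.archive.dir' "$NIFI_FLOW_CONFIG_ARCHIVE_DIR"

# Show the resulting properties.
cat "$props_file"
```

In a real image the same calls would target /opt/nifi/nifi-current/conf/nifi.properties before NiFi starts, with the volume mounted at the chosen flow directory rather than over conf, so the shipped config files are left alone.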