When first starting with NiFi (I've always used it in container form), I
looked at taking a similar approach with the flow.xml.gz, but quickly moved
away from that after reading various tales online about it not always being
very portable and/or people having issues keeping it in sync (remembering
to extract it from the container while still running and persisting it
externally, then injecting it again on startup *or* trying to use a bind
mount and relying on the container always being run by the same node, etc.).

Instead, we went with a persistent volume for the directory containing
flow.xml.gz (having moved it from its default location via the
nifi.properties settings), which allows the instance to persist its Flow
between restarts (we ran this first in Docker Swarm, then subsequently
in Kubernetes). For externalising the Flow and promoting it between
environments (e.g. Dev, Test, Prod), we first used Templates, which
can be exported directly from and imported directly into the NiFi UI as XML
files (you can, of course, store these in a source control repository,
although I wouldn't expect to do much in the way of Pull Request
reviews on them).
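
As a rough sketch of the nifi.properties change mentioned above: the
property names are the standard nifi.flow.configuration ones, but the
/opt/nifi/nifi-current/persistent-conf path is purely an illustrative
assumption. Any directory backed by your persistent volume would do.

```properties
# Assumed example path: point the flow file at a directory backed by a
# persistent volume rather than the default ./conf/flow.xml.gz
nifi.flow.configuration.file=/opt/nifi/nifi-current/persistent-conf/flow.xml.gz
# Keep the automatic flow archive on the same volume (also an assumed path)
nifi.flow.configuration.archive.enabled=true
nifi.flow.configuration.archive.dir=/opt/nifi/nifi-current/persistent-conf/archive/
```

Only that one directory then needs mounting, so the rest of the conf
directory shipped in the image is left untouched.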

Templates are now deprecated in NiFi and instead the normally recommended
approach would be to spin up a NiFi Registry instance, have your Flows
within one or more Process Groups and version control those between NiFi UI
and NiFi Registry. The Flows can also be exported as JSON either from NiFi
itself or Registry and again stored in source control then re-imported
later (sometimes I've found code reviews possible on such files, but only
for relatively minor Flow changes). Even better would be to use Git as the
Flow Persistence layer within Registry, which can then send the Flow
definitions straight to a Git repo - Registry can also rebuild from this if
it loses its stored metadata, etc.

Using Registry allows for some reviews directly in the NiFi UI, as it shows
you what changes have been made to a Flow definition before they're
committed to Registry. But, of course, you've then got another container to
spin up and orchestrate in your stack (having spent a bit of time using the
tools now, I'd say it's certainly been worth the effort).


Hope that helps.

*Chris Sampson*
IT Consultant
chris.samp...@naimuri.com



On Wed, 8 Apr 2020 at 17:36, Kevin Telford <kevin.telf...@gmail.com> wrote:

> Thank you Mike. This makes sense.
>
> How do you get the flow.xml.gz file? Do you have NiFi installed locally on
> bare metal or do you develop also in a container? Initially I was thinking
> of adding a custom volume to facilitate both getting and deploying the flow
> file.
>
> Best,
> Kevin
>
> On 2020/04/08 14:37:43, Mike Thomsen <mikerthom...@gmail.com> wrote:
> > What I've done in the past looks something like this:
> >
> > FROM apache/nifi:1.11.4
> > COPY flow.xml.gz /opt/nifi/nifi-current/conf/flow.xml.gz
> >
> > And that's it. The obvious caveat is that you need to follow good
> practices
> > with ensuring that your flow and the way you setup the container can
> > replicate the environment you built it on on your host.
> >
> > On Wed, Apr 8, 2020 at 10:34 AM Kevin Telford <kevin.telf...@gmail.com>
> > wrote:
> >
> > > Hi all – I have a two part question.
> > >
> > >
> > >
> > > I’d like to run NiFi inside a container in order to deploy to various
> > > environments. As far as I can tell, the flow.xml.gz file is the main
> > > “source” if you will, for a NiFi data flow.
> > >
> > > Q1) Is the flow.xml.gz file the “source” of a NiFi data flow, and if
> so, is
> > > it best practice to copy it to a new env in order to “deploy” a
> prebuilt
> > > flow? Or how best is this handled?
> > >
> > >
> > >
> > > Given that Q1 is true, my challenge then becomes somewhat
> Docker-specific…
> > >
> > > Situation:
> > >
> > >    - In the Dockerfile we unzip the NiFi source (L62
> > >    <
> > >
> https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/Dockerfile#L62
> > > >)
> > >    and then create Docker volumes (L75
> > >    <
> > >
> https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/Dockerfile#L75
> > > >
> > >    specifically for the conf dir). Once the container starts all the
> normal
> > >    NiFi startup things happen, and
> /opt/nifi/nifi-current/conf/flow.xml.gz
> > >    created.
> > >
> > > Complication:
> > >
> > >    - In order to persist flow.xml.gz outside of the container, I would
> > >    normally mount the /opt/nifi/nifi-current/conf directory, however in
> > > this
> > >    case I cannot mount it on initialization because that will overwrite
> > > conf
> > >    config files with whatever directory I bind it to (Docker container
> > >    isolation ensures host -> container file precedence).
> > >    - I could mount to a running container, but this is less ideal due
> to
> > >    the various ways a container can be deployed.
> > >    - I could copy manually from the running container, but this is less
> > >    ideal as it’s on demand, and not always persisting latest.
> > >
> > > Resolution:
> > >
> > >    - I believe instead, we would ideally create a few flow config
> specific
> > >    env vars and use them to update our nifi.properties (via
> > >
> > >
> https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh
> > > ),
> > >    i.e. NIFI_FLOW_CONFIG_FILE_LOCATION,
> NIFI_FLOW_CONFIG_ARCHIVE_ENABLED,
> > >    NIFI_FLOW_CONFIG_ARCHIVE_DIR and so on for all
> nifi.flow.configuration
> > >    props.
> > >
> > > Q2) Would the above proposal be ideal? (add a few env vars to
> start.sh) –
> > > if so, happy to add a PR for the code and doc change. Or have others
> solved
> > > this a different way?
> > >
> > >
> > >
> > > Best,
> > >
> > > Kevin
> > >
> >
>
