Thanks for that context and perspective, Kevin. Good luck on your project. 

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Apr 13, 2020, at 6:07 AM, Kevin Telford <kevin.telf...@gmail.com> wrote:
> 
> Hi all - thank you for all the thoughtful feedback.
> 
> Regarding my original question, I think the patterns Mike outlined would be 
> good enough.
> That said, we're not going to move forward using NiFi for the project, and I 
> figured I'd take a step back to explain where we were coming from, as some 
> may find the perspective useful. Or not :)
> 
> 
> We have a project that needs some data transformation. Input is Excel, output 
> multiple CSVs or POSTs of data to an API. On the surface, simple enough.
> 
> Our input Excel can and will change a lot, so we'll need rapid iterations, 
> and testing.
> 
> The project architecture is container-based, currently consisting of a front 
> end docker image, a back end image, and database image. ETL is intended to be 
> a fourth. It can be orchestrated with Docker Compose, K8s, or run on bare metal. 
> The goal is to be turnkey and low friction.
> 
> There were two reasons we didn't choose NiFi - the painful (read: long) Java 
> deployment lifecycle for custom processing, and system complexity, 
> particularly around updating new flows.
> 
> Regarding the pain of Java, I've partied with Java since 1.4, so I get it. 
> But these days, if I have a data analyst/data engineer with lowish 
> programming skills, I can't have them compiling and moving around jars, nor 
> do I want to invest in building out the build/deploy pipeline. Platforms have 
> really evolved (especially look at the cloud native tools), and code can be 
> written "in line" in the UI, and just deployed. A lot of this is due to 
> dynamic languages (e.g. Python), but it can still be done with Java with 
> behind-the-scenes compilation. Jupyter Notebook, for its many, many faults, 
> is the way things are heading, and the kids love it.
> 
> I touched a lot on updating flows above, but in NiFi my choices seemed to be 
> to replace the flow.xml.gz file or use the NiFi Registry. My concern with 
> the registry was that it was yet another moving part, and even then I'd have 
> to build in source control workflows. Here again, newer platforms have all 
> this baked in.
> 
> 
> In closing, I think there is definitely still a place for NiFi, especially on 
> the enterprise side where stability, scale and management are paramount. But 
> I did want to share this, as these non-enterprise use cases I am describing 
> will, over time, become the enterprise use cases, and the NiFi project would 
> do well to evaluate its long-term strategy.
> 
> Thanks again for all the responses.
> Best,
> Kevin
> 
> On 2020/04/08 14:27:54, Kevin Telford <kevin.telf...@gmail.com> wrote: 
>> Hi all – I have a two-part question.
>> 
>> 
>> 
>> I’d like to run NiFi inside a container in order to deploy to various
>> environments. As far as I can tell, the flow.xml.gz file is the main
>> “source,” if you will, for a NiFi data flow.
>> 
>> Q1) Is the flow.xml.gz file the “source” of a NiFi data flow, and if so, is
>> it best practice to copy it to a new env in order to “deploy” a prebuilt
>> flow? Or how best is this handled?
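>> 
>> To make that concrete, here is the kind of thing I’m imagining (container
>> names are placeholders and this isn’t a tested recipe):
>> 
>>     # grab the flow built in a dev environment
>>     docker cp nifi-dev:/opt/nifi/nifi-current/conf/flow.xml.gz ./flow.xml.gz
>> 
>>     # push it into the target environment's container and restart NiFi there
>>     docker cp ./flow.xml.gz nifi-prod:/opt/nifi/nifi-current/conf/flow.xml.gz
>>     docker restart nifi-prod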
>> 
>> 
>> 
>> Assuming the answer to Q1 is yes, my challenge then becomes somewhat Docker-specific…
>> 
>> Situation:
>> 
>>   - In the Dockerfile we unzip the NiFi source (L62:
>>   <https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/Dockerfile#L62>)
>>   and then create Docker volumes (L75:
>>   <https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/Dockerfile#L75>,
>>   specifically for the conf dir). Once the container starts, all the normal
>>   NiFi startup things happen, and /opt/nifi/nifi-current/conf/flow.xml.gz is
>>   created.
>> 
>> Complication:
>> 
>>   - In order to persist flow.xml.gz outside of the container, I would
>>   normally mount the /opt/nifi/nifi-current/conf directory. However, in this
>>   case I cannot mount it at initialization, because the bind mount takes
>>   precedence over the container's files and would hide the image's conf
>>   files behind whatever host directory I bind to it (see the sketch after
>>   this list).
>>   - I could mount to a running container, but this is less ideal due to
>>   the various ways a container can be deployed.
>>   - I could copy manually from the running container, but this is less
>>   ideal as it’s on demand and doesn’t always persist the latest flow.
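>> 
>>   As a minimal illustration of the first point (the host path here is
>>   hypothetical, and I haven’t tried this exact command):
>> 
>>       # an empty host dir bind-mounted over conf hides the config files
>>       # shipped in the image, so NiFi has nothing to start from
>>       docker run -d --name nifi \
>>         -v /data/nifi/conf:/opt/nifi/nifi-current/conf \
>>         apache/nifi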
>> 
>> Resolution:
>> 
>>   - I believe instead, we would ideally create a few flow-config-specific
>>   env vars and use them to update our nifi.properties (via
>>   https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh),
>>   e.g. NIFI_FLOW_CONFIG_FILE_LOCATION, NIFI_FLOW_CONFIG_ARCHIVE_ENABLED,
>>   NIFI_FLOW_CONFIG_ARCHIVE_DIR, and so on for all the nifi.flow.configuration
>>   props (rough sketch below).
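>> 
>>   Roughly what I’m picturing in start.sh, with the three env vars above
>>   falling back to the stock defaults when unset (just a sketch of the
>>   property rewrite, not tested against the real script):
>> 
>>       # map optional env vars onto the nifi.flow.configuration.* properties
>>       props="${NIFI_HOME}/conf/nifi.properties"
>>       sed -i \
>>         -e "s|^nifi.flow.configuration.file=.*|nifi.flow.configuration.file=${NIFI_FLOW_CONFIG_FILE_LOCATION:-./conf/flow.xml.gz}|" \
>>         -e "s|^nifi.flow.configuration.archive.enabled=.*|nifi.flow.configuration.archive.enabled=${NIFI_FLOW_CONFIG_ARCHIVE_ENABLED:-true}|" \
>>         -e "s|^nifi.flow.configuration.archive.dir=.*|nifi.flow.configuration.archive.dir=${NIFI_FLOW_CONFIG_ARCHIVE_DIR:-./conf/archive/}|" \
>>         "$props"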
>> 
>> Q2) Would the above proposal (adding a few env vars to start.sh) be ideal?
>> If so, I’m happy to add a PR for the code and doc change. Or have others
>> solved this a different way?
>> 
>> 
>> 
>> Best,
>> 
>> Kevin
>> 
