I suppose the feature 'Ability to specify group level flow file concurrency - 
for instance run a single flow file end to end before running another for 
traditional job handling', available from version 1.12 onward, should be helpful 
here (I have not tried it myself yet).

-----Original Message-----
From: Van Autreve Dries <dries.vanautr...@vlaanderen.be>
Sent: Thursday, 1 April 2021 09:36
To: users@nifi.apache.org
Subject: Strict order of flow files in a cluster

Hello all

We recently started using NiFi and we were wondering if strict order of 
processing flow files in a cluster could be guaranteed by NiFi.

One of the use cases is as follows: messages arrive in a specific order, go 
through a simple flow with some basic transformations and are written to the 
destination (usually a relational database). The source of the messages can be 
a database, Kafka queue, … It’s important that messages are written to the 
destination in exactly the same order they arrived at NiFi. The reason is that 
messages could be deltas and we do not want to overwrite newer data with older 
deltas. Moreover we do not always control the message format, hence controlling 
this from the messaging protocol point of view might not be possible.

We did some research in various places but have not found a satisfying answer. 
Our own investigations have revealed that:
- Just running the first processor on the primary node is not enough, even with 
the “single node” load balancing strategy. While testing by stopping and starting 
the primary node, we had some situations where messages got out of order.
- Using the EnforceOrder processor with high timeouts prevented the messages 
from being processed out of order, but each time the primary node changes, manual 
intervention is required to reconfigure the initial order property. Moreover it 
requires that the source system or first processor provides this incrementing 
sequence attribute.
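
To illustrate the second point, here is a minimal sketch of sequence-based 
reordering in general. It is not NiFi's EnforceOrder implementation; the class 
ReorderBuffer, its offer method and the initial_order parameter are hypothetical 
names chosen for this example. It assumes every message carries an incrementing 
sequence number without gaps.

```python
class ReorderBuffer:
    """Releases messages strictly in sequence order.

    Hypothetical helper for illustration only, not NiFi code.
    """

    def __init__(self, initial_order=0):
        # Sequence number of the next message that may be released.
        self.next_seq = initial_order
        # Messages that arrived early, keyed by sequence number.
        self._pending = {}

    def offer(self, seq, payload):
        """Accept one message and return the payloads that can be released now."""
        if seq < self.next_seq or seq in self._pending:
            raise ValueError(f"sequence {seq} was already seen or released")
        self._pending[seq] = payload
        released = []
        # Drain every consecutive message starting at the expected number.
        while self.next_seq in self._pending:
            released.append(self._pending.pop(self.next_seq))
            self.next_seq += 1
        return released


buf = ReorderBuffer(initial_order=1)
print(buf.offer(2, "b"))  # held back, message 1 has not arrived yet
print(buf.offer(3, "c"))  # held back as well
print(buf.offer(1, "a"))  # releases 1, 2 and 3 in order
```

The sketch also shows why the starting point matters: a buffer like this can 
only release anything once it knows which sequence number to expect next, which 
is comparable to the manual reconfiguration described above.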

It also does not seem possible to pin a flow to a specific node. At least we 
have not found this option. We do understand that this would affect scalability 
and availability or failover, but it might be acceptable for those specific cases.

If there are other options we can explore, any input would be helpful.
Or if it’s not (easily) possible with NiFi on its own, it would be good to know!

--
Kind Regards
Dries Van Autreve


(Sorry if this results in a double post. I was not yet subscribed when I 
sent the first post, and my message does not seem to appear on the list...)



Harald Dobbernack

Key-Work Consulting GmbH | Kriegsstr. 100 | 76133 | Karlsruhe | Germany | 
www.key-work.de<https://www.key-work.de> | 
Privacy policy<https://www.key-work.de/de/footer/datenschutz.html>
Phone: +49-721-78203-264 | E-Mail: harald.dobbern...@key-work.de

Key-Work Consulting GmbH, Karlsruhe, HRB 108695, HRG Mannheim
Managing Directors: Andreas Stappert, Tobin Wotring
