[ https://issues.apache.org/jira/browse/NIFI-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683989#comment-16683989 ]
ASF GitHub Bot commented on NIFI-5812:
--------------------------------------

Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/3167

    I did something like that for the Hive 1 version of PutHiveStreaming: the underlying library wasn't thread-safe if you were working on the same table, so I put in a "table lock" so that multiple threads couldn't act on the same table at once.

    QDT doesn't take incoming flow files, so the table name is effectively hard-coded. By adding PrimaryNodeOnly we can guarantee that QDT only has one instance (it is already TriggerSerially, so it can't have multiple threads).

    For GTF, if you use ListDatabaseTables on the primary node only (not sure if we should force that with this annotation or not), then each flow file should have a different table, and with a load-balanced connection (or RPG -> Input Port) each instance of GTF should be working on a different table. The onus is on the user to set Max Concurrent Tasks for GTF to 1 to prevent multi-threaded execution. Perhaps for GTF, instead of forcing PrimaryNodeOnly, we can make it clear in the doc that if there are no incoming connections, it should probably be run on the primary node only.

    Alternatively, maybe during annotation processing we can enforce that processors annotated with PrimaryNodeOnly automatically have an InputRequirement of INPUT_FORBIDDEN? Otherwise flow files can get stalled if they are in the queue on a node that is not the primary.

> Make database processors as 'PrimaryNodeOnly'
> ---------------------------------------------
>
>                 Key: NIFI-5812
>                 URL: https://issues.apache.org/jira/browse/NIFI-5812
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>    Affects Versions: 1.7.0, 1.8.0, 1.7.1
>            Reporter: Sivaprasanna Sethuraman
>            Assignee: Sivaprasanna Sethuraman
>            Priority: Major
>
> With NIFI-543, we have introduced a behavior annotation to mark a particular
> processor to run only on the Primary Node. It is recommended to mark the
> following database-related processors as 'PrimaryNodeOnly':
> * QueryDatabaseTable
> * GenerateTableFetch

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
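
For readers unfamiliar with the "table lock" idea mentioned in the comment above, here is a minimal sketch of one way it could look. It is not the actual PutHiveStreaming code; the class name TableLocks and its methods are hypothetical and exist only for illustration. Note that a lock like this only serializes threads inside a single JVM. It does nothing across cluster nodes, which is the gap PrimaryNodeOnly is meant to close.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical helper (not NiFi code): serializes work per table name within one JVM. */
public class TableLocks {
    // One lock per table name, created lazily on first use.
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Runs the action while holding the lock for the given table and returns its result. */
    public <T> T withTableLock(String tableName, Supplier<T> action) {
        ReentrantLock lock = locks.computeIfAbsent(tableName, k -> new ReentrantLock());
        lock.lock();
        try {
            return action.get();
        } finally {
            // Always release, even if the action throws.
            lock.unlock();
        }
    }

    /** Returns true if some thread currently holds the lock for the given table. */
    public boolean isLocked(String tableName) {
        ReentrantLock lock = locks.get(tableName);
        return lock != null && lock.isLocked();
    }
}
```

Threads working on different tables proceed in parallel, while threads targeting the same table take turns. A processor could wrap its per-table work in withTableLock(tableName, ...) to get that behavior.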