[
https://issues.apache.org/jira/browse/NIFI-14549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tomasz Korniszuk reassigned NIFI-14549:
---------------------------------------
Assignee: Tomasz Korniszuk
> Saving state for ExecuteSQL and ExecuteSQLRecord processor
> ----------------------------------------------------------
>
> Key: NIFI-14549
> URL: https://issues.apache.org/jira/browse/NIFI-14549
> Project: Apache NiFi
> Issue Type: Improvement
> Affects Versions: 2.4.0
> Environment: Docker version: 28.0.4; docker image: apache/nifi:2.4.0;
> Host OS: Debian 12
> Reporter: Andrej
> Assignee: Tomasz Korniszuk
> Priority: Major
>
> Saving state for ExecuteSQL and ExecuteSQLRecord processor:
> It would be much easier to load data incrementally with complex queries if
> the state from the previous run were recorded.
> My example: I need to transfer MS SQL Extended Events data read through the
> system table-valued function sys.fn_xe_file_target_read_file. Since a very
> large amount of data is produced every minute, I need to use the function's
> parameters to query only the data from a specific extended event file,
> starting at a given file offset. I need to remember the last values for the
> next query run.
> The QueryDatabaseTableRecord processor, which records state through
> Maximum-value Columns, would do this; however, it works with subqueries,
> which means that in this case all the data is read every time and only then
> filtered. I cannot afford this approach since there are millions of rows.
>
> Currently I am solving this with a Groovy script to get the state from a JSON
> file --> ExecuteSQLRecord --> Groovy script to get the last record --> write
> to the JSON file. All of this needs to be in a sub-Process Group where I
> allow only one flowfile at a time. This is very complex, error-prone, and
> slow.
>
> So a stateful ExecuteSQLRecord would remove all this trouble.
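
For illustration only, the load-state / query / save-state cycle described above can be sketched in Python. This is not NiFi code and not the reporter's Groovy scripts. The helper names, the JSON layout, the "?" parameter markers, and the assumption that rows arrive in read order are all hypothetical. Only the function name sys.fn_xe_file_target_read_file and its four arguments (path, mdpath, initial_file_name, initial_offset) come from SQL Server. Whether the saved offset is treated as inclusive or exclusive should be checked against the SQL Server documentation.

```python
import json
from pathlib import Path

# Hypothetical state used on the very first run, before anything was saved.
EMPTY_STATE = {"file_name": None, "file_offset": None}


def load_state(path: Path) -> dict:
    """Return the saved position, or an empty state if no file exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return dict(EMPTY_STATE)


def save_state(path: Path, state: dict) -> None:
    """Write to a temporary file, then rename it over the real one,
    so an interrupted write does not leave a truncated state file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state), encoding="utf-8")
    tmp.replace(path)


def build_query(xel_pattern: str, state: dict):
    """Build a parameterized query that resumes from the saved file and offset.
    The '?' markers assume a driver with positional parameters."""
    sql = (
        "SELECT object_name, event_data, file_name, file_offset "
        "FROM sys.fn_xe_file_target_read_file(?, NULL, ?, ?)"
    )
    return sql, (xel_pattern, state["file_name"], state["file_offset"])


def advance_state(state: dict, rows: list) -> dict:
    """Take the position of the last row read (assumes rows are in read order).
    If the query returned nothing, keep the previous state unchanged."""
    if not rows:
        return state
    last = rows[-1]
    return {"file_name": last["file_name"], "file_offset": last["file_offset"]}
```

A stateful processor would in effect run load_state, build_query, the query itself, advance_state, and save_state as one unit, which is what the single-flowfile Process Group emulates today.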