Peter Wicks created NIFI-3484:
---------------------------------

             Summary: GenerateTableFetch Should Allow for Right Boundary
                 Key: NIFI-3484
                 URL: https://issues.apache.org/jira/browse/NIFI-3484
             Project: Apache NiFi
          Issue Type: New Feature
          Components: Core Framework
    Affects Versions: 1.2.0
            Reporter: Peter Wicks
            Assignee: Peter Wicks
            Priority: Minor
             Fix For: 1.2.0, 1.1.0


GenerateTableFetch places no right-hand boundary on the pages of data it 
generates.  This causes problems when a statement says to get the next 1000 
records greater than a specific key, but records are added to the table 
between the time the processor runs and the time the SQL is executed.  As 
a result, the statement pulls in records that did not exist when the processor 
ran.  On the next execution of the processor, these records are pulled in a 
second time.

Example:

Partition Size = 1000
First run (no state): Count(*)=4700 and MAX(ID)=4700.
5 FlowFiles are generated; the last one says to fetch 1000, not 700.  (I don't 
think this is really a bug, just an observation.)

5 FlowFiles are now in the queue to be executed by ExecuteSQL.  Before the 5th 
FlowFile can execute, 400 new rows are added to the table.  When the final SQL 
statement is executed, 300 extra records, with higher ID values, are also 
pulled into NiFi.

Second run (state: ID=4700): Count(*) where ID>4700 = 400 and MAX(ID)=5100.
1 FlowFile is generated, but it includes 300 records already pulled into NiFi.
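
The arithmetic in this example can be checked with a small simulation.  This is 
only a sketch, not NiFi code: it assumes contiguous integer IDs starting at 1 
and models each page as "rows above the stored key, ordered by ID, sliced by 
offset and limit".  The names fetch_page and simulate are made up for 
illustration.

```python
import math

PARTITION_SIZE = 1000


def fetch_page(ids, after_id, offset, limit, max_id=None):
    """Return one page of IDs, ordered ascending, as a paged SELECT would."""
    matching = sorted(
        i for i in ids
        if i > after_id and (max_id is None or i <= max_id)
    )
    return matching[offset:offset + limit]


def simulate(use_right_boundary):
    """Replay the example; return (rows in 5th page, rows fetched twice)."""
    table = set(range(1, 4701))  # first run: COUNT(*)=4700, MAX(ID)=4700
    bound = max(table) if use_right_boundary else None
    offsets = [p * PARTITION_SIZE
               for p in range(math.ceil(len(table) / PARTITION_SIZE))]

    # The first four queued pages execute before any new rows arrive.
    first_run = []
    for offset in offsets[:4]:
        first_run += fetch_page(table, 0, offset, PARTITION_SIZE, bound)

    # 400 rows are inserted before the fifth page executes.
    table |= set(range(4701, 5101))
    fifth_page = fetch_page(table, 0, offsets[4], PARTITION_SIZE, bound)
    first_run += fifth_page

    # Second run: state is ID=4700, so only rows above it are requested.
    second_bound = max(table) if use_right_boundary else None
    second_run = fetch_page(table, 4700, 0, PARTITION_SIZE, second_bound)

    return len(fifth_page), len(set(first_run) & set(second_run))


print(simulate(False))  # (1000, 300)
print(simulate(True))   # (700, 0)
```

Without a right boundary the fifth page returns 1000 rows and 300 of them are 
fetched again on the second run; with the boundary fixed at the MAX(ID) seen 
when the pages were generated, the fifth page returns 700 rows and nothing is 
duplicated.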

The solution is to have an optional property that will let users use the new 
MAX(ID) as a right boundary when generating queries.
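
As a rough sketch of what such a property could change, the function below 
builds paged queries with and without an upper bound.  It is not the 
processor's actual implementation: generate_page_queries, its parameters, the 
table name "orders" and the generic LIMIT/OFFSET syntax are all assumptions 
made for illustration, and real code would need to handle database-specific 
paging and value quoting.

```python
import math


def generate_page_queries(table, column, last_max, new_max, row_count,
                          partition_size, use_right_boundary=False):
    """Build one paged SELECT per partition (hypothetical helper)."""
    conditions = []
    if last_max is not None:
        # Left boundary: only rows newer than the stored state.
        conditions.append(f"{column} > {last_max}")
    if use_right_boundary:
        # Right boundary: ignore rows added after the pages were generated.
        conditions.append(f"{column} <= {new_max}")
    where = f" WHERE {' AND '.join(conditions)}" if conditions else ""
    pages = math.ceil(row_count / partition_size)
    return [
        f"SELECT * FROM {table}{where} ORDER BY {column} "
        f"LIMIT {partition_size} OFFSET {page * partition_size}"
        for page in range(pages)
    ]


# First run from the example above: no state, COUNT(*)=4700, MAX(ID)=4700.
for query in generate_page_queries("orders", "ID", None, 4700, 4700, 1000,
                                   use_right_boundary=True):
    print(query)
```

With the boundary enabled, the last query still asks for 1000 rows at offset 
4000, but the "ID <= 4700" condition means rows inserted afterwards can no 
longer fill the page.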



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
