[ 
https://issues.apache.org/jira/browse/NIFI-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449998#comment-15449998
 ] 

Matt Burgess commented on NIFI-2712:
------------------------------------

The proposed solution is to replace > with >= for all but the first max-value column.
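
As a rough illustration of that rule (a minimal sketch, not the actual NiFi code), the WHERE clause could be assembled as below. The helper name build_max_value_where, the column order (date_created first, then bucket) and the naive literal formatting are all assumptions made for the example; a real processor would need type-aware literals or bound parameters.

```python
def build_max_value_where(max_values):
    """Build a WHERE-clause fragment from ordered (column, last_max) pairs.

    Hypothetical helper for illustration only. The first column is compared
    with '>', every later column with '>=', as proposed in the comment above.
    """
    clauses = []
    for index, (column, last_max) in enumerate(max_values):
        operator = ">" if index == 0 else ">="
        # repr() quotes strings and leaves numbers bare; this is a
        # simplification and is NOT safe SQL literal handling.
        clauses.append(f"{column} {operator} {last_max!r}")
    return " AND ".join(clauses)


if __name__ == "__main__":
    # Assumed last-observed maximum values, in the configured column order.
    state = [("date_created", "2016-08-30 00:00:00"), ("bucket", 42)]
    print("SELECT * FROM my_table WHERE " + build_max_value_where(state))
```

With a single max-value column the sketch produces the same strict > comparison described in the issue; only the additional columns are relaxed to >=.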

> Database Fetch processors' max-value columns don't work as expected
> -------------------------------------------------------------------
>
>                 Key: NIFI-2712
>                 URL: https://issues.apache.org/jira/browse/NIFI-2712
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>
> Currently, for QueryDatabaseTable and GenerateTableFetch, the user can enter 
> any number of maximum-value columns, which are used to generate a SQL query 
> that will fetch all records whose values are greater than the last-observed 
> maximum values for those columns.
> However, this makes multiple max-value columns not very useful, since all of 
> them have to increase in lockstep or records will be lost/skipped. In such 
> a case, any single one of the columns would suffice, making multiple 
> max-value columns useless.
> The more likely use case is that there are multiple columns whose values 
> increase, but at different rates. This is common with very large tables, 
> which might have a "date_created" column and also a "bucket number" column 
> that increases once a day. Queries for a day's worth of data are more 
> efficient if they can be filtered on "bucket" (in this case), then on 
> timestamp. However, the generated SQL queries would have to reflect that 
> "bucket" may remain the same while the timestamp is increasing, but once 
> the bucket value has increased, only the (new) timestamps for that bucket 
> should be fetched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
