Re: [PR] NIFI-14736 Add EOF attribute to ExecuteSQL for the last FlowFile output for a ResultSet [nifi]

via GitHub Thu, 31 Jul 2025 09:14:06 -0700


takraj commented on PR #10102:
URL: https://github.com/apache/nifi/pull/10102#issuecomment-3140550513

@exceptionfactory

> ... what about another method like isAfterLast()? It can depend on the
JDBC implementation, but could be an option to consider.

This sounds very risky, since it depends on the driver implementation. And
there are also other concerns:

*Concern 1:* The result set is driven by the `SqlWriter`, like so:
`sqlWriter.writeResultSet(resultSet, out, getLogger(), null)`. This means that
cursor operations are delegated to another class, including end of resultset
check. This is not the biggest problem, but it might complicate things further.

*Concern 2:* I'm not familiar with this entirely, but according to the
current implementation, the result set might be fragmented in some way. If we
reach the 'end' of the result set, there is a logic to get further results,
which can mean we might or might not create further Flow Files. And moreover,
at this point we might have already sent out the last Flow File. So this is
something, that is pretty difficult to tackle with.

Source:
https://github.com/apache/nifi/blob/d9b4984b1270f7659660e0ae58a52afe6a44f7e3/nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/AbstractExecuteSQL.java#L444

```java
// are there anymore result sets?
try {
hasResults = st.getMoreResults(Statement.CLOSE_CURRENT_RESULT);
hasUpdateCount = st.getUpdateCount() != -1;
} catch (SQLException ex) {
hasResults = false;
hasUpdateCount = false;
}
```

Meanwhile, I have realized that, instead of the new EOF attribute, we could
reuse the already existing `fragment.count`. The documentation says this is
missing only in case of a set `Output Batch Size`. But with my change here, we
could add this attribute to the FlowFiles in the very last batch. Practically
meaning that at least the last FlowFile will contain information about the
total number of fragments.

Now, since these `fragment.*` attributes are primarily for `Merge*`
processors, if those can work with only one FlowFile having this information,
we have already achieved our goal: The FlowFiles from `ExecuteSQL*` can be
merged into a single FlowFile, even if `Output Batch Size` is set.

And this scenario is already supported by `MergeContent`, since its
documentation says:

`fragment.count`
----------------------
> Applicable only if the <Merge Strategy> property is set to Defragment.
This attribute indicates how many FlowFiles should be expected in the given
bundle. **At least one FlowFile must have this attribute** in the bundle. If
multiple FlowFiles contain the "fragment.count" attribute in a given bundle,
all must have the same value.

Unfortunately `MergeRecord` doesn't, but it also could be an easy task to
add.

This approach would be a compatible way to achieve what this ticket is about.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] NIFI-14736 Add EOF attribute to ExecuteSQL for the last FlowFile output for a ResultSet [nifi]

Reply via email to