takraj commented on PR #10102: URL: https://github.com/apache/nifi/pull/10102#issuecomment-3140550513
@exceptionfactory > ... what about another method like isAfterLast()? It can depend on the JDBC implementation, but could be an option to consider. This sounds very risky, since it depends on the driver implementation. And there are also other concerns: *Concern 1:* The result set is driven by the `SqlWriter`, like so: `sqlWriter.writeResultSet(resultSet, out, getLogger(), null)`. This means that cursor operations are delegated to another class, including end of resultset check. This is not the biggest problem, but it might complicate things further. *Concern 2:* I'm not familiar with this entirely, but according to the current implementation, the result set might be fragmented in some way. If we reach the 'end' of the result set, there is a logic to get further results, which can mean we might or might not create further Flow Files. And moreover, at this point we might have already sent out the last Flow File. So this is something, that is pretty difficult to tackle with. Source: https://github.com/apache/nifi/blob/d9b4984b1270f7659660e0ae58a52afe6a44f7e3/nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/AbstractExecuteSQL.java#L444 ```java // are there anymore result sets? try { hasResults = st.getMoreResults(Statement.CLOSE_CURRENT_RESULT); hasUpdateCount = st.getUpdateCount() != -1; } catch (SQLException ex) { hasResults = false; hasUpdateCount = false; } ``` Meanwhile, I have realized that, instead of the new EOF attribute, we could reuse the already existing `fragment.count`. The documentation says this is missing only in case of a set `Output Batch Size`. But with my change here, we could add this attribute to the FlowFiles in the very last batch. Practically meaning that at least the last FlowFile will contain information about the total number of fragments. Now, since these `fragment.*` attributes are primarily for `Merge*` processors, if those can work with only one FlowFile having this information, we have already achieved our goal: The FlowFiles from `ExecuteSQL*` can be merged into a single FlowFile, even if `Output Batch Size` is set. And this scenario is already supported by `MergeContent`, since its documentation says: `fragment.count` ---------------------- > Applicable only if the <Merge Strategy> property is set to Defragment. This attribute indicates how many FlowFiles should be expected in the given bundle. **At least one FlowFile must have this attribute** in the bundle. If multiple FlowFiles contain the "fragment.count" attribute in a given bundle, all must have the same value. Unfortunately `MergeRecord` doesn't, but it also could be an easy task to add. This approach would be a compatible way to achieve what this ticket is about. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
