[ https://issues.apache.org/jira/browse/ARROW-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534011#comment-17534011 ]
Todd Farmer commented on ARROW-16035: ------------------------------------- [~jswenson] : Thanks for your review and comments! Here's my thinking: I understand the comment about ResultSet.isLast() being optional only for TYPE_FORWARD_ONLY ResultSets. I see the same note about optional support in the [documentation for isAfterLast()|https://docs.oracle.com/en/java/javase/11/docs/api/java.sql/java/sql/ResultSet.html#isAfterLast()]: {code:java} boolean isAfterLast() throws SQLException Retrieves whether the cursor is after the last row in this ResultSet object. Note:Support for the isAfterLast method is optional for ResultSets with a result set type of TYPE_FORWARD_ONLY {code} While it's possible that a JDBC driver offers support for one of the above methods, but not the other, I considered them effectively equivalent for our purposes. Both are optional in the same context, one is potentially expensive, the other ambiguous in the case of empty results. In the event of a JDBC driver electing to not support these methods, there is no mechanism prior to the first time the underlying ResultSet is read to assess whether an empty ResultSet has been supplied - we literally have to call ResultSet.next() to see what gets returned. That doesn't happen during initialization of the ArrowVectorIterator object - it only happens when ArrowVectorIterator.next() is called. That effectively means that there is no way to support ArrowVectorIterator.hasNext() without first calling ArrowVectorIterator.next() - at least in the context where the JDBC driver vendor has elected to not implement the optional methods. We could elect to have ArrowVectorIterator.hasNext() return true until ArrowVectorIterator.next() has been called, and rely entirely on the added "readComplete" variable to indicate when the ResultSet has been fully read or not. That would result in the following strange behavior, though: {code:java} ArrowVectorIterator iter = JdbcToArrow.sqlToArrowVectorIterator(emptyResultSet, config); boolean willBeTrue = iter.hasNext(); // always true until next() is called VectorSchemaRoot root = iter.next(); // reads ResultSet for first time, discovers it's empty int willBeZero = root.getRowCount(); // no rows! boolean willBeFalse = iter.hasNext(); // now is false after reading ResultSet{code} Effectively, this would make ArrowVectorIterator _always_ return at least one VectorSchemaRoot, even when the underlying ResultSet is empty. I'm not entirely clear whether this is acceptable. While not really related to your overall point, I'm having difficulty identifying a context in which ResultSet.isLast() would be expensive, while ResultSet.isAfterLast() would not be. > [Java] Arrow to JDBC ArrowVectorIterator with does not terminate with empty > result set > -------------------------------------------------------------------------------------- > > Key: ARROW-16035 > URL: https://issues.apache.org/jira/browse/ARROW-16035 > Project: Apache Arrow > Issue Type: Bug > Components: Java > Affects Versions: 7.0.0 > Reporter: Jonathan Swenson > Assignee: Todd Farmer > Priority: Major > Labels: pull-request-available > Fix For: 8.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Using an ArrowVectorIterator built from a JDBC Result Set that is empty > causes the iterator to never terminate. > {code:java} > ArrowVectorIterator iterator = > JdbcToArrow.sqlToArrowVectorIterator(conn.createStatement() > .executeQuery("select 1 from table1 where false"), config); {code} > > It appears as though this is due to the implementation of the > [hasNext()|https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowVectorIterator.java#L158] > method. > The expectation is that the `isAfterLast()` method on a JDBC result set > return true when the result set is empty. However, according to the [JDBC > documentation|https://docs.oracle.com/en/java/javase/11/docs/api/java.sql/java/sql/ResultSet.html#isAfterLast()] > it will always return false when the result set is empty. > {quote}Returns:{{{}true{}}} if the cursor is after the last row; {{false}} if > the cursor is at any other position or the result set contains no rows > {quote} > -- This message was sent by Atlassian Jira (v8.20.7#820007)