[GitHub] [spark] juliuszsompolski commented on a change in pull request #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request

GitBox Thu, 10 Oct 2019 02:16:27 -0700

juliuszsompolski commented on a change in pull request #26014: 
[SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request
URL: https://github.com/apache/spark/pull/26014#discussion_r333412752


 ##########
 File path: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
 ##########
 @@ -684,6 +685,92 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
       assert(e.getMessage.contains("org.apache.spark.sql.catalyst.parser.ParseException"))
     }
   }
+
+  test("ThriftCLIService FetchResults FETCH_FIRST, FETCH_NEXT, FETCH_PRIOR") {
+    def checkResult(rows: RowSet, start: Long, end: Long): Unit = {
+      assert(rows.getStartOffset() == start)
+      assert(rows.numRows() == end - start)
+      rows.iterator.asScala.zip((start until end).iterator).foreach { case (row, v) =>
+        assert(row(0).asInstanceOf[Long] === v)
+      }
+    }
+
+    withCLIServiceClient { client =>
+      val user = System.getProperty("user.name")
+      val sessionHandle = client.openSession(user, "")
+
+      val confOverlay = new java.util.HashMap[java.lang.String, java.lang.String]
+      val operationHandle = client.executeStatement(
+        sessionHandle,
+        "SELECT * FROM range(10)",
+        confOverlay) // 10 rows result with sequence 0, 1, 2, ..., 9
+      var rows: RowSet = null
+
+      // Fetch 5 rows with FETCH_NEXT
+      rows = client.fetchResults(
+        operationHandle, FetchOrientation.FETCH_NEXT, 5, FetchType.QUERY_OUTPUT)
+      checkResult(rows, 0, 5) // fetched [0, 5)
+
+      // Fetch another 2 rows with FETCH_NEXT
+      rows = client.fetchResults(
+        operationHandle, FetchOrientation.FETCH_NEXT, 2, FetchType.QUERY_OUTPUT)
+      checkResult(rows, 5, 7) // fetched [5, 7)
+
+      // FETCH_PRIOR 3 rows
 
 Review comment:
   @wangyum this is expected.
   `FETCH_PRIOR` in the Thriftserver is not the same as FETCH PRIOR on the
   client's cursor.
   Fetch in the Thriftserver operates on batches of rows; the client's cursor
   caches a batch and returns results row by row. Let's say it's batching by
   maxRows=100 and the cursor has just returned row 99. The client holds rows
   [0, 100) from the first batch and is at row 99. The next FETCH NEXT on the
   cursor has to call `FETCH_NEXT` on the Thriftserver to get the batch of rows
   [100, 200), and returns row 100 to the client. Another FETCH NEXT returns row
   101 from the cached batch without calling `FETCH_NEXT` on the Thriftserver,
   and one more returns row 102. Then FETCH PRIOR returns row 101 again, and the
   next FETCH PRIOR returns row 100. Only then does another FETCH PRIOR need row
   99, which is not in the cursor's current batch, so it has to call
   `FETCH_PRIOR` on the Thriftserver to get rows [0, 100) again.
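
   To make the walk-through above concrete, here is a minimal sketch of the two
   levels. `ToyServer` and `ToyCursor` are hypothetical names made up for this
   illustration; they are not the Hive JDBC driver or the Thriftserver code.
   The rule that a server-side prior fetch starts `maxRows` before the start of
   the last returned batch is an assumption of the sketch, and bounds handling
   at either end of the result is left out.

```scala
// Toy stand-in for the server side: serves rows 0 until total in batches.
final class ToyServer(total: Long) {
  private var batchStart = 0L // start offset of the last batch returned
  private var batchEnd = 0L   // end offset (exclusive) of the last batch returned
  var calls = 0               // number of batch fetches served so far

  // Batch-level FETCH_NEXT: continue after the last batch.
  def fetchNext(maxRows: Int): (Long, Vector[Long]) = {
    calls += 1
    batchStart = batchEnd
    batchEnd = math.min(batchStart + maxRows, total)
    (batchStart, (batchStart until batchEnd).toVector)
  }

  // Batch-level FETCH_PRIOR (assumed rule): step back maxRows from the
  // start of the last batch, clamped at offset 0.
  def fetchPrior(maxRows: Int): (Long, Vector[Long]) = {
    calls += 1
    batchStart = math.max(batchStart - maxRows, 0L)
    batchEnd = math.min(batchStart + maxRows, total)
    (batchStart, (batchStart until batchEnd).toVector)
  }
}

// Toy stand-in for the client cursor: caches one batch, moves row by row.
// No bounds checks: stepping before row 0 or past the last row will throw.
final class ToyCursor(server: ToyServer, maxRows: Int) {
  private var start = 0L                  // offset of the first cached row
  private var batch = Vector.empty[Long]  // currently cached batch
  private var pos = -1L                   // absolute row index; -1 = before first

  private def load(fetched: (Long, Vector[Long])): Unit = {
    start = fetched._1
    batch = fetched._2
  }

  // Row-level FETCH NEXT: only calls the server when the cache runs out.
  def next(): Long = {
    pos += 1
    if (pos >= start + batch.size) load(server.fetchNext(maxRows))
    batch((pos - start).toInt)
  }

  // Row-level FETCH PRIOR: only calls the server when stepping before the cache.
  def prior(): Long = {
    pos -= 1
    if (pos < start) load(server.fetchPrior(maxRows))
    batch((pos - start).toInt)
  }
}

object FetchPriorSketch {
  def main(args: Array[String]): Unit = {
    val server = new ToyServer(300L)
    val cursor = new ToyCursor(server, 100)

    (0 until 100).foreach(_ => cursor.next()) // rows 0..99 from batch [0, 100)
    assert(server.calls == 1)

    assert(cursor.next() == 100L)  // needs batch [100, 200)
    assert(cursor.next() == 101L)  // served from the cache
    assert(cursor.next() == 102L)
    assert(cursor.prior() == 101L) // still in the cache
    assert(cursor.prior() == 100L)
    assert(server.calls == 2)

    assert(cursor.prior() == 99L)  // not cached: batch-level prior fetch
    assert(server.calls == 3)
  }
}
```

   In this sketch six row-level moves after row 99 cost only two batch-level
   calls, which is why a row-level FETCH PRIOR and a batch-level `FETCH_PRIOR`
   cannot be expected to move by the same amount.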

