[
https://issues.apache.org/jira/browse/HIVE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bennie Schut updated HIVE-1815:
-------------------------------
Attachment: HIVE-1815.1.patch.txt
This is the simplest implementation I could come up with. I changed fetchOne to
fetchN: each next() call returns a row from the fetched list until the list is
empty, and then another fetchN is issued. We've used this for a week, and the
performance increase on large result sets is significant. The fetchN could also
run on a separate thread to keep the queue full, but that's a bit more work for
only a little more gain.
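For readers who want to see the shape of the change, here is a minimal sketch of
the buffering described above. It is not the code from the attached patch:
BatchSource, BatchedRows and getRow are hypothetical names made up for this
illustration, rows are simplified to plain strings, and the server call is
assumed to return an empty list once the result set is exhausted.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the server call; NOT the real Hive client API.
interface BatchSource {
    // Returns up to n rows; an empty list means there are no more rows (assumption).
    List<String> fetchN(int n);
}

// Sketch of a result set that fetches rows in batches instead of one at a time.
class BatchedRows {
    private final BatchSource source;
    private final Deque<String> buffer = new ArrayDeque<>();
    private int fetchSize;
    private boolean exhausted = false;
    private String current;

    BatchedRows(BatchSource source, int fetchSize) {
        this.source = source;
        setFetchSize(fetchSize);
    }

    // This sketch rejects non-positive sizes to keep the logic simple.
    void setFetchSize(int rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("fetch size must be positive");
        }
        this.fetchSize = rows;
    }

    int getFetchSize() {
        return fetchSize;
    }

    // Advances to the next row; goes back to the server only when the buffer is empty.
    boolean next() {
        if (buffer.isEmpty() && !exhausted) {
            List<String> batch = source.fetchN(fetchSize);
            if (batch.isEmpty()) {
                exhausted = true;
            } else {
                buffer.addAll(batch);
            }
        }
        if (buffer.isEmpty()) {
            current = null;
            return false;
        }
        current = buffer.poll();
        return true;
    }

    // Returns the row that the last successful next() call moved to.
    String getRow() {
        return current;
    }
}
```

With a fetch size of N, the number of round trips to the server drops from
roughly one per row to roughly one per N rows, which is where the gain on large
result sets would come from.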
I've added one small test that calls setFetchSize and getFetchSize. The existing
JDBC tests should all pass as before, since the functionality doesn't change.
> The class HiveResultSet should implement batch fetching.
> --------------------------------------------------------
>
> Key: HIVE-1815
> URL: https://issues.apache.org/jira/browse/HIVE-1815
> Project: Hive
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 0.5.0
> Environment: Custom Java application using the Hive JDBC driver to
> connect to a Hive server, execute a Hive query and process the results.
> Reporter: Guy le Mar
> Attachments: HIVE-1815.1.patch.txt
>
>
> When using the Hive JDBC driver, you can execute a Hive query and obtain a
> HiveResultSet instance that contains the results of the query.
> Unfortunately, HiveResultSet can then only fetch a single row of these
> results from the Hive server at a time. As a consequence, it's extremely slow
> to fetch a resultset of anything other than a trivial size.
> It would be nice for the HiveResultSet to be able to fetch N rows from the
> server at a time, so that performance is suitable to support applications
> that provide human interaction.
> (From memory, I think it took me around 20 minutes to fetch 4000 rows.)