Actually, the log line should be something like:
getHandleIdentifier()=hfhkjhfjhkjfh-dsdsad-sdsd--dsada: fetchResults()
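In case it helps with the testing, here is a minimal sketch of the JDBC client side of the scenario quoted below. It is not Emil's actual app. The class name HiveFetchSketch, the helper names, and the command-line arguments (a jdbc:hive2:// URL and a table name) are hypothetical, and it assumes the Hive JDBC driver is on the classpath and that HiveServer2 is in use. The roundTrips helper only does the arithmetic: 30,000,000 rows at a fetch size of 30,000 is 1000 batches, so you would expect at least that many fetchResults() lines in the log if the fetch size is honored.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class HiveFetchSketch {

    // Number of batches needed to pull totalRows at fetchSize rows per batch
    // (ceiling division). Pure arithmetic, no Hive connection involved.
    static long roundTrips(long totalRows, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        return (totalRows + fetchSize - 1) / fetchSize;
    }

    // Reads every row of the table and returns the row count.
    // setFetchSize is a hint to the driver for how many rows to pull per batch.
    static long streamTable(String jdbcUrl, String table, int fetchSize) throws SQLException {
        long rows = 0;
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(fetchSize);
            // Sketch only: the table name is concatenated directly, so pass trusted input.
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM " + table)) {
                while (rs.next()) {
                    rows++;
                    if (rows % fetchSize == 0) {
                        System.out.println("rows read so far: " + rows);
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        // Expected batch count for the numbers in this thread.
        System.out.println(roundTrips(30_000_000L, 30_000));
        // Hypothetical usage: java HiveFetchSketch jdbc:hive2://<host>:<port>/<db> <table>
        if (args.length == 2) {
            System.out.println("total rows: " + streamTable(args[0], args[1], 30_000));
        }
    }
}
```

Comparing the "rows read so far" output on the client with the fetchResults() lines in the HiveServer2 log should show whether the batches really arrive 30,000 rows at a time.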

On Wed, Aug 19, 2015 at 3:49 PM, Prem Yadav <[email protected]> wrote:

> Hi Emil,
> For either of the queries, there will be no MapReduce job. The query
> engine understands that in both cases it need not do any computation and
> just needs to fetch all the data from the files.
>
> The fetch size should be honored in both cases. Hope you are using
> HiveServer2.
> You can try connections using Excel and Cloudera's ODBC driver with the
> required parameters for your testing. For each batch that Hive returns, you
> should be able to see something like this in the Hive log:
> returning results for id <hash>
>
> On Wed, Aug 19, 2015 at 2:54 PM, Emil Berglind <[email protected]>
> wrote:
>
>> I have a small Java app that I wrote that uses JDBC to run a Hive query.
>> The Hive table that I'm running it against has 30+ million rows, and I want
>> to pull them all back to verify the data. If I run a simple "SELECT * FROM
>> <table>" and set a fetch size of 30,000, then the fetch size is not honored
>> and it seems to want to bring back all 30+ million rows at once, which is
>> definitely not going to work. If I set a LIMIT on the SQL, like "SELECT *
>> FROM <table> LIMIT 9999999", then it honors the fetch size just fine.
>> However, when I set the LIMIT on there, it does not run as a MapReduce job
>> but rather seems to stream the data back. Is this how it's supposed to
>> work? I'm new to the Hadoop ecosystem and I'm really just trying to figure
>> out the best way to bring this data back in chunks. Maybe I'm going
>> about this all wrong?
