Re: Mechanism when doing a select *

2016-03-30 Thread Tale Firefly
Hello guys! Just a quick thank you again for your answers on this topic! I noticed that when a job is launched (if the table is bigger than hive.fetch.task.conversion.threshold), temporary files seem to be created in HDFS (in /tmp). If I understood correctly, the select * is divided into

Re: Mechanism when doing a select *

2016-03-22 Thread Tale Firefly
Hello everyone. Thanks for your answers. I'm gonna test this. Best regards. Tale On Mon, Mar 21, 2016 at 10:06 PM, Prasanth Jayachandran < pjayachand...@hortonworks.com> wrote: > Hi > > Simple select * query launches a job when the input size is >1Gb by > default. Two configs that determine

Re: Mechanism when doing a select *

2016-03-21 Thread Prasanth Jayachandran
Hi, a simple select * query launches a job when the input size is >1 GB by default. Two configs determine whether a job has to be launched: hive.fetch.task.conversion and hive.fetch.task.conversion.threshold. Is your table size >1 GB (hive.fetch.task.conversion.threshold)? You can see that from “describ
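
A minimal sketch of how these two settings could be inspected and adjusted from a Hive session (beeline or the Hive CLI). The property names are the ones mentioned above; the values `more` and `2147483648` are illustrative assumptions, not recommendations from this thread, and `my_table` is the table name used in the original question.

```sql
-- Print the current values of the two settings for this session.
SET hive.fetch.task.conversion;
SET hive.fetch.task.conversion.threshold;

-- Assumption for illustration: allow fetch-task conversion for simple
-- select/filter/limit queries (accepted values are none, minimal, more).
SET hive.fetch.task.conversion=more;

-- Assumption for illustration: raise the threshold to 2 GB (value in bytes),
-- so inputs up to that size are read by a fetch task instead of a job.
SET hive.fetch.task.conversion.threshold=2147483648;

-- With the input below the threshold, this should not launch a job.
SELECT * FROM my_table;
```

These are session-level settings, so they only affect queries run on the same connection afterwards.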

Re: Mechanism when doing a select *

2016-03-21 Thread Gopal Vijayaraghavan
>> Or does all the data go directly from the datanodes to my client ? Not yet. https://issues.apache.org/jira/browse/HIVE-11527 Cheers, Gopal

Re: Mechanism when doing a select *

2016-03-21 Thread Mich Talebzadeh
You are correct. It should not. There is nothing to optimise here. 0: jdbc:hive2://rhes564:10010/default> select * from countries; OK INFO : Compiling command(queryId=hduser_20160321162726_7efeecbb-46ee-431f-9095-f67e0602b318): select * from countries INFO : Semantic Analysis Completed INFO :

Re: Mechanism when doing a select *

2016-03-21 Thread Tale Firefly
Oh my bad, even with the execution engine set to MR, my query turns into an MR job. I'm gonna run more tests with Hive CLI, beeline, and Excel to check if this behaviour is linked to the ODBC driver. BR. Tale. On Mon, Mar 21, 2016 at 4:56 PM, Tale Firefly wrote: > Hm, I need to check if st

Re: Mechanism when doing a select *

2016-03-21 Thread Tale Firefly
Hm, I need to check if statistics are enabled for this table and up-to-date. I'm going to check this. I don't know if I was clear in my previous statement, but I am surprised that a job is launched just by doing a select * from my_table. I thought a select * from my_table was not running any MR jo

Re: Mechanism when doing a select *

2016-03-21 Thread Mich Talebzadeh
Well, I use Spark as the engine. Now the question is: have you updated statistics on the ORC table? HTH Dr Mich Talebzadeh

Re: Mechanism when doing a select *

2016-03-21 Thread Tale Firefly
Re. Thank you for your answer. I'm using Tez as the execution engine for this query, and it launches a job on YARN. Do you know why it launches a job just for a select when I use Tez as the execution engine? BR. Tale On Mon, Mar 21, 2016 at 4:17 PM, Mich Talebzadeh wrote: > Hi, > > Your query is a ta

Re: Mechanism when doing a select *

2016-03-21 Thread Mich Talebzadeh
Hi, Your query is a table-level query that covers all rows in the table. Using ODBC, you are connecting to HiveServer2, which runs on a given port. Depending on the version of Hive you are running, Hive under the bonnet is most likely using MapReduce as the execution engine. Data has to be coll

Mechanism when doing a select *

2016-03-21 Thread Tale Firefly
Hello guys! I'm trying to understand the mechanism for a simple query select * from my_table when using HiveServer2. I'm using the Hortonworks ODBC Driver for HiveServer2. I just do a select * from my_table. my_table is an ORC table based on files divided into blocks located on all my datanodes.