1) Cost-based optimization in Hive
https://cwiki.apache.org/confluence/display/Hive/Cost-based+optimization+in+Hive
Calcite is an open source, Apache Licensed, query planning and execution framework. Many pieces of Calcite are derived from the Eigenbase Project. Calcite has an optional JDBC server, a query parser and validator, a query optimizer and pluggable data source adapters. One of the available Calcite optimizers is a cost-based optimizer based on the Volcano paper.

2) The Volcano Optimizer Generator: Extensibility and Efficient Search
Goetz Graefe, Portland State University
William J. McKenna, University of Colorado at Boulder
From Proc. IEEE Conf. on Data Eng., Vienna, April 1993, p. 209.

2.2. Optimizer Generator Input and Optimizer Operation
…
The user queries to be optimized by a generated optimizer are specified as an algebra expression (tree) of logical operators. The translation from a user interface into a logical algebra expression must be performed by the parser and is not discussed here.
…

3) Abstract syntax tree
From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Abstract_syntax_tree

In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language.

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Tuesday, June 14, 2016 7:58 PM
To: user <user@hive.apache.org>
Subject: Re: Optimized Hive query

Amazing. That is the first time I have heard that an optimizer does not have the concept of a flattened query. So what is the definition of a syntax tree? Are you referring to the industry notation "access path"? This is the first time I have heard of such a notation called a syntax tree.
Are you stating that there is some explanation of the optimizer's "access path" that is produced independently of the optimizer and is called a syntax tree?

Dr Mich Talebzadeh
LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On 14 June 2016 at 17:46, Markovitz, Dudu <dmarkov...@paypal.com> wrote:

It’s not the query that is being optimized but the syntax tree that is created from the query (execute “explain extended select …”).
At no point do we have a “flattened query”.

Dudu

From: Aviral Agarwal [mailto:aviral12...@gmail.com]
Sent: Tuesday, June 14, 2016 10:37 AM
To: user@hive.apache.org
Subject: Re: Optimized Hive query

Hi,

Thanks for the replies. I already knew that the optimizer does that. My use case is a bit different, though. I want to display the flattened query back to the user, so I was hoping to use the internal Hive CBO to somehow change the AST generated for the query.

Thanks,
Aviral

On Tue, Jun 14, 2016 at 12:42 PM, Gopal Vijayaraghavan <gop...@apache.org> wrote:

> You can see that you get identical execution plans for the nested query
> and the flattened one.

It wasn't always that way, though. Back when I started with Hive, before Stinger, it didn't have the identity project remover.

To know if your version has this fix, try looking at

hive> set hive.optimize.remove.identity.project;

Cheers,
Gopal
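
To make the point in this thread concrete (the optimizer rewrites a tree built from the query, not the query text), here is a toy sketch in Python. It is not Hive or Calcite code: the `Scan` and `Project` classes, the table `t` and its columns are all hypothetical, and the rewrite is only a rough illustration of what an "identity project remover" might do, assuming a projection counts as an identity when it emits exactly its child's columns in the same order.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Scan:
    """Hypothetical leaf node: read every column of a table."""
    table: str
    columns: Tuple[str, ...]


@dataclass(frozen=True)
class Project:
    """Hypothetical node: keep only the listed columns of its child."""
    columns: Tuple[str, ...]
    child: "Node"


Node = Union[Scan, Project]


def remove_identity_projects(node: Node) -> Node:
    """Drop every Project whose output equals its child's output."""
    if isinstance(node, Scan):
        return node
    child = remove_identity_projects(node.child)
    if node.columns == child.columns:
        # This projection changes nothing, so the child can replace it.
        return child
    return Project(node.columns, child)


table_t = Scan("t", ("a", "b", "c"))

# Tree for: SELECT a, b FROM (SELECT a, b FROM t) s
nested = Project(("a", "b"), Project(("a", "b"), table_t))

# Tree for: SELECT a, b FROM t
flat = Project(("a", "b"), table_t)

# Both trees collapse to the same plan after the rewrite.
print(remove_identity_projects(nested) == remove_identity_projects(flat))
```

In this toy model the nested and the flat query end up as the same tree, which matches the observation quoted above that both forms yield identical execution plans. Turning such a tree back into SQL text for display would be a separate step that the sketch does not attempt.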