Hi all,
I am running the TPC-H [1] queries on Hive and I need help understanding
whether Hive executes some stages locally. TPC-H Query 16 [2] is translated
into three HiveQL queries, and the EXPLAIN [3] output of each shows that the
first query has 8 stages, the second has 6, and the last has 4 (18 in
total). However, only 5 stages were submitted to Hadoop.
I think Hive does not submit some stages to Hadoop because these stages are
"internal" Hive operations like renaming tables, but I am not sure.
Would you please help me understand what Hive does internally with the
stages? Does Hive execute some stages locally/at the master node? Why are
some stages not sent to Hadoop?
Thanks in advance,
Edson Ramiro
[1] https://issues.apache.org/jira/browse/HIVE-600
[2] http://www.inf.ufpr.br/erlfilho/q16_parts_supplier_relationship.hive.txt
[3] http://www.inf.ufpr.br/erlfilho/q16_parts_supplier_relationship.explain.txt