Hi All,

We are starting to migrate our data to the Hadoop platform, hoping to use 'Big Data' technologies to improve our business.
We are new to the area and want to get some help from you. Currently all our data is loaded into Hive, and some complicated SQL queries are run daily. We want to improve the performance of these queries and have two options at hand:

a. turn on the 'Hive on Spark' feature and run the HQL statements, or
b. run those query statements with Spark SQL.

What is the difference between these options?

Another question: there is a Hive setting, 'hive.optimize.ppd', that enables the 'predicate pushdown' query optimization. Is there an equivalent option in Spark SQL, or does the same setting also work for Spark SQL?

Thanks in advance,
Boying

This email message may contain confidential and/or privileged information. If you are not the intended recipient, please do not read, save, forward, disclose or copy the contents of this email or open any file attached to this email. We will be grateful if you could advise the sender immediately by replying to this email, and delete this email and any attachment or links to this email completely and immediately from your computer system.
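P.S. For reference, here is the Hive setting we use today, plus the Spark SQL properties that look related to us. The Spark property names below are only our best guess at equivalents (they control pushdown per file format), not something we have confirmed:

```sql
-- Hive: predicate pushdown is controlled by hive.optimize.ppd (defaults to true)
SET hive.optimize.ppd=true;

-- Spark SQL: pushdown appears to be configured per storage format instead;
-- as far as we can tell, both of these default to true
SET spark.sql.parquet.filterPushdown=true;
SET spark.sql.orc.filterPushdown=true;
```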