Re: Hive on Spark performance

ws Mon, 14 Mar 2016 07:26:34 -0700

Hive 1.2.1.2.3.4.0-3485
Spark 1.5.2
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
###
SELECT  f.description, f.item_number,
        sum(f.df_a * (select count(1) from e.mv_A_h_a where hb_h_name = r.h_id)) as df_a
FROM    e.eng_fac_atl_sc_bf_qty f, wv_ATL_2_qty_df_rates r
where   f.item_number NOT LIKE 'HR%'
  AND   f.item_number NOT LIKE 'UG%'
  AND   f.item_number NOT LIKE 'DEV%'
group by f.description, f.item_number
###
This query works fine in Oracle but not in Hive or Spark. So the problem is the
"sum(f.df_a * (select count(1) from e.mv_A_h_a where hb_h_name = r.h_id)) as
df_a" field.
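
As far as I know, Hive 1.2.x and Spark SQL 1.5.x accept subqueries only in the
FROM clause (and, in Hive, in WHERE with IN/EXISTS), not as a scalar subquery
in the SELECT list. One possible workaround is to pre-aggregate the counts per
hb_h_name in a derived table and join it in. This is only a sketch and has not
been run on either engine. It keeps the original join between f and r without
a join condition (a cross join), which is an assumption carried over from the
query above; the alias "c" and the column name "cnt" are made up for
illustration.

```sql
SELECT  f.description,
        f.item_number,
        -- COALESCE mirrors count(1) returning 0 when no row matches r.h_id
        SUM(f.df_a * COALESCE(c.cnt, 0)) AS df_a
FROM    e.eng_fac_atl_sc_bf_qty f
        -- no join condition between f and r in the original query
        CROSS JOIN wv_ATL_2_qty_df_rates r
        -- one row per hb_h_name, replacing the correlated count(1)
        LEFT OUTER JOIN (
            SELECT   hb_h_name, COUNT(1) AS cnt
            FROM     e.mv_A_h_a
            GROUP BY hb_h_name
        ) c ON c.hb_h_name = r.h_id
WHERE   f.item_number NOT LIKE 'HR%'
  AND   f.item_number NOT LIKE 'UG%'
  AND   f.item_number NOT LIKE 'DEV%'
GROUP BY f.description, f.item_number
```

If Hive is running in strict mode, the cross join may be rejected, so a real
join condition between f and r (if one exists) would be needed there.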


Thanks,
Wlodek
-- 

    On Sunday, March 13, 2016 7:36 PM, Mich Talebzadeh 
<[email protected]> wrote:
 

 It depends on the version of Hive running on the Spark engine.
As far as I am aware, the latest version of Hive that I am using (Hive 2) has
improvements over the previous versions of Hive (0.14, 1.2.1) on the Spark
engine.
As of today I have managed to use Hive 2.0 on Spark version 1.3.1. So it is not 
the latest Spark but it is pretty good.
What specific concerns do you have in mind?
HTH

Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
On 13 March 2016 at 23:27, sjayatheertha <[email protected]> wrote:

Just curious if you could share your experience on the performance of spark in 
your company? How much data do you process? And what's the latency you are 
getting with spark engine?

Vidya
