Thanks a lot for your reply.
We tried running the SQL on Kettle, on Hive, and on Spark (through 
HiveContext) respectively; the job seems to freeze and never finishes.
The query has to read different columns from each of the 6 tables to get 
specific information, then do some simple calculations before producing the 
output. Most of the SQL consists of join operations.
Best wishes!

 

    On Monday, July 18, 2016 6:24 PM, Chanh Le <giaosu...@gmail.com> wrote:
 

 Hi,
What about the network (bandwidth) between Hive and Spark? Did it run in 
Hive before you moved to Spark? Because it's complex, you can use something 
like the EXPLAIN command to show what is going on.
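
A minimal sketch of the EXPLAIN suggestion, assuming the query is submitted through a HiveContext (or any object with a compatible sql() method). The helper names, the table names and the column names are all hypothetical placeholders, not the real six-table query from this thread.

```python
def explain_statement(query, extended=False):
    """Prefix a SQL query with EXPLAIN (or EXPLAIN EXTENDED)."""
    keyword = "EXPLAIN EXTENDED" if extended else "EXPLAIN"
    # Drop surrounding whitespace and a trailing semicolon, if any.
    return keyword + " " + query.strip().rstrip(";").rstrip()


def show_plan(sql_context, query, extended=True):
    """Run the EXPLAIN statement and print the plan text row by row.

    sql_context is assumed to behave like a HiveContext: sql() returns
    a DataFrame whose collect() yields rows holding the plan text.
    """
    for row in sql_context.sql(explain_statement(query, extended)).collect():
        print(row[0])


# Hypothetical two-table join standing in for the real query.
SAMPLE_QUERY = """
SELECT a.id, a.col_x + b.col_y AS total
FROM table_a a
JOIN table_b b ON a.id = b.id;
"""

if __name__ == "__main__":
    # Without a cluster, just print the statement that would be submitted.
    print(explain_statement(SAMPLE_QUERY, extended=True))
```

With a live context this would be called as show_plan(hive_context, SAMPLE_QUERY). In the printed plan, the join strategy chosen for each join and any cartesian product are usually the first things worth checking when a join-heavy query over small tables appears to hang.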



 
On Jul 18, 2016, at 5:20 PM, Zhiliang Zhu <zchl.j...@yahoo.com.INVALID> wrote:
The SQL logic in the program is very complex, so I will not describe the 
detailed code here.

    On Monday, July 18, 2016 6:04 PM, Zhiliang Zhu 
<zchl.j...@yahoo.com.INVALID> wrote:
 

 Hi All,
We have an application that needs to extract different columns from 6 Hive 
tables and then do some simple calculations. Each table has around 100,000 
rows. Finally, it needs to output another table or file (with a consistent 
column format).
However, after many days of trying, the Spark Hive job is unbelievably slow, 
sometimes almost frozen. The Spark cluster has 5 nodes. Could anyone offer 
some help? Any idea or clue would also be welcome.
Thanks in advance,
Zhiliang

   



  
