This is great news, sir. It shows perseverance pays off at last.
Could you let us know when the write-up is ready, so I can set it up as well, please?
I know a bit about the advantages of having Hive use the Spark engine. However,
the general question I have is: when should one use Hive on Spark as opposed to
Hive on the MapReduce engine?
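(For what it's worth, the engine choice is just a session-level Hive setting, so the two can be compared head-to-head on the same query. A sketch, assuming a table `t` as in the transcript below and that Hive on Spark has already been configured:)

```sql
-- Sketch: run the same query under each engine and compare timings.
-- Assumes table t exists and Hive on Spark is configured.
set hive.execution.engine=mr;     -- classic MapReduce engine
select count(1) from t;

set hive.execution.engine=spark;  -- Spark engine
select count(1) from t;
```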
Thanks again
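(Also, for anyone setting this up: rather than issuing the `set` commands in every session, the same properties can presumably be persisted in hive-site.xml. A sketch using values from the transcript below; the master URL and memory size are site-specific:)

```xml
<!-- Sketch: persisting the session settings from the transcript below
     in $HIVE_HOME/conf/hive-site.xml. Values are site-specific. -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.master</name>
  <value>spark://50.140.197.217:7077</value>
</property>
<property>
  <name>spark.executor.memory</name>
  <value>512m</value>
</property>
```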
On Monday, 7 December 2015, 15:50, Mich Talebzadeh <[email protected]>
wrote:
For those interested

From: Mich Talebzadeh [mailto:[email protected]]
Sent: 06 December 2015 20:33
To: [email protected]
Subject: Managed to make Hive run on Spark engine

Thanks all, especially Xuefu, for the contributions. Finally it works, which means don't give up until it works :)

hduser@rhes564::/usr/lib/hive/lib> hive
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive> set spark.home=/usr/lib/spark-1.3.1-bin-hadoop2.6;
hive> set hive.execution.engine=spark;
hive> set spark.master=spark://50.140.197.217:7077;
hive> set spark.eventLog.enabled=true;
hive> set spark.eventLog.dir=/usr/lib/spark-1.3.1-bin-hadoop2.6/logs;
hive> set spark.executor.memory=512m;
hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
hive> set hive.spark.client.server.connect.timeout=220000ms;
hive> set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec;
hive> use asehadoop;
OK
Time taken: 0.638 seconds
hive> select count(1) from t;
Query ID = hduser_20151206200528_4b85889f-e4ca-41d2-9bd2-1082104be42b
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = c8fee86c-0286-4276-aaa1-2a5eb4e4958a
Query Hive on Spark job[0] stages: 0, 1
Status: Running (Hive on Spark job[0])
Job Progress Format
CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
2015-12-06 20:05:36,299 Stage-0_0: 0(+1)/1  Stage-1_0: 0/1
2015-12-06 20:05:39,344 Stage-0_0: 1/1 Finished  Stage-1_0: 0(+1)/1
2015-12-06 20:05:40,350 Stage-0_0: 1/1 Finished  Stage-1_0: 1/1 Finished
Status: Finished successfully in 8.10 seconds
OK

The versions used for this project:

OS
version: Linux version 2.6.18-92.el5xen ([email protected]) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)) #1 SMP Tue Apr 29 13:31:30 EDT 2008
Hadoop: 2.6.0
Hive: 1.2.1
Spark: spark-1.3.1-bin-hadoop2.6 (downloaded as the prebuilt spark-1.3.1-bin-hadoop2.6.gz for starting the Spark standalone cluster)

The jar file placed in $HIVE_HOME/lib to link Hive to Spark was spark-assembly-1.3.1-hadoop2.4.0.jar, built from the source downloaded as the zipped file spark-1.3.1.gz with the command line:

make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"

It is pretty picky on parameters, CLASSPATH, IP addresses or hostnames etc. to make it work. I will create a full guide on how to build Hive and make it run with Spark as its engine (as opposed to MR).

HTH

Mich Talebzadeh

Sybase ASE 15 Gold Medal
Award 2008
A Winning Strategy: Running the most Critical Financial Data on ASE 15
http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
Author of the book "A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7; co-author of "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4
Publications due shortly:
Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8
Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly
http://talebzadehmich.wordpress.com/

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only; if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free; therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.