This is great news, sir. It shows perseverance pays off at last.
Could you let us know when the write-up is ready, so I can set it up as well, please?
I know a bit about the advantages of having Hive use the Spark engine. However,
the general question I have is: when should one use Hive on Spark as opposed to
Hive on the MapReduce engine?
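(For what it's worth, the engine choice is just a session-level Hive setting, so the two can be compared head-to-head on the same query. A sketch, assuming a table `t` as in the transcript below and that Hive on Spark has already been configured:)

```sql
-- Sketch: run the same query under each engine and compare timings.
-- Assumes table t exists and Hive on Spark is configured.
set hive.execution.engine=mr;     -- classic MapReduce engine
select count(1) from t;

set hive.execution.engine=spark;  -- Spark engine
select count(1) from t;
```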
Thanks again
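(Also, for anyone setting this up: rather than issuing the `set` commands in every session, the same properties can presumably be persisted in hive-site.xml. A sketch using values from the transcript below; the master URL and memory size are site-specific:)

```xml
<!-- Sketch: persisting the session settings from the transcript below
     in $HIVE_HOME/conf/hive-site.xml. Values are site-specific. -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.master</name>
  <value>spark://50.140.197.217:7077</value>
</property>
<property>
  <name>spark.executor.memory</name>
  <value>512m</value>
</property>
```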
On Monday, 7 December 2015, 15:50, Mich Talebzadeh <[email protected]>
wrote:
For those interested

From: Mich Talebzadeh [mailto:[email protected]]
Sent: 06 December 2015 20:33
To: [email protected]
Subject: Managed to make Hive run on Spark engine

Thanks all, especially Xuefu, for the contributions. Finally it works, which means don't give up until it works :)

hduser@rhes564::/usr/lib/hive/lib> hive
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive> set spark.home=/usr/lib/spark-1.3.1-bin-hadoop2.6;
hive> set hive.execution.engine=spark;
hive> set spark.master=spark://50.140.197.217:7077;
hive> set spark.eventLog.enabled=true;
hive> set spark.eventLog.dir=/usr/lib/spark-1.3.1-bin-hadoop2.6/logs;
hive> set spark.executor.memory=512m;
hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
hive> set hive.spark.client.server.connect.timeout=220000ms;
hive> set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec;
hive> use asehadoop;
OK
Time taken: 0.638 seconds
hive> select count(1) from t;
Query ID = hduser_20151206200528_4b85889f-e4ca-41d2-9bd2-1082104be42b
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = c8fee86c-0286-4276-aaa1-2a5eb4e4958a
Query Hive on Spark job[0] stages: 0, 1
Status: Running (Hive on Spark job[0])
Job Progress Format
CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
2015-12-06 20:05:36,299 Stage-0_0: 0(+1)/1  Stage-1_0: 0/1
2015-12-06 20:05:39,344 Stage-0_0: 1/1 Finished  Stage-1_0: 0(+1)/1
2015-12-06 20:05:40,350 Stage-0_0: 1/1 Finished  Stage-1_0: 1/1 Finished
Status: Finished successfully in 8.10 seconds
OK

The versions used for this project:

OS
version: Linux version 2.6.18-92.el5xen ([email protected]) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)) #1 SMP Tue Apr 29 13:31:30 EDT 2008
Hadoop: 2.6.0
Hive: 1.2.1
Spark: spark-1.3.1-bin-hadoop2.6 (downloaded as the prebuilt spark-1.3.1-bin-hadoop2.6.gz for starting the Spark standalone cluster)

The jar file placed in $HIVE_HOME/lib to link Hive to Spark was spark-assembly-1.3.1-hadoop2.4.0.jar, built from the source downloaded as the zipped file spark-1.3.1.gz with the command line:

make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"

It is pretty picky on parameters, CLASSPATH, IP addresses or hostnames etc. to make it work. I will create a full guide on how to build Hive and make it run with Spark as its engine (as opposed to MR).

HTH

Mich Talebzadeh

Sybase ASE 15 Gold Medal
Award 2008
A Winning Strategy: Running the most Critical Financial Data on ASE 15
http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
Author of the book "A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7; co-author of "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4
Publications due shortly:
Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8
Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly
http://talebzadehmich.wordpress.com/

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only; if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free; therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.