You can search last month's mailing list archive for the thread "Do you have more suggestions on when to use Hive on MapReduce or Hive on Spark?" I hope that is of some help to you.
Best wishes.

2015-12-08 6:18 GMT+08:00 Ashok Kumar <ashok34...@yahoo.com>:

> This is great news, sir. It shows perseverance pays at last.
>
> Can you let us know when the write-up is ready, so I can set it up as well, please?
>
> I know a bit about the advantages of having Hive use the Spark engine. However, the general question I have is: when should one use Hive on Spark as opposed to Hive on the MapReduce engine?
>
> Thanks again
>
>
> On Monday, 7 December 2015, 15:50, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>
> For those interested
>
> From: Mich Talebzadeh [mailto:m...@peridale.co.uk]
> Sent: 06 December 2015 20:33
> To: user@hive.apache.org
> Subject: Managed to make Hive run on Spark engine
>
> Thanks to all, especially to Xuefu, for the contributions. Finally it works, which means don't give up until it works :)
>
> hduser@rhes564::/usr/lib/hive/lib> hive
> Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties
> hive> set spark.home=/usr/lib/spark-1.3.1-bin-hadoop2.6;
> hive> set hive.execution.engine=spark;
> hive> set spark.master=spark://50.140.197.217:7077;
> hive> set spark.eventLog.enabled=true;
> hive> set spark.eventLog.dir=/usr/lib/spark-1.3.1-bin-hadoop2.6/logs;
> hive> set spark.executor.memory=512m;
> hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
> hive> set hive.spark.client.server.connect.timeout=220000ms;
> hive> set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec;
> hive> use asehadoop;
> OK
> Time taken: 0.638 seconds
> hive> select count(1) from t;
> Query ID = hduser_20151206200528_4b85889f-e4ca-41d2-9bd2-1082104be42b
> Total jobs = 1
> Launching Job 1 out of 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Spark Job = c8fee86c-0286-4276-aaa1-2a5eb4e4958a
>
> Query Hive on Spark job[0] stages:
> 0
> 1
>
> Status: Running (Hive on Spark job[0])
> Job Progress Format
> CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
> 2015-12-06 20:05:36,299 Stage-0_0: 0(+1)/1  Stage-1_0: 0/1
> 2015-12-06 20:05:39,344 Stage-0_0: 1/1 Finished  Stage-1_0: 0(+1)/1
> 2015-12-06 20:05:40,350 Stage-0_0: 1/1 Finished  Stage-1_0: 1/1 Finished
> Status: Finished successfully in 8.10 seconds
> OK
>
> The versions used for this project:
>
> OS: Linux version 2.6.18-92.el5xen (brewbuil...@ls20-bc2-13.build.redhat.com) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)) #1 SMP Tue Apr 29 13:31:30 EDT 2008
> Hadoop 2.6.0
> Hive 1.2.1
> spark-1.3.1-bin-hadoop2.6 (downloaded as the prebuilt spark-1.3.1-bin-hadoop2.6.gz, used to start the Spark standalone cluster)
>
> The jar file placed in $HIVE_HOME/lib to link Hive to Spark was spark-assembly-1.3.1-hadoop2.4.0.jar, built from the source downloaded as the zipped file spark-1.3.1.gz with the command line:
>   make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
>
> It is pretty picky about parameters, CLASSPATH, IP addresses or hostnames, etc. to make it work.
>
> I will create a full guide on how to build Hive and make it run with Spark as its engine (as opposed to MR).
>
> HTH
>
> Mich Talebzadeh
>
> Sybase ASE 15 Gold Medal Award 2008
> A Winning Strategy: Running the most Critical Financial Data on ASE 15
> http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
> Author of the books "A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.
> Co-author of "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4
> Publications due shortly:
> Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8
> Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly
>
> http://talebzadehmich.wordpress.com/
>
> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only; if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free; therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.
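As a side note for anyone setting this up: rather than issuing the `set` commands at every Hive session, the same properties from Mich's transcript could be persisted in hive-site.xml. A minimal sketch, assuming the same paths, master URL, and Spark 1.3.1 layout quoted above (adjust values to your own cluster):

```xml
<!-- Sketch of hive-site.xml entries mirroring the session-level "set"
     commands in the transcript above. Paths and the master URL are the
     ones from Mich's environment; substitute your own. -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.home</name>
  <value>/usr/lib/spark-1.3.1-bin-hadoop2.6</value>
</property>
<property>
  <name>spark.master</name>
  <value>spark://50.140.197.217:7077</value>
</property>
<property>
  <name>spark.eventLog.enabled</name>
  <value>true</value>
</property>
<property>
  <name>spark.eventLog.dir</name>
  <value>/usr/lib/spark-1.3.1-bin-hadoop2.6/logs</value>
</property>
<property>
  <name>spark.executor.memory</name>
  <value>512m</value>
</property>
<property>
  <name>spark.serializer</name>
  <value>org.apache.spark.serializer.KryoSerializer</value>
</property>
<property>
  <name>hive.spark.client.server.connect.timeout</name>
  <value>220000ms</value>
</property>
```

This is just a convenience; the session-level `set` commands shown in the thread remain useful for experimenting before committing values to the site file.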