Hi, as far as I know STS uses Spark SQL and not MapReduce. Is that not correct?
Best
Ayan

On Wed, Sep 14, 2016 at 8:51 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> STS will rely on the Hive execution engine. My Hive uses the Spark
> execution engine, so STS will pass the SQL to Hive, let it do the work,
> and return the result set.
>
> which beeline
> /usr/lib/spark-2.0.0-bin-hadoop2.6/bin/beeline
>
> ${SPARK_HOME}/bin/beeline -u jdbc:hive2://rhes564:10055 -n hduser -p xxxxxxxx
> Connecting to jdbc:hive2://rhes564:10055
> Connected to: Spark SQL (version 2.0.0)
> Driver: Hive JDBC (version 1.2.1.spark2)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 1.2.1.spark2 by Apache Hive
> 0: jdbc:hive2://rhes564:10055>
>
> jdbc:hive2://rhes564:10055> select count(1) from test.prices;
>
> OK, I ran a simple query in STS. You will see this in hive.log:
>
> 2016-09-13T23:44:50,996 INFO [pool-4-thread-4]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_database: test
> 2016-09-13T23:44:50,996 INFO [pool-4-thread-4]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.217 cmd=source:50.140.197.217 get_database: test
> 2016-09-13T23:44:50,998 INFO [pool-4-thread-4]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_table : db=test tbl=prices
> 2016-09-13T23:44:50,998 INFO [pool-4-thread-4]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.217 cmd=source:50.140.197.217 get_table : db=test tbl=prices
> 2016-09-13T23:44:51,007 INFO [pool-4-thread-4]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_table : db=test tbl=prices
> 2016-09-13T23:44:51,007 INFO [pool-4-thread-4]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.217 cmd=source:50.140.197.217 get_table : db=test tbl=prices
> 2016-09-13T23:44:51,021 INFO [pool-4-thread-4]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_database: test
> 2016-09-13T23:44:51,021 INFO [pool-4-thread-4]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.217 cmd=source:50.140.197.217 get_database: test
> 2016-09-13T23:44:51,023 INFO [pool-4-thread-4]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_table : db=test tbl=prices
> 2016-09-13T23:44:51,023 INFO [pool-4-thread-4]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.217 cmd=source:50.140.197.217 get_table : db=test tbl=prices
> 2016-09-13T23:44:51,029 INFO [pool-4-thread-4]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_table : db=test tbl=prices
> 2016-09-13T23:44:51,029 INFO [pool-4-thread-4]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser ip=50.140.197.217 cmd=source:50.140.197.217 get_table : db=test tbl=prices
>
> I think it is a good idea to switch to the Spark engine (as opposed to
> MR). My tests showed that Hive on Spark, with its DAG and in-memory
> execution, runs at least an order of magnitude faster than MapReduce.
>
> You can connect with either the beeline from $HIVE_HOME/... or the
> beeline from $SPARK_HOME.
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising
> from such loss, damage or destruction.
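For readers trying this themselves, the connection step quoted above can be sketched as a small shell snippet. The host `rhes564`, port `10055`, and user `hduser` are the values from Mich's example, not defaults; substitute your own.

```shell
# Build the JDBC URL for the Spark Thrift Server (example values from the thread).
STS_HOST="rhes564"
STS_PORT=10055
JDBC_URL="jdbc:hive2://${STS_HOST}:${STS_PORT}"
echo "${JDBC_URL}"   # prints jdbc:hive2://rhes564:10055

# Then connect with the beeline shipped with Spark (give -p no value to be prompted):
# "${SPARK_HOME}/bin/beeline" -u "${JDBC_URL}" -n hduser -p
```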
> On 13 September 2016 at 23:28, Benjamin Kim <bbuil...@gmail.com> wrote:
>
>> Mich,
>>
>> It sounds like there would be no harm in changing, then. Are you saying
>> that using STS would still use MapReduce to run the SQL statements? What
>> our users are doing in our CDH 5.7.2 installation is changing the
>> execution engine to Spark when connected to HiveServer2 to get faster
>> results. Would they still have to do this using STS? Lastly, we are
>> seeing zombie YARN jobs left behind even after a user disconnects. Are
>> you seeing this happen with STS? If not, then this would be even better.
>>
>> Thanks for your fast reply.
>>
>> Cheers,
>> Ben
>>
>> On Sep 13, 2016, at 3:15 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>> Hi,
>>
>> The Spark Thrift server (STS) still uses the Hive thrift server. If you
>> look at $SPARK_HOME/sbin/start-thriftserver.sh you will see (mine is
>> Spark 2):
>>
>> function usage {
>>   echo "Usage: ./sbin/start-thriftserver [options] [thrift server options]"
>>   pattern="usage"
>>   pattern+="\|Spark assembly has been built with Hive"
>>   pattern+="\|NOTE: SPARK_PREPEND_CLASSES is set"
>>   pattern+="\|Spark Command: "
>>   pattern+="\|======="
>>   pattern+="\|--help"
>>
>> Indeed, when you start STS you pass a hiveconf parameter to it:
>>
>> ${SPARK_HOME}/sbin/start-thriftserver.sh \
>>   --master \
>>   --hiveconf hive.server2.thrift.port=10055 \
>>
>> and STS bypasses the Spark optimiser and uses the Hive optimiser and
>> execution engine. You will see this in the hive.log file.
>>
>> So I don't think it is going to give you much difference, unless they
>> have recently changed the design of STS.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> On 13 September 2016 at 22:32, Benjamin Kim <bbuil...@gmail.com> wrote:
>>
>>> Does anyone have any thoughts about using the Spark SQL Thrift server
>>> in Spark 1.6.2 instead of HiveServer2? We are considering abandoning
>>> HiveServer2 for it. Some advice and gotchas would be nice to know.
>>>
>>> Thanks,
>>> Ben
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org

--
Best Regards,
Ayan Guha
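The startup step discussed in the thread can be sketched as below. This is only a sketch: the port matches the beeline examples in the thread, the `--master` value is left as a placeholder because the quoted command omits it, and the script path is the one quoted from `$SPARK_HOME/sbin`. The command is composed and echoed rather than executed, since actually starting STS needs a live cluster.

```shell
# Compose the Spark Thrift Server start command (port from the thread's examples).
STS_PORT=10055
START_CMD='${SPARK_HOME}/sbin/start-thriftserver.sh --master <master-url> --hiveconf hive.server2.thrift.port='"${STS_PORT}"
echo "${START_CMD}"
```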