Actually, this is what it says:

Connecting to jdbc:hive2://rhes564:10055
Connected to: Spark SQL (version 2.0.0)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive

So it uses Spark SQL. However, they do not seem to have upgraded the
Beeline version from 1.2.1.
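
As an aside, the connection banner above is plain text that is easy to scrape. A minimal Python sketch (the patterns are assumptions based only on the lines shown) that pulls the server, driver and Beeline versions out of it:

```python
import re

# Banner lines as printed by Beeline when connecting to the
# Spark Thrift Server (copied from the session above).
banner = """\
Connecting to jdbc:hive2://rhes564:10055
Connected to: Spark SQL (version 2.0.0)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive
"""

def parse_banner(text):
    """Pull the server, driver and Beeline versions out of the banner."""
    info = {}
    m = re.search(r"Connected to: (.+) \(version (.+)\)", text)
    if m:
        info["server"], info["server_version"] = m.group(1), m.group(2)
    m = re.search(r"Driver: (.+) \(version (.+)\)", text)
    if m:
        info["driver"], info["driver_version"] = m.group(1), m.group(2)
    m = re.search(r"Beeline version (\S+)", text)
    if m:
        info["beeline_version"] = m.group(1)
    return info

print(parse_banner(banner))
```

This makes the version mismatch visible at a glance: the server reports Spark SQL 2.0.0 while the driver and Beeline are still at 1.2.1.spark2.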

It is a useful tool with Zeppelin.

HTH

Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 14 September 2016 at 00:55, ayan guha <guha.a...@gmail.com> wrote:

> Hi
>
> AFAIK STS uses Spark SQL and not Map Reduce. Is that not correct?
>
> Best
> Ayan
>
> On Wed, Sep 14, 2016 at 8:51 AM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> STS will rely on Hive execution engine. My Hive uses Spark execution
>> engine so STS will pass the SQL to Hive and let it do the work and return
>> the result set
>>
>>  which beeline
>> /usr/lib/spark-2.0.0-bin-hadoop2.6/bin/beeline
>> ${SPARK_HOME}/bin/beeline -u jdbc:hive2://rhes564:10055 -n hduser -p
>> xxxxxxxx
>> Connecting to jdbc:hive2://rhes564:10055
>> Connected to: Spark SQL (version 2.0.0)
>> Driver: Hive JDBC (version 1.2.1.spark2)
>> Transaction isolation: TRANSACTION_REPEATABLE_READ
>> Beeline version 1.2.1.spark2 by Apache Hive
>> 0: jdbc:hive2://rhes564:10055>
>>
>> 0: jdbc:hive2://rhes564:10055> select count(1) from test.prices;
>> OK, I ran a simple query in STS. You will see this in hive.log:
>>
>> 2016-09-13T23:44:50,996 INFO  [pool-4-thread-4]: metastore.HiveMetaStore
>> (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217
>> get_database: test
>> 2016-09-13T23:44:50,996 INFO  [pool-4-thread-4]: HiveMetaStore.audit
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser
>> ip=50.140.197.217       cmd=source:50.140.197.217 get_database: test
>> 2016-09-13T23:44:50,998 INFO  [pool-4-thread-4]: metastore.HiveMetaStore
>> (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_table :
>> db=test tbl=prices
>> 2016-09-13T23:44:50,998 INFO  [pool-4-thread-4]: HiveMetaStore.audit
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser
>> ip=50.140.197.217       cmd=source:50.140.197.217 get_table : db=test
>> tbl=prices
>> 2016-09-13T23:44:51,007 INFO  [pool-4-thread-4]: metastore.HiveMetaStore
>> (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_table :
>> db=test tbl=prices
>> 2016-09-13T23:44:51,007 INFO  [pool-4-thread-4]: HiveMetaStore.audit
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser
>> ip=50.140.197.217       cmd=source:50.140.197.217 get_table : db=test
>> tbl=prices
>> 2016-09-13T23:44:51,021 INFO  [pool-4-thread-4]: metastore.HiveMetaStore
>> (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217
>> get_database: test
>> 2016-09-13T23:44:51,021 INFO  [pool-4-thread-4]: HiveMetaStore.audit
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser
>> ip=50.140.197.217       cmd=source:50.140.197.217 get_database: test
>> 2016-09-13T23:44:51,023 INFO  [pool-4-thread-4]: metastore.HiveMetaStore
>> (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_table :
>> db=test tbl=prices
>> 2016-09-13T23:44:51,023 INFO  [pool-4-thread-4]: HiveMetaStore.audit
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser
>> ip=50.140.197.217       cmd=source:50.140.197.217 get_table : db=test
>> tbl=prices
>> 2016-09-13T23:44:51,029 INFO  [pool-4-thread-4]: metastore.HiveMetaStore
>> (HiveMetaStore.java:logInfo(670)) - 4: source:50.140.197.217 get_table :
>> db=test tbl=prices
>> 2016-09-13T23:44:51,029 INFO  [pool-4-thread-4]: HiveMetaStore.audit
>> (HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser
>> ip=50.140.197.217       cmd=source:50.140.197.217 get_table : db=test
>> tbl=prices
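
The audit lines above end in a fixed `key=value` tail. A minimal Python sketch (the regex is an assumption based only on the lines shown) that extracts who ran what from one such line:

```python
import re

# One HiveMetaStore audit line from hive.log above, joined onto one line.
line = ("2016-09-13T23:44:51,029 INFO  [pool-4-thread-4]: HiveMetaStore.audit "
        "(HiveMetaStore.java:logAuditEvent(280)) - ugi=hduser "
        "ip=50.140.197.217 cmd=source:50.140.197.217 get_table : db=test tbl=prices")

def parse_audit(line):
    """Extract the ugi (user), ip and cmd fields from an audit line."""
    m = re.search(r"ugi=(\S+)\s+ip=(\S+)\s+cmd=(.+)$", line)
    if not m:
        return None
    return {"ugi": m.group(1), "ip": m.group(2), "cmd": m.group(3)}

print(parse_audit(line))
```

Scanning hive.log this way shows which users hit the metastore and with which operations, which is how you can confirm the query went through Hive.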
>>
>> I think it is a good idea to switch to the Spark engine (as opposed to MR).
>> In my tests, Hive on Spark, with its DAG execution and in-memory processing,
>> ran at least an order of magnitude faster than MapReduce.
>>
>> You can connect using either the beeline under $HIVE_HOME/... or the one
>> under $SPARK_HOME.
>>
>> HTH
>>
>>
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>>
>>
>>
>> On 13 September 2016 at 23:28, Benjamin Kim <bbuil...@gmail.com> wrote:
>>
>>> Mich,
>>>
>>> It sounds like there would be no harm in changing, then. Are you
>>> saying that using STS would still use MapReduce to run the SQL statements?
>>> What our users are doing in our CDH 5.7.2 installation is changing the
>>> execution engine to Spark when connected to HiveServer2 to get faster
>>> results. Would they still have to do this using STS? Lastly, we are seeing
>>> zombie YARN jobs left behind even after a user disconnects. Are you seeing
>>> this happen with STS? If not, then this would be even better.
>>>
>>> Thanks for your fast reply.
>>>
>>> Cheers,
>>> Ben
>>>
>>> On Sep 13, 2016, at 3:15 PM, Mich Talebzadeh <mich.talebza...@gmail.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> Spark Thrift Server (STS) still uses the Hive Thrift server. If you look at
>>> $SPARK_HOME/sbin/start-thriftserver.sh you will see (mine is Spark 2):
>>>
>>> function usage {
>>>   echo "Usage: ./sbin/start-thriftserver [options] [thrift server
>>> options]"
>>>   pattern="usage"
>>>   pattern+="\|Spark assembly has been built with Hive"
>>>   pattern+="\|NOTE: SPARK_PREPEND_CLASSES is set"
>>>   pattern+="\|Spark Command: "
>>>   pattern+="\|======="
>>>   pattern+="\|--help"
>>>
>>>
>>> Indeed, when you start STS, you pass hiveconf parameters to it:
>>>
>>> ${SPARK_HOME}/sbin/start-thriftserver.sh \
>>>                 --master  \
>>>                 --hiveconf hive.server2.thrift.port=10055 \
>>>
>>> and STS bypasses the Spark optimizer and uses the Hive optimizer and
>>> execution engine. You will see this in the hive.log file.
>>>
>>> So I don't think it is going to give you much difference. Unless they
>>> have recently changed the design of STS.
>>>
>>> HTH
>>>
>>>
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>>
>>>
>>> On 13 September 2016 at 22:32, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>
>>>> Does anyone have any thoughts about using Spark SQL Thriftserver in
>>>> Spark 1.6.2 instead of HiveServer2? We are considering abandoning
>>>> HiveServer2 for it. Some advice and gotchas would be nice to know.
>>>>
>>>> Thanks,
>>>> Ben
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>>
>>>>
>>>
>>>
>>
>
>
> --
> Best Regards,
> Ayan Guha
>
