Not really, that is not the primary intention. Our main goal is poor man's
high availability (since STS does not provide an HA mechanism the way HS2
does) :). Additionally, we have made STS part of the Ambari AUTO_START group,
so Ambari brings STS back up if it goes down for some intermittent reason.
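
For context, a minimal sketch of what running two instances looks like
(hostnames, ports and memory figures below are illustrative, not our actual
settings):

    # First STS instance on node1 (illustrative values only)
    ${SPARK_HOME}/sbin/start-thriftserver.sh \
        --master yarn \
        --hiveconf hive.server2.thrift.port=10015 \
        --hiveconf hive.server2.thrift.bind.host=node1.example.com \
        --executor-memory 4G

    # Second, identical instance on node2 with its own port
    ${SPARK_HOME}/sbin/start-thriftserver.sh \
        --master yarn \
        --hiveconf hive.server2.thrift.port=10016 \
        --hiveconf hive.server2.thrift.bind.host=node2.example.com \
        --executor-memory 4G

If either instance dies, the Ambari AUTO_START policy restarts it, and in the
meantime the load balancer keeps routing new connections to the surviving one.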



On Thu, Jul 14, 2016 at 1:38 AM, Michael Segel <msegel_had...@hotmail.com>
wrote:

> Hey, silly question?
>
> If you’re running a load balancer, are you trying to reuse the RDDs
> between jobs?
>
> TIA
> -Mike
>
> On Jul 13, 2016, at 9:08 AM, ayan guha <guha.a...@gmail.com> wrote:
>
> My 2 cents:
>
> Yes, we are running multiple STS instances (on different nodes, but you can
> run them on the same node on different ports). Using Ambari, it is really
> convenient to manage.
>
> We have also set up an nginx load balancer pointing to both services, and
> all our external BI tools connect to the load balancer.
>
> STS runs as a YARN client-mode application, with STS itself acting as the
> driver.
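>
> For anyone wanting to replicate the load balancer piece, a rough sketch of
> the nginx side (assuming nginx is built with the stream module, since the
> JDBC/Thrift connection to STS is plain TCP; hostnames and ports are
> illustrative):
>
> # Append a TCP (stream) proxy for the two STS instances to nginx.conf;
> # the stream block must sit at the top level, outside any http {} block.
> cat >> /etc/nginx/nginx.conf <<'EOF'
> stream {
>     upstream spark_thrift {
>         server node1.example.com:10015;
>         server node2.example.com:10016;
>     }
>     server {
>         listen 10000;            # BI tools connect to this single port
>         proxy_pass spark_thrift;
>     }
> }
> EOF
> nginx -s reload
>
> The BI tools then point their JDBC URLs at the balancer host on port 10000
> rather than at either STS instance directly.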
>
>
>
> On Wed, Jul 13, 2016 at 5:33 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Hi,
>>
>> I need some feedback on the performance of the Spark Thrift Server (STS).
>>
>> As far as I can ascertain, one can start STS by passing the usual Spark
>> parameters:
>>
>> ${SPARK_HOME}/sbin/start-thriftserver.sh \
>>                 --master spark://50.140.197.217:7077 \
>>                 --hiveconf hive.server2.thrift.port=10055 \
>>                 --packages <PACKAGES> \
>>                 --driver-memory 2G \
>>                 --num-executors 2 \
>>                 --executor-memory 2G \
>>                 --conf "spark.scheduler.mode=FAIR" \
>>                 --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
>>                 --jars <JAR_LIST> \
>>                 --conf "spark.ui.port=12345"
>>
>>
>> And then access it via the beeline JDBC client:
>>
>> beeline -u jdbc:hive2://rhes564:10055 -n hduser -p
>>
>> Now, the questions I have are:
>>
>>
>>    1. What is the limit on the number of users that can access the Thrift
>>    server?
>>    2. Clearly the Thrift server starts with its own resource configuration.
>>    Put simply, does STS act as a gateway to Spark (meaning Spark apps can
>>    use their own resources), or is one limited to the resources STS itself
>>    was started with?
>>    3. Can one start multiple Thrift servers?
>>
>> As far as I can see, STS is equivalent to Spark SQL accessing the Hive data
>> warehouse. Indeed, this is what the connection banner reports:
>>
>> Connecting to jdbc:hive2://rhes564:10055
>> Connected to: Spark SQL (version 1.6.1)
>> Driver: Spark Project Core (version 1.6.1)
>> Transaction isolation: TRANSACTION_REPEATABLE_READ
>> Beeline version 1.6.1 by Apache Hive
>> 0: jdbc:hive2://rhes564:10055>
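>>
>> One can also push a query through the same endpoint non-interactively, e.g.
>> (database and table names below are purely illustrative):
>>
>> beeline -u jdbc:hive2://rhes564:10055 -n hduser -p '' \
>>         -e "SELECT COUNT(1) FROM default.sample_table"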
>>
>> Thanks
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>> LinkedIn:
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>>
>
>
>
> --
> Best Regards,
> Ayan Guha
>
>
>


-- 
Best Regards,
Ayan Guha
