We want to run multiple instances of the Spark SQL CLI on our YARN cluster, with
each instance used by a different user. This would be non-optimal if each user
brings up a separate CLI, given how Spark works on YARN: executor processes run
(and hence consume resources) on worker nodes for the lifetime of the
application. Imagine each user trying to cache a table in memory when there is
limited memory across the cluster. The right approach seems to be sharing the
same SparkContext across the different users and running just one Spark SQL
application.
Is my understanding correct about resource usage on YARN for Spark SQL? Is
there currently a way to share a SparkContext like this? It seems like it would
need some kind of Thrift interface hooked into the CLI driver.
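
To make the idea concrete, here is a minimal sketch of what client access could look like if such a Thrift (HiveServer2-style) endpoint were exposed by the one shared Spark SQL application. The host, port, user, and table name are placeholders, and it assumes the Hive JDBC driver (org.apache.hive:hive-jdbc) is on the classpath; this is not a description of an existing Spark feature, just the access pattern I have in mind:

    // Sketch only: each user opens a JDBC session against the shared endpoint,
    // but all queries execute inside the single long-lived Spark application,
    // so executors (and cached data) are not duplicated per user.
    import java.sql.DriverManager

    object SharedSqlClient {
      def main(args: Array[String]): Unit = {
        Class.forName("org.apache.hive.jdbc.HiveDriver")
        // "sql-gateway-host:10000" is a hypothetical endpoint for illustration.
        val conn = DriverManager.getConnection(
          "jdbc:hive2://sql-gateway-host:10000/default", "user1", "")
        try {
          val stmt = conn.createStatement()
          // The table would be cached in the shared application's memory,
          // rather than once per user as with separate CLI instances.
          stmt.execute("CACHE TABLE events")
          val rs = stmt.executeQuery("SELECT COUNT(*) FROM events")
          while (rs.next()) println(s"rows = ${rs.getLong(1)}")
        } finally {
          conn.close()
        }
      }
    }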

*Apologies if you have already seen this on the user group.*
