Bad idea: there is no caching, and the cluster gets over-consumed. Have a look at instantiating a custom Thrift server on top of temp tables, with the fair scheduler enabled to allow concurrent SQL requests. It's not a public API, but you can find some examples.
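For reference, a minimal sketch of the settings such a setup usually relies on, written as spark-defaults.conf entries. It assumes Spark 1.6 or later. The custom server itself is typically started from application code through HiveThriftServer2.startWithContext, which is marked as a developer API, so treat the details as version-dependent rather than guaranteed.

```
# spark-defaults.conf (sketch; check the property names against your Spark version)

# Schedule jobs inside the shared context round-robin instead of FIFO
spark.scheduler.mode                        FAIR

# Let all JDBC/ODBC connections share one SQL session, so that temp tables
# registered by the hosting application are visible to every client
spark.sql.hive.thriftServer.singleSession   true
```

A JDBC client can then choose a scheduler pool for its own connection with a statement such as SET spark.sql.thriftserver.scheduler.pool=accounting; where the pool name "accounting" is only a placeholder.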
On 28 Oct 2016 at 11:12 AM, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:

> Hi,
>
> I think tempTable is private to the session that creates it. In Hive, temp
> tables created by "CREATE TEMPORARY TABLE" are all private to the session.
> Spark is no different.
>
> The alternative may be that everyone creates a tempTable from the same DF?
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On 28 October 2016 at 10:03, Chanh Le <giaosu...@gmail.com> wrote:
>
>> Can you elaborate on how to implement "shared sparkcontext and fair
>> scheduling" option?
>>
>> It just reuses one SparkContext by not letting it stop when the
>> application is done. Worth checking: Livy, spark-jobserver.
>> FAIR (https://spark.apache.org/docs/1.2.0/job-scheduling.html) only
>> controls how your jobs are scheduled within the pool, but FAIR helps you
>> run jobs in parallel, whereas FIFO (the default) runs one job at a time.
>>
>> My approach was to use sparkSession.getOrCreate() method and register
>> temp table in one application. However, I was not able to access this
>> tempTable in another application.
>>
>> Storing metadata in Hive may help, but I am not sure about this.
>> I use the Spark Thrift Server, create the table there, and then let
>> Zeppelin query it.
>>
>> Regards,
>> Chanh
>>
>> On Oct 27, 2016, at 9:01 PM, Victor Shafran <victor.shaf...@equalum.io>
>> wrote:
>>
>> Hi Vincent,
>> Can you elaborate on how to implement the "shared sparkcontext and fair
>> scheduling" option?
>>
>> My approach was to use the sparkSession.getOrCreate() method and register
>> a temp table in one application. However, I was not able to access this
>> tempTable in another application.
>> Your help is highly appreciated.
>> Victor
>>
>> On Thu, Oct 27, 2016 at 4:31 PM, Gene Pang <gene.p...@gmail.com> wrote:
>>
>>> Hi Mich,
>>>
>>> Yes, Alluxio is commonly used to cache and share Spark RDDs and
>>> DataFrames among different applications and contexts. The data typically
>>> stays in memory, but with Alluxio's tiered storage, the "colder" data can
>>> be evicted to other media, such as SSDs and HDDs. Here is a blog post
>>> discussing Spark RDDs and Alluxio:
>>> https://www.alluxio.com/blog/effective-spark-rdds-with-alluxio
>>>
>>> Alluxio also has the concept of an "Under filesystem", which can help
>>> you access your existing data across different storage systems. Here is
>>> more information about the unified namespace abilities:
>>> http://www.alluxio.org/docs/master/en/Unified-and-Transparent-Namespace.html
>>>
>>> Hope that helps,
>>> Gene
>>>
>>> On Thu, Oct 27, 2016 at 3:39 AM, Mich Talebzadeh
>>> <mich.talebza...@gmail.com> wrote:
>>>
>>>> Thanks Chanh,
>>>>
>>>> Can it share RDDs?
>>>>
>>>> Personally I have not used either Alluxio or Ignite.
>>>>
>>>> 1. Are there major differences between these two?
>>>> 2. Have you tried Alluxio for sharing Spark RDDs and, if so, do you
>>>>    have any experience you can kindly share?
>>>>
>>>> Regards
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> On 27 October 2016 at 11:29, Chanh Le <giaosu...@gmail.com> wrote:
>>>>
>>>>> Hi Mich,
>>>>> Alluxio is a good option to go with.
>>>>>
>>>>> Regards,
>>>>> Chanh
>>>>>
>>>>> On Oct 27, 2016, at 5:28 PM, Mich Talebzadeh
>>>>> <mich.talebza...@gmail.com> wrote:
>>>>>
>>>>> There was a mention of using Zeppelin to share RDDs with many users.
>>>>> From the notes on Zeppelin it appears that this is sharing the UI, and
>>>>> I am not sure how easy it is going to be to change the result set with
>>>>> different users modifying, say, SQL queries.
>>>>>
>>>>> There is also the idea of caching RDDs with something like Apache
>>>>> Ignite. Has anyone really tried this? Will that work with multiple
>>>>> applications?
>>>>>
>>>>> It looks feasible, as RDDs are immutable and so are registered
>>>>> tempTables etc.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Dr Mich Talebzadeh
>>
>> --
>> Victor Shafran
>> VP R&D | Equalum
>> Mobile: +972-523854883 | Email: victor.shaf...@equalum.io
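To make the FIFO versus FAIR point from Chanh's reply a bit more concrete, here is a toy model in plain Python. It is not Spark's scheduler and uses no Spark API: the function name `simulate`, the `slots` parameter and the job sizes are all made up for illustration, and it assumes every job is submitted at the same moment, every task takes one time step, and all jobs have equal weight.

```python
def simulate(jobs, slots, mode):
    """Return {job_name: finish_time} for a toy cluster.

    jobs  : list of (name, task_count) in submission order, all submitted at t=0
    slots : number of tasks that can run at once; each task takes one time step
    mode  : "FIFO" fills the slots from the earliest job first;
            "FAIR" hands slots out round-robin across jobs that still have tasks
    """
    remaining = {name: count for name, count in jobs}
    order = [name for name, _ in jobs]
    finish = {}
    t = 0
    while len(finish) < len(jobs):
        t += 1
        active = [name for name in order if remaining[name] > 0]
        free = slots
        if mode == "FIFO":
            # Earlier jobs take as many slots as they can use; later jobs
            # only get whatever is left over.
            for name in active:
                run = min(free, remaining[name])
                remaining[name] -= run
                free -= run
        elif mode == "FAIR":
            # One slot per job per pass, until slots or tasks run out.
            while free > 0 and active:
                for name in list(active):
                    if free == 0:
                        break
                    remaining[name] -= 1
                    free -= 1
                    if remaining[name] == 0:
                        active.remove(name)
        else:
            raise ValueError("mode must be 'FIFO' or 'FAIR'")
        # Record the step at which each job ran its last task.
        for name in order:
            if remaining[name] == 0 and name not in finish:
                finish[name] = t
    return finish


if __name__ == "__main__":
    jobs = [("long", 8), ("short", 2)]
    print(simulate(jobs, slots=2, mode="FIFO"))  # {'long': 4, 'short': 5}
    print(simulate(jobs, slots=2, mode="FAIR"))  # {'short': 2, 'long': 5}
```

In this toy run the short query finishes at step 2 under FAIR instead of step 5 under FIFO, while the long job gives up one step. That is the trade-off that matters when several users send SQL requests to one shared context.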