There's a small but annoying usability issue in HiveContext. By default, it creates a local (Derby) metastore in the current working directory, which prevents other processes that use HiveContext from being launched from the same directory.
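(For reference, the per-process workaround looks roughly like this hive-site.xml sketch; the /tmp path is a placeholder, and the property names are the standard Hive settings for the Derby connection URL and warehouse location:)

```xml
<!-- Sketch only: point this one process at its own Derby metastore.
     /tmp/myapp-metastore is a placeholder path that must be cleaned up manually. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/tmp/myapp-metastore/metastore_db;create=true</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/tmp/myapp-metastore/warehouse</value>
  </property>
</configuration>
```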
How can I make the metastore local to each HiveContext? Is there an in-memory metastore configuration? Using /tmp/xxxx temp folders is one workaround, but it's not elegant, and I still need to clean up the files afterwards. I could add a hive-site.xml and use a shared metastore, but then the processes would still operate in the same catalog space.

The plain SQLContext by default uses an in-memory catalog that is bound to each context. Since HiveContext is a subclass, it should have the same semantics by default. Makes sense?

Spark is very much functional and shared-nothing, and these are wonderful properties. Let's not have something global as a dependency.

Cheers,
--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/