Thanks, Ted. This is exactly what Spark SQL's LocalHiveContext does. To
make an embedded metastore local to a single HiveContext, we must
allocate a different Derby database directory for each HiveContext,
which is also what Jianshi is trying to avoid.
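For context, giving each HiveContext its own Derby directory could look roughly like the fragment below. This is only a sketch: the property name `javax.jdo.option.ConnectionURL` comes from Hive's metastore configuration, but the path is made up and would have to be unique per context, which is exactly the bookkeeping being avoided here.

    <!-- hive-site.xml: per-context embedded metastore (hypothetical path) -->
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:derby:;databaseName=/unique/path/per/context/metastore_db;create=true</value>
    </property>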
On 10/27/14 9:44 PM, Ted Yu wrote:
Please see
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-EmbeddedMetastore
Cheers
On Oct 27, 2014, at 6:20 AM, Cheng Lian <lian.cs....@gmail.com> wrote:
I have never tried this yet, but maybe you can use an in-memory Derby
database as the metastore:
https://db.apache.org/derby/docs/10.7/devguide/cdevdvlpinmemdb.html
I'll investigate this when I'm free; I guess we could use it for Spark
SQL Hive support testing.
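If that works, the wiring might be as simple as pointing the metastore's JDBC URL at Derby's in-memory subprotocol (described in the Derby docs linked above). An untested sketch:

    <!-- hive-site.xml: in-memory Derby metastore (untested sketch) -->
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:derby:memory:metastore_db;create=true</value>
    </property>

Since the database lives only in the JVM's heap, each HiveContext process would get its own catalog and there would be no on-disk directory to clean up afterwards.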
On 10/27/14 4:38 PM, Jianshi Huang wrote:
There's an annoying little usability issue in HiveContext.
By default, it creates a local metastore, which prevents other
processes that use HiveContext from being launched in the same directory.
How can I make the metastore local to each HiveContext? Is there an
in-memory metastore configuration? Using /tmp/xxxx temp folders is one
solution, but it's not elegant, and I still need to clean up the files...
I could add a hive-site.xml and use a shared metastore, but then they'd
still operate in the same catalog space.
The plain SQLContext by default uses an in-memory catalog that is bound
to each context. Since HiveContext is a subclass, it should have the
same default semantics. Make sense?
Spark is very much functional and shared-nothing; these are wonderful
properties. Let's not have something global as a dependency.
Cheers,
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org