Hi folks,
Is it possible to cache a table for shared use across sessions with Spark
Connect?
I'd like to load a read-only table once and have many sessions query it
afterwards, to improve performance.
This is an example of the kind of thing I have been trying, so far without
success:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark =
    SparkSession.builder().remote("sc://localhost").getOrCreate();

Dataset<Row> s = spark.read().parquet("/tmp/svampeatlas/*");

// This works if the view is not "global"
s.createOrReplaceGlobalTempView("occurrence_svampe");
spark.catalog().cacheTable("occurrence_svampe");

// This fails with "table not found" when a global view is used
spark
    .sql("SELECT * FROM occurrence_svampe")
    .write()
    .parquet("/tmp/export");
Thank you
Tim