So, after toying around a bit, here's what I ended up with. First off,
there's no function registerTempTable -- registerAsTable seems to be
enough to work (it's the same whether called directly on a SchemaRDD or
on a SQLContext that is passed the RDD). The problem I ran into after
that was reloading a table in one actor and referencing it in another.
The environment I had set up has two types of Akka actors, a Query and a
Refresher. They share a reference to the same sqlContext (passed in on
creation via Props(classOf[Actor], sqlContext)). The Refresher would
simply reload the parquet file and re-register the table:
    sqlContext
      .parquetFile(dataDir)
      .registerAsTable(tableName)
The WebService would query it:
    sqlContext.sql(query).collect()  // query is a SQL string that references tableName
This would break: the Refresher actor would work and be able to query the
table, but the Query actor would report that the table doesn't exist.
I have now removed the Refresher and updated the Query actor to refresh
its own table when it's stale.
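
A minimal sketch of that last arrangement, for illustration only. The
RunQuery message, the maxAgeMs parameter and the time-based staleness check
are assumptions (the actual staleness rule isn't described above); the
Spark calls are the same parquetFile / registerAsTable / sql calls already
shown.

```scala
import akka.actor.{Actor, Props}
import org.apache.spark.sql.SQLContext

// Hypothetical message: a SQL string that references tableName.
case class RunQuery(sql: String)

// Sketch of a Query actor that reloads its own table when it is stale,
// so registration and querying happen in the same actor.
// maxAgeMs and the age-based staleness rule are assumptions.
class QueryActor(sqlContext: SQLContext,
                 dataDir: String,
                 tableName: String,
                 maxAgeMs: Long) extends Actor {

  // Time of the last registration; starting at 0 makes the first query
  // trigger a load for any realistic maxAgeMs.
  private var lastLoadedMs = 0L

  private def refreshIfStale(): Unit = {
    val now = System.currentTimeMillis()
    if (now - lastLoadedMs > maxAgeMs) {
      // Same calls the Refresher used to make.
      sqlContext.parquetFile(dataDir).registerAsTable(tableName)
      lastLoadedMs = now
    }
  }

  def receive: Receive = {
    case RunQuery(sql) =>
      refreshIfStale()
      // Reply with the collected rows.
      sender ! sqlContext.sql(sql).collect()
  }
}

object QueryActor {
  // Mirrors the Props(classOf[...], sqlContext) style used above.
  def props(sqlContext: SQLContext, dataDir: String,
            tableName: String, maxAgeMs: Long): Props =
    Props(classOf[QueryActor], sqlContext, dataDir, tableName, maxAgeMs)
}
```

It would be created with something like
system.actorOf(QueryActor.props(sqlContext, dataDir, tableName, 60000L)),
where the 60-second limit is just a placeholder.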