1. Can the connector fetch or query SchemaRDDs saved to Parquet or JSON files? No.

2. Do I need to do something to expose these via Hive / the metastore other than creating a table in Hive? Create a table in Spark SQL to expose them via Spark SQL.

3. Does the thriftserver need to be configured to expose these in some fashion? (Related to question 2.) You would need to configure the thrift server to read from the metastore you expect it to read from - by default it reads from the metastore_db directory present in the directory used to launch the thrift server.

On 11 Feb 2015 01:35, "Todd Nist" <tsind...@gmail.com> wrote:
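To illustrate answer 2, here is a minimal sketch of registering the saved Parquet output in the Hive metastore so the thrift server (and hence Tableau) can see it. This assumes a Spark build with Hive support; the table name `test_parquet` and the path are illustrative, and `STORED AS PARQUET` assumes the Hive 0.13+ bundled with recent Spark releases:

```scala
// Sketch only: tables registered with registerTempTable() on a plain
// SQLContext live only in that session; the thrift server exposes tables
// that exist in the Hive metastore, so use a HiveContext instead.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)

// Register the saved Parquet output as an external table in the metastore.
hiveContext.sql("""
  CREATE EXTERNAL TABLE IF NOT EXISTS test_parquet
  STORED AS PARQUET
  LOCATION '/data/out'
""")
```

Per answer 3, the thrift server must also be pointed at the same metastore - either the same hive-site.xml on its classpath, or (with the default embedded Derby metastore) launched from the same working directory that contains metastore_db.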
> Hi,
>
> I'm trying to understand how and what the Tableau connector to SparkSQL is
> able to access. My understanding is it needs to connect to the
> thriftserver, and I am not sure how or if it exposes parquet, json,
> SchemaRDDs, or does it only expose schemas defined in the metastore / hive.
>
> For example, I do the following from the spark-shell, which generates a
> SchemaRDD from a csv file and saves it as a JSON file as well as a parquet
> file.
>
> import org.apache.spark.sql.SQLContext
> import com.databricks.spark.csv._
> val sqlContext = new SQLContext(sc)
> val test = sqlContext.csvFile("/data/test.csv")
> test.toJSON.saveAsTextFile("/data/out")
> test.saveAsParquetFile("/data/out")
>
> When I connect from Tableau, the only thing I see is the "default" schema
> and nothing in the tables section.
>
> So my questions are:
>
> 1. Can the connector fetch or query SchemaRDDs saved to Parquet or JSON
> files?
> 2. Do I need to do something to expose these via hive / metastore other
> than creating a table in hive?
> 3. Does the thriftserver need to be configured to expose these in some
> fashion, sort of related to question 2.
>
> TIA for the assistance.
>
> -Todd