Hi Silvio,

Ah, I like that. There is a section in Tableau for "Initial SQL" to be executed upon connecting; this would fit well there. I guess I will need to issue a collect(), a coalesce(1, true).saveAsTextFile(...), or use repartition(1), as the output is currently being broken into multiple part files. While this works in the spark-shell:
val test = sqlContext.jsonFile("/data/out/") // returns all parts back as one

it seems to fail in just spark-sql:

create temporary table test using org.apache.spark.sql.json options (path '/data/out/')
cache table test

with:

[Simba][SparkODBC] (35) Error from Spark: error code: '0' error message:
'org.apache.spark.sql.hive.HiveQl$ParseException: Failed to parse: create
temporary table test using org.apache.spark.sql.json options (path '/data/out/')
cache table test'. Initial SQL Error. Check that the syntax is correct and that
you have access privileges to the requested database.

Thanks again for the suggestion; I will work with it a bit more tomorrow.

-Todd

On Tue, Feb 10, 2015 at 5:48 PM, Silvio Fiorito <silvio.fior...@granturing.com> wrote:

> Hi Todd,
>
> What you could do is run some SparkSQL commands immediately after the
> Thrift server starts up. Or does Tableau have some init SQL commands you
> could run?
>
> You can actually load data using SQL, such as:
>
> create temporary table people using org.apache.spark.sql.json options
> (path 'examples/src/main/resources/people.json')
> cache table people
>
> create temporary table users using org.apache.spark.sql.parquet options
> (path 'examples/src/main/resources/users.parquet')
> cache table users
>
> From: Todd Nist
> Date: Tuesday, February 10, 2015 at 3:03 PM
> To: "user@spark.apache.org"
> Subject: SparkSQL + Tableau Connector
>
> Hi,
>
> I'm trying to understand how and what the Tableau connector to SparkSQL
> is able to access. My understanding is that it needs to connect to the
> thriftserver, and I am not sure how or if it exposes parquet, json,
> schemaRDDs, or whether it only exposes schemas defined in the metastore / Hive.
>
> For example, I do the following from the spark-shell, which generates a
> schemaRDD from a csv file and saves it as a JSON file as well as a parquet
> file.
> import org.apache.spark.sql.SQLContext
> import com.databricks.spark.csv._
>
> val sqlContext = new SQLContext(sc)
> val test = sqlContext.csvFile("/data/test.csv")
> test.toJSON.saveAsTextFile("/data/out")
> test.saveAsParquetFile("/data/out")
>
> When I connect from Tableau, the only thing I see is the "default"
> schema and nothing in the tables section.
>
> So my questions are:
>
> 1. Can the connector fetch or query schemaRDDs saved to Parquet or JSON
> files?
> 2. Do I need to do something to expose these via hive / metastore, other
> than creating a table in hive?
> 3. Does the thriftserver need to be configured to expose these in some
> fashion? (Related to question 2.)
>
> TIA for the assistance.
>
> -Todd
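One thing worth trying with the Initial SQL box (this is an assumption on my part, not something I have verified against the Simba driver): the ParseException earlier in this thread quotes both statements as a single string, so the create and the cache may be getting submitted together as one command. Separating them, for example with semicolons, might get each statement parsed on its own:

```sql
-- Hypothetical Initial SQL fragment; the path is the JSON output
-- directory from the example above. Whether the driver splits
-- statements on semicolons is an assumption.
create temporary table test
using org.apache.spark.sql.json
options (path '/data/out/');

cache table test;
```

If that still fails, issuing the two statements one at a time via beeline against the Thrift server would at least confirm the syntax itself is accepted.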
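On the multiple part files: a minimal spark-shell sketch of the repartition(1) route mentioned above, reusing the paths and the spark-csv loader from the earlier example. This assumes a spark-shell with the spark-csv package on the classpath (so `sc` is already defined); it is a sketch for the shell, not a standalone program.

```scala
import org.apache.spark.sql.SQLContext
import com.databricks.spark.csv._ // adds csvFile(...) to SQLContext

val sqlContext = new SQLContext(sc)
val test = sqlContext.csvFile("/data/test.csv")

// Collapse to a single partition so saveAsTextFile emits one
// part file instead of many.
test.toJSON.repartition(1).saveAsTextFile("/data/out")
```

repartition(1) is just shorthand for coalesce(1, shuffle = true), so either spelling should behave the same here. Note that jsonFile reads all the part files back as one anyway, so the single file mainly matters for tools that read the directory contents directly.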