Quickly reviewing the latest SQL Programming Guide <https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md> (on GitHub), I had a couple of quick questions:
1) Do we need to instantiate the SQLContext as per

// sc is an existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

Within the Spark 1.3 shell the sqlContext is already available, so we probably do not need to make this call there.

2) Importing org.apache.spark.sql._ should bring in the SQL data types, struct types, and Row:

// Import Spark SQL data types and Row.
import org.apache.spark.sql._

Currently with Spark 1.3 RC1, it appears org.apache.spark.sql._ only brings in Row:

scala> import org.apache.spark.sql._
import org.apache.spark.sql._

scala> val schema =
     |   StructType(
     |     schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))
<console>:25: error: not found: value StructType
           StructType(

But if I also import org.apache.spark.sql.types._ it works:

scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._

scala> val schema =
     |   StructType(
     |     schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))
schema: org.apache.spark.sql.types.StructType = StructType(StructField(DeviceMake,StringType,true), StructField(Country,StringType,true))

Wondering if this is by design or perhaps a quick documentation / package update is warranted.
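For reference, here is a minimal consolidated sketch of what the guide's schema example might look like against 1.3, assuming the shell's pre-created sqlContext and a placeholder schemaString (the field names just mirror the REPL output above):

// Row still comes from org.apache.spark.sql, but the type classes
// (StructType, StructField, StringType, ...) now live in the types package.
import org.apache.spark.sql._
import org.apache.spark.sql.types._

// In the 1.3 spark-shell, sqlContext is already defined; when running as a
// standalone application you would still create it from sc as the guide shows:
//   val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// Placeholder schema string for illustration only.
val schemaString = "DeviceMake Country"

// Build the schema programmatically from the space-separated field names.
val schema =
  StructType(
    schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))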