Hi all, I have a Scala project with multiple files: a main file and a file with utility functions on DataFrames. However, using $"colname" to refer to a column of a DataFrame in the utils file (see code below) produces the following compile-time error:
"value $ is not a member of StringContext" My utils code works fine if (I work in spark-shell or): - I pass sqlContext as a parameter to each util function - I do import sqlContext.implicits._ inside each util function (as below) (that solution seems ugly and onerous to me?) My questions: 1. Shouldn't implicits be part of a companion object (e.g. of the object SQLContext), rather than (singleton) class instance sqlContext? If implicits are part of the companion object, they could be defined as imports at the top of each file? 2. Where can I put import sqlContext.implicits._ in order not to invoke it in every function? 3. Googling, I saw Scala 2.11 might solve this problem? But won't that cause possible compatibility problems with different jars in 2.10? (I'd rather stick with 2.10) Many thanks for any suggestions and insights! Kristina My toy code (my use case is data munging in preparation for ml/mllib and I wanted to separate preprocessing of data to another file): Hello.scala: import HelloUtils._ object HelloWorld { val conf = new SparkConf().setAppName("HelloDataFrames") val sc = new SparkContext(conf) val sqlContext = new SQLContext(sc) import sqlContext.implicits._ case class RecordTest(name:String, category:String, age:Int) val adf = sc.parallelize(Seq( RecordTest("a", "cat1", 1), RecordTest("b", "cat2", 5) )).toDF test(adf, sqlContext) // calling function in HelloUtils } HelloUtils.scala: import org.apache.spark.sql.{DataFrame,SQLContext} object HelloUtils{ def test(adf:DataFrame, sqlContext:SQLContext) = { import sqlContext.implicits._ // I want to get rid of this line adf.filter( $"name" === "a").show() } /// desired way of writing test() function def testDesired(adf:DataFrame) = adf.filter( $"name" === "a").show() }