That is a good question. In particular, names containing `.` are currently broken; this is tracked in SPARK-5632 <https://issues.apache.org/jira/browse/SPARK-5632>, which I'd like to fix.
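
To make the ambiguity concrete, here is a toy sketch in plain Scala of the two ways a string such as "b.b" could be read. It is not Spark's actual resolver, and the names `ColumnNameSketch`, `asQuotedIdentifier` and `asSqlIdentifier` are made up for illustration. It also assumes that backticks simply toggle quoting and that there is no escaping.

```scala
// Toy model only: this is NOT how Spark resolves columns.
// All names in this object are hypothetical.
object ColumnNameSketch {

  // Reading 1: the whole string is a single column name,
  // as though it had been written inside backticks.
  def asQuotedIdentifier(name: String): List[String] = List(name)

  // Reading 2: the string is parsed like a SQL identifier. An unquoted "."
  // separates name parts (e.g. struct field access), and backticks keep a
  // part from being split. Assumption: no escaped backticks.
  def asSqlIdentifier(name: String): List[String] = {
    val parts = scala.collection.mutable.ArrayBuffer.empty[String]
    val current = new StringBuilder
    var inBackticks = false
    name.foreach {
      case '`' =>
        inBackticks = !inBackticks
      case '.' if !inBackticks =>
        parts += current.toString
        current.clear()
      case c =>
        current.append(c)
    }
    parts += current.toString
    parts.toList
  }

  def main(args: Array[String]): Unit = {
    println(asQuotedIdentifier("b.b")) // List(b.b): one column named "b.b"
    println(asSqlIdentifier("b.b"))    // List(b, b): field "b" of column "b"
    println(asSqlIdentifier("`b.b`"))  // List(b.b): quoting restores reading 1
  }
}
```

Under the second reading, a lookup of "b.b" goes looking for a column "b" with a field "b", which would explain why a column literally named "b.b" cannot be resolved.
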
There is a more general question of whether strings that are passed to DataFrames should be treated as quoted identifiers (i.e. `as though they were in backticks`) or interpreted as normal identifiers in SQL. I've opened this JIRA to discuss further: SPARK-6865 <https://issues.apache.org/jira/browse/SPARK-6865>

On Fri, Apr 10, 2015 at 7:18 PM, Justin Yip <yipjus...@prediction.io> wrote:

> Hello,
>
> Are there any restriction in the column name? I tried to use ".", but
> sqlContext.sql cannot find the column. I would guess that "." is tricky as
> this affects accessing StructType, but are there any more restriction on
> column name?
>
> scala> case class A(a: Int)
> defined class A
>
> scala> sqlContext.createDataFrame(Seq(A(10), A(20))).withColumn("b.b",
> $"a" + 1)
> res19: org.apache.spark.sql.DataFrame = [a: int, b.b: int]
>
> scala> res19.registerTempTable("res19")
>
> scala> res19.select("a")
> res24: org.apache.spark.sql.DataFrame = [a: int]
>
> scala> res19.select("a", "b.b")
> org.apache.spark.sql.AnalysisException: cannot resolve 'b.b' given input
> columns a, b.b;
> at
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
> ....
>
>
> Thanks.
>
> Justin
>