[ https://issues.apache.org/jira/browse/SPARK-8621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602094#comment-14602094 ]
Reynold Xin commented on SPARK-8621:
------------------------------------

That's one way to do it. The other way is to consider allowing empty (non-null, but 0-char string) column names in analysis.

cc [~marmbrus] what do you think?

> crosstab exception when one of the value is empty
> -------------------------------------------------
>
> Key: SPARK-8621
> URL: https://issues.apache.org/jira/browse/SPARK-8621
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Reynold Xin
> Priority: Critical
>
> I think this happened because some value is empty.
> {code}
> scala> df1.stat.crosstab("role", "lang")
> org.apache.spark.sql.AnalysisException: syntax error in attribute name: ;
>   at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.parseAttributeName(LogicalPlan.scala:145)
>   at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveQuoted(LogicalPlan.scala:135)
>   at org.apache.spark.sql.DataFrame.resolve(DataFrame.scala:157)
>   at org.apache.spark.sql.DataFrame.col(DataFrame.scala:603)
>   at org.apache.spark.sql.DataFrameNaFunctions.org$apache$spark$sql$DataFrameNaFunctions$$fillCol(DataFrameNaFunctions.scala:394)
>   at org.apache.spark.sql.DataFrameNaFunctions$$anonfun$2.apply(DataFrameNaFunctions.scala:160)
>   at org.apache.spark.sql.DataFrameNaFunctions$$anonfun$2.apply(DataFrameNaFunctions.scala:157)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
>   at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:157)
>   at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:147)
>   at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:132)
>   at org.apache.spark.sql.execution.stat.StatFunctions$.crossTabulate(StatFunctions.scala:132)
>   at org.apache.spark.sql.DataFrameStatFunctions.crosstab(DataFrameStatFunctions.scala:91)
> {code}
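
The failure reported in this issue can be sketched as a spark-shell reproduction. This is only a sketch under assumptions: the report does not show the contents of `df1`, so the rows below are hypothetical. It also assumes the empty value sits in the "lang" column, which is consistent with the stack trace (crosstab calls `na.fill`, which looks up each result column by name).

```scala
// spark-shell sketch; `sqlContext` is the shell's predefined SQLContext.
import sqlContext.implicits._

// Hypothetical data: the second row has a 0-char (non-null) "lang" value.
val df1 = Seq(
  ("dev", "scala"),
  ("dev", ""),
  ("ops", "python")
).toDF("role", "lang")

// crosstab turns each distinct "lang" value into a column name, so the
// empty string ends up as a 0-char column name in the result.
df1.stat.crosstab("role", "lang")
```

On a build affected by this issue, the last call would be expected to fail with the `AnalysisException` quoted in the report, because the empty column name cannot be parsed as an attribute name.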