[ https://issues.apache.org/jira/browse/SPARK-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821983#comment-15821983 ]
Apache Spark commented on SPARK-17237:
--------------------------------------

User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/16565

> DataFrame fill after pivot causing org.apache.spark.sql.AnalysisException
> -------------------------------------------------------------------------
>
>                 Key: SPARK-17237
>                 URL: https://issues.apache.org/jira/browse/SPARK-17237
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Jiang Qiqi
>              Labels: newbie
>
> I am trying to run a pivot transformation that I previously ran on a Spark 1.6 cluster, namely:
> sc.parallelize(Seq((2,3,4), (3,4,5))).toDF("a", "b", "c")
> res1: org.apache.spark.sql.DataFrame = [a: int, b: int, c: int]
> scala> res1.groupBy("a").pivot("b").agg(count("c"), avg("c")).na.fill(0)
> res2: org.apache.spark.sql.DataFrame = [a: int, 3_count(c): bigint, 3_avg(c): double, 4_count(c): bigint, 4_avg(c): double]
> scala> res1.groupBy("a").pivot("b").agg(count("c"), avg("c")).na.fill(0).show
> +---+----------+--------+----------+--------+
> |  a|3_count(c)|3_avg(c)|4_count(c)|4_avg(c)|
> +---+----------+--------+----------+--------+
> |  2|         1|     4.0|         0|     0.0|
> |  3|         0|     0.0|         1|     5.0|
> +---+----------+--------+----------+--------+
> After upgrading the environment to Spark 2.0, I get an error when calling the .na.fill method:
> scala> sc.parallelize(Seq((2,3,4), (3,4,5))).toDF("a", "b", "c")
> res3: org.apache.spark.sql.DataFrame = [a: int, b: int ... 1 more field]
> scala> res3.groupBy("a").pivot("b").agg(count("c"), avg("c")).na.fill(0)
> org.apache.spark.sql.AnalysisException: syntax error in attribute name: `3_count(`c`)`;
>   at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.e$1(unresolved.scala:103)
>   at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.parseAttributeName(unresolved.scala:113)
>   at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveQuoted(LogicalPlan.scala:168)
>   at org.apache.spark.sql.Dataset.resolve(Dataset.scala:218)
>   at org.apache.spark.sql.Dataset.col(Dataset.scala:921)
>   at org.apache.spark.sql.DataFrameNaFunctions.org$apache$spark$sql$DataFrameNaFunctions$$fillCol(DataFrameNaFunctions.scala:411)
>   at org.apache.spark.sql.DataFrameNaFunctions$$anonfun$2.apply(DataFrameNaFunctions.scala:162)
>   at org.apache.spark.sql.DataFrameNaFunctions$$anonfun$2.apply(DataFrameNaFunctions.scala:159)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
>   at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:159)
>   at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:149)
>   at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:134)