[ https://issues.apache.org/jira/browse/SPARK-10838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li closed SPARK-10838.
---------------------------
    Resolution: Duplicate

> Repeat to join one DataFrame twice, there will be AnalysisException.
> --------------------------------------------------------------------
>
>                 Key: SPARK-10838
>                 URL: https://issues.apache.org/jira/browse/SPARK-10838
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1
>            Reporter: Yun Zhao
>
> The exception is:
> {quote}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: resolved attribute(s) col_a#1 missing from col_a#0,col_b#2,col_a#3,col_b#4 in operator !Join Inner, Some((col_b#2 = col_a#1));
>     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37)
>     at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
>     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:154)
>     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49)
>     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103)
>     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:49)
>     at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44)
>     at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:908)
>     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:132)
>     at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$logicalPlanToDataFrame(DataFrame.scala:154)
>     at org.apache.spark.sql.DataFrame.join(DataFrame.scala:554)
>     at org.apache.spark.sql.DataFrame.join(DataFrame.scala:521)
> {quote}
> The code that reproduces it:
> {quote}
> import org.apache.spark.sql.SQLContext
> import org.apache.spark.{SparkContext, SparkConf}
>
> object DFJoinTest extends App {
>   case class Foo(col_a: String)
>   case class Bar(col_a: String, col_b: String)
>
>   val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("DFJoinTest"))
>   val sqlContext = new SQLContext(sc)
>   import sqlContext.implicits._
>
>   val df1 = sc.parallelize(Array("1")).map(_.split(",")).map(p => Foo(p(0))).toDF()
>   val df2 = sc.parallelize(Array("1,1")).map(_.split(",")).map(p => Bar(p(0), p(1))).toDF()
>   val df3 = df1.join(df2, df1("col_a") === df2("col_a")).select(df1("col_a"), $"col_b")
>
>   df3.join(df2, df3("col_b") === df2("col_a")).show()
>   // val df4 = df2.as("df4")
>   // df3.join(df4, df3("col_b") === df4("col_a")).show()
>   // df3.join(df2.as("df4"), df3("col_b") === $"df4.col_a").show()
>
>   sc.stop()
> }
> {quote}
> When using
> {quote}
> val df4 = df2.as("df4")
> df3.join(df4, df3("col_b") === df4("col_a")).show()
> {quote}
> the error occurs, but when using
> {quote}
> df3.join(df2.as("df4"), df3("col_b") === $"df4.col_a").show()
> {quote}
> it works.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
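Editorial note on the two variants in the report: they appear to differ only in how the right-hand columns are referenced. `df2.as("df4")` wraps `df2` in an alias but its output columns keep `df2`'s expression IDs, so `df4("col_a")` resolves to the same attribute as `df2("col_a")`, which is no longer unambiguously available after the first join; the string-qualified reference `$"df4.col_a"` is instead resolved by the analyzer through the alias name. A minimal sketch of the working pattern (assuming the same local `SQLContext` setup and the `df2`/`df3` DataFrames from the reproduction above):

```scala
// Sketch only, reusing df2 and df3 from the reproduction above.
// Alias the right-hand side and refer to its columns by the
// alias-qualified string name, so the analyzer resolves them through
// the alias "df4" rather than through df2's original attribute IDs
// (which became ambiguous after the first join).
val fixed = df3.join(df2.as("df4"), df3("col_b") === $"df4.col_a")
fixed.show()
```

This is the same workaround the reporter found; newer Spark versions also accept a join on a string column name (e.g. `df3.join(df2, Seq("col_a"))` for equi-joins), which sidesteps attribute-ID ambiguity entirely.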