Saurabh Santhosh created SPARK-14948:
----------------------------------------
Summary: Exception when joining DataFrames derived from the same DataFrame
Key: SPARK-14948
URL: https://issues.apache.org/jira/browse/SPARK-14948
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.6.0
Reporter: Saurabh Santhosh

h2. The Spark analyzer throws the following exception in a specific scenario:

h2. Exception:

org.apache.spark.sql.AnalysisException: resolved attribute(s) F1#3 missing from asd#5,F2#4,F1#6,F2#7 in operator !Project [asd#5,F1#3];
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)

h2. Code:

{code:title=SparkClient.java|borderStyle=solid}
StructField[] fields = new StructField[2];
fields[0] = new StructField("F1", DataTypes.StringType, true, Metadata.empty());
fields[1] = new StructField("F2", DataTypes.StringType, true, Metadata.empty());

JavaRDD<Row> rdd =
    sparkClient.getJavaSparkContext().parallelize(Arrays.asList(RowFactory.create("a", "b")));
DataFrame df = sparkClient.getSparkHiveContext().createDataFrame(rdd, new StructType(fields));

sparkClient.getSparkHiveContext().registerDataFrameAsTable(df, "t1");

DataFrame aliasedDf = sparkClient.getSparkHiveContext().sql("select F1 as asd, F2 from t1");

sparkClient.getSparkHiveContext().registerDataFrameAsTable(aliasedDf, "t2");
sparkClient.getSparkHiveContext().registerDataFrameAsTable(df, "t3");

// Join the aliased DataFrame back to the DataFrame it was derived from.
DataFrame join = aliasedDf.join(df, aliasedDf.col("F2").equalTo(df.col("F2")), "inner");

// Selecting columns from both sides of the join triggers the AnalysisException above.
DataFrame select = join.select(aliasedDf.col("asd"), df.col("F1"));

select.collect();
{code}

h2. Observations:

* This issue is related to the data type of the fields of the initial DataFrame: if the fields are not of type String, the query works.
* It works fine if the DataFrames are registered as temporary tables and the join is written as SQL (select a.asd, b.F1 from t2 a inner join t3 b on a.F2 = b.F2); see the sketch below.
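A minimal sketch of the SQL workaround from the second observation, assuming the same {{sparkClient}} helper and the temporary tables {{t2}} and {{t3}} registered in the reproduction code above:

{code:title=SqlWorkaround.java|borderStyle=solid}
// Workaround sketch: express the self-join as an SQL statement against the
// registered temporary tables instead of combining the two DataFrame references.
// "t2" is the aliased DataFrame and "t3" the original one, as registered above.
DataFrame sqlJoin = sparkClient.getSparkHiveContext()
    .sql("select a.asd, b.F1 from t2 a inner join t3 b on a.F2 = b.F2");

// collect() succeeds here, while join.select(...) above throws the AnalysisException.
sqlJoin.collect();
{code}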