[ https://issues.apache.org/jira/browse/SPARK-14948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Saurabh Santhosh updated SPARK-14948:
-------------------------------------
Description:

h2. The Spark analyzer throws the following exception in a specific scenario:

h2. Exception:
org.apache.spark.sql.AnalysisException: resolved attribute(s) F1#3 missing from asd#5,F2#4,F1#6,F2#7 in operator !Project [asd#5,F1#3];
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)

h2. Code:
{code:title=SparkClient.java|borderStyle=solid}
// Build a DataFrame with two nullable String columns, F1 and F2.
StructField[] fields = new StructField[2];
fields[0] = new StructField("F1", DataTypes.StringType, true, Metadata.empty());
fields[1] = new StructField("F2", DataTypes.StringType, true, Metadata.empty());
JavaRDD<Row> rdd = sparkClient.getJavaSparkContext()
    .parallelize(Arrays.asList(RowFactory.create("a", "b")));
DataFrame df = sparkClient.getSparkHiveContext().createDataFrame(rdd, new StructType(fields));
sparkClient.getSparkHiveContext().registerDataFrameAsTable(df, "t1");

// Derive a second DataFrame from the same source, renaming F1 to asd.
DataFrame aliasedDf = sparkClient.getSparkHiveContext().sql("select F1 as asd, F2 from t1");
sparkClient.getSparkHiveContext().registerDataFrameAsTable(aliasedDf, "t2");
sparkClient.getSparkHiveContext().registerDataFrameAsTable(df, "t3");

// Join the derived DataFrame back to the original and select one column from
// each side; collect() triggers the AnalysisException above.
DataFrame join = aliasedDf.join(df, aliasedDf.col("F2").equalTo(df.col("F2")), "inner");
DataFrame select = join.select(aliasedDf.col("asd"), df.col("F1"));
select.collect();
{code}

h2. Observations:
* The issue depends on the data type of the initial DataFrame's fields: if the data type is not String, the query works.
* The same query works if the DataFrames are registered as temporary tables and the join is expressed in SQL ({{select a.asd, b.F1 from t2 a inner join t3 b on a.F2 = b.F2}}).


> Exception when joining DataFrames derived from the same DataFrame
> -----------------------------------------------------------------
>
>                 Key: SPARK-14948
>                 URL: https://issues.apache.org/jira/browse/SPARK-14948
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Saurabh Santhosh


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
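The second observation can be turned into a concrete workaround. A minimal sketch against the same Spark 1.6 Java API, reusing the reporter's {{sparkClient}} and the temporary tables {{t2}} and {{t3}} registered in the reproduction code; the SQL form is the one the reporter confirmed to work, since the analyzer resolves both sides of the join by table alias rather than by the ambiguous attribute ids:

{code:title=Workaround.java|borderStyle=solid}
// Workaround (per the observations): express the self-join in SQL against the
// registered temporary tables instead of using DataFrame.col() references,
// which carry stale attribute ids after the join re-resolves the plan.
DataFrame viaSql = sparkClient.getSparkHiveContext().sql(
    "select a.asd, b.F1 from t2 a inner join t3 b on a.F2 = b.F2");
viaSql.collect();
{code}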