[ https://issues.apache.org/jira/browse/SPARK-21299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074241#comment-16074241 ]
jalendhar Baddam edited comment on SPARK-21299 at 7/5/17 4:51 AM:
------------------------------------------------------------------

We are still getting the issue.

    Dataset<Row> ds = spark.read().table("tab1");
    ds = ds.dropDuplicates("colname");
    Dataset<Row> ds1 = ds.limit(10);
    ds = ds.except(ds1); // this call throws the exception reported in this issue

I am using version 2.1.1:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.1</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.1.1</version>
        <scope>provided</scope>
    </dependency>

was (Author: jalendhar):
We are still getting the issue.

    Dataset<Row> ds = spark.read().table("tab1");
    ds = ds.dropDuplicates("colname");
    Dataset<Row> ds1 = ds.limit(10);
    ds = ds.except(ds1); // this call throws the exception reported in this issue

> except is throwing the fallowing exception after perform dropDuplicates on
> the Dataset object
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21299
>                 URL: https://issues.apache.org/jira/browse/SPARK-21299
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 2.1.0
>         Environment: spark 2.1.0
>            Reporter: jalendhar Baddam
>
> INFO: org.apache.spark.sql.AnalysisException: resolved attribute(s)
> test_customer_CustID#569 missing from
> test_customer_ROW_NUM#589L,test_customer_CustID#590,test_customer_Telephone#598L,test_customer_HouseholdID#593,test_customer_Gender#592,test_customer_Title#599,test_customer_Surname#597,test_customer_Occupation#596,test_customer_DOB#591,test_customer_Initials#595,test_customer_Income#594
> in operator !Filter (cast(test_customer_CustID#569 as double) > cast(1000 as
> double));;
> INFO: Except > INFO: :- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213, > test_customer_DOB#214, test_customer_Gender#215, > test_customer_HouseholdID#216, test_customer_Income#217, > test_customer_Initials#218, 
test_customer_Occupation#219, > test_customer_Surname#220, test_customer_Telephone#221L, > test_customer_Title#222] > INFO: : +- Sort [test_customer_ROW_NUM#212L ASC NULLS FIRST], true > INFO: : +- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213, > test_customer_DOB#214, test_customer_Gender#215, > test_customer_HouseholdID#216, test_customer_Income#217, > test_customer_Initials#218, test_customer_Occupation#219, > test_customer_Surname#220, test_customer_Telephone#221L, > test_customer_Title#222] > INFO: : +- SubqueryAlias 1922a657-80bd-41a5-8e1f-04a248263e47 > INFO: : +- Aggregate [test_customer_ROW_NUM#212L, > test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, > test_customer_HouseholdID#216, test_customer_Income#217, > test_customer_Initials#218, test_customer_Occupation#219, > test_customer_Surname#220, test_customer_Telephone#221L, > test_customer_Title#222], [test_customer_ROW_NUM#212L, > test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, > test_customer_HouseholdID#216, test_customer_Income#217, > test_customer_Initials#218, test_customer_Occupation#219, > test_customer_Surname#220, test_customer_Telephone#221L, > test_customer_Title#222] > INFO: : +- Project [test_customer_ROW_NUM#212L, > test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, > test_customer_HouseholdID#216, test_customer_Income#217, > test_customer_Initials#218, test_customer_Occupation#219, > test_customer_Surname#220, test_customer_Telephone#221L, > test_customer_Title#222] > INFO: : +- Project [test_customer_ROW_NUM#212L, > test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, > test_customer_HouseholdID#216, test_customer_Income#217, > test_customer_Initials#218, test_customer_Occupation#219, > test_customer_Surname#220, test_customer_Telephone#221L, > test_customer_Title#222] > INFO: : +- Aggregate [test_customer_Gender#215], > [first(test_customer_ROW_NUM#212L, false) AS 
test_customer_ROW_NUM#212L, > first(test_customer_CustID#213, false) AS test_customer_CustID#213, > first(test_customer_DOB#214, false) AS test_customer_DOB#214, > test_customer_Gender#215, first(test_customer_HouseholdID#216, false) AS > test_customer_HouseholdID#216, first(test_customer_Income#217, false) AS > test_customer_Income#217, first(test_customer_Initials#218, false) AS > test_customer_Initials#218, first(test_customer_Occupation#219, false) AS > test_customer_Occupation#219, first(test_customer_Surname#220, false) AS > test_customer_Surname#220, first(test_customer_Telephone#221L, false) AS > test_customer_Telephone#221L, first(test_customer_Title#222, false) AS > test_customer_Title#222] > INFO: : +- Project [test_customer_ROW_NUM#212L, > test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, > test_customer_HouseholdID#216, test_customer_Income#217, > test_customer_Initials#218, test_customer_Occupation#219, > test_customer_Surname#220, test_customer_Telephone#221L, > test_customer_Title#222] > INFO: : +- Filter (cast(test_customer_CustID#213 as > double) > cast(1000 as double)) > INFO: : +- Project [ROW_NUM#47L AS > test_customer_ROW_NUM#212L, CustID#48 AS test_customer_CustID#213, DOB#49 AS > test_customer_DOB#214, Gender#50 AS test_customer_Gender#215, HouseholdID#51 > AS test_customer_HouseholdID#216, Income#52 AS test_customer_Income#217, > Initials#53 AS test_customer_Initials#218, Occupation#54 AS > test_customer_Occupation#219, Surname#55 AS test_customer_Surname#220, > Telephone#56L AS test_customer_Telephone#221L, Title#57 AS > test_customer_Title#222] > INFO: : +- SubqueryAlias customer > INFO: : +- > Relation[ROW_NUM#47L,CustID#48,DOB#49,Gender#50,HouseholdID#51,Income#52,Initials#53,Occupation#54,Surname#55,Telephone#56L,Title#57] > parquet > INFO: +- Project [test_customer_ROW_NUM#568L, test_customer_CustID#569, > test_customer_DOB#570, test_customer_Gender#592, > test_customer_HouseholdID#571, 
test_customer_Income#572, > test_customer_Initials#573, test_customer_Occupation#574, > test_customer_Surname#575, test_customer_Telephone#576L, > test_customer_Title#577] > INFO: +- GlobalLimit 0 > INFO: +- LocalLimit 0 > INFO: +- Sort [test_customer_ROW_NUM#568L ASC NULLS FIRST], true > INFO: +- Project [test_customer_ROW_NUM#568L, > test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, > test_customer_HouseholdID#571, test_customer_Income#572, > test_customer_Initials#573, test_customer_Occupation#574, > test_customer_Surname#575, test_customer_Telephone#576L, > test_customer_Title#577] > INFO: +- SubqueryAlias 1922a657-80bd-41a5-8e1f-04a248263e47 > INFO: +- Aggregate [test_customer_ROW_NUM#568L, > test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, > test_customer_HouseholdID#571, test_customer_Income#572, > test_customer_Initials#573, test_customer_Occupation#574, > test_customer_Surname#575, test_customer_Telephone#576L, > test_customer_Title#577], [test_customer_ROW_NUM#568L, > test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, > test_customer_HouseholdID#571, test_customer_Income#572, > test_customer_Initials#573, test_customer_Occupation#574, > test_customer_Surname#575, test_customer_Telephone#576L, > test_customer_Title#577] > INFO: +- Project [test_customer_ROW_NUM#568L, > test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, > test_customer_HouseholdID#571, test_customer_Income#572, > test_customer_Initials#573, test_customer_Occupation#574, > test_customer_Surname#575, test_customer_Telephone#576L, > test_customer_Title#577] > INFO: +- Project [test_customer_ROW_NUM#568L, > test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, > test_customer_HouseholdID#571, test_customer_Income#572, > test_customer_Initials#573, test_customer_Occupation#574, > test_customer_Surname#575, test_customer_Telephone#576L, > test_customer_Title#577] > INFO: +- 
Project [test_customer_ROW_NUM#568L, > test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, > test_customer_HouseholdID#571, test_customer_Income#572, > test_customer_Initials#573, test_customer_Occupation#574, > test_customer_Surname#575, test_customer_Telephone#576L, > test_customer_Title#577] > INFO: +- Aggregate [test_customer_Gender#592], > [first(test_customer_ROW_NUM#568L, false) AS test_customer_ROW_NUM#568L, > first(test_customer_CustID#569, false) AS test_customer_CustID#569, > first(test_customer_DOB#570, false) AS test_customer_DOB#570, > test_customer_Gender#592, first(test_customer_HouseholdID#571, false) AS > test_customer_HouseholdID#571, first(test_customer_Income#572, false) AS > test_customer_Income#572, first(test_customer_Initials#573, false) AS > test_customer_Initials#573, first(test_customer_Occupation#574, false) AS > test_customer_Occupation#574, first(test_customer_Surname#575, false) AS > test_customer_Surname#575, first(test_customer_Telephone#576L, false) AS > test_customer_Telephone#576L, first(test_customer_Title#577, false) AS > test_customer_Title#577] > INFO: +- Project > [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, > test_customer_Gender#592, test_customer_HouseholdID#571, > test_customer_Income#572, test_customer_Initials#573, > test_customer_Occupation#574, test_customer_Surname#575, > test_customer_Telephone#576L, test_customer_Title#577] > INFO: +- !Project > [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, > test_customer_Gender#592, test_customer_HouseholdID#571, > test_customer_Income#572, test_customer_Initials#573, > test_customer_Occupation#574, test_customer_Surname#575, > test_customer_Telephone#576L, test_customer_Title#577] > INFO: +- !Filter > (cast(test_customer_CustID#569 as double) > cast(1000 as double)) > INFO: +- Project [ROW_NUM#47L AS > test_customer_ROW_NUM#589L, CustID#48 AS test_customer_CustID#590, DOB#49 AS > 
test_customer_DOB#591, Gender#50 AS test_customer_Gender#592, HouseholdID#51 > AS test_customer_HouseholdID#593, Income#52 AS test_customer_Income#594, > Initials#53 AS test_customer_Initials#595, Occupation#54 AS > test_customer_Occupation#596, Surname#55 AS test_customer_Surname#597, > Telephone#56L AS test_customer_Telephone#598L, Title#57 AS > test_customer_Title#599] > INFO: +- SubqueryAlias customer > INFO: +- > Relation[ROW_NUM#47L,CustID#48,DOB#49,Gender#50,HouseholdID#51,Income#52,Initials#53,Occupation#54,Surname#55,Telephone#56L,Title#57] > parquet > INFO: > INFO: at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:40) > INFO: at > org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:57) > INFO: at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:337) > INFO: at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:67) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:128) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > 
org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > 
org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127) > INFO: at scala.collection.immutable.List.foreach(List.scala:381) > INFO: at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) > INFO: at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:67) > INFO: at > org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:57) > INFO: at > org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:48) > INFO: at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63) > INFO: at > org.apache.spark.sql.Dataset.withSetOperator(Dataset.scala:2834) > INFO: at org.apache.spark.sql.Dataset.except(Dataset.scala:1652)
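
A note on reading the error message in this report: the failing !Filter refers to test_customer_CustID#569, while the Project directly beneath it outputs test_customer_CustID#590. The column name is the same but the numeric ID after the # differs, which is why the attribute is reported as "missing". The snippet below is only a toy model of that comparison, written in plain Java for illustration. The class name AttributeCheck and the method missingAttributes are hypothetical and are not part of Spark; the attribute strings are copied from the plan in this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: this is NOT Spark's CheckAnalysis code. It just mimics the
// "referenced attribute not in child output" comparison seen in the error.
public class AttributeCheck {

    // Returns the attributes an operator references that its child does not output.
    // Attributes are compared as whole "name#id" strings, so an ID mismatch counts as missing.
    static List<String> missingAttributes(List<String> referenced, List<String> childOutput) {
        Set<String> available = new HashSet<>(childOutput);
        List<String> missing = new ArrayList<>();
        for (String attr : referenced) {
            if (!available.contains(attr)) {
                missing.add(attr);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Attribute referenced by the failing !Filter in the report.
        List<String> referenced = Arrays.asList("test_customer_CustID#569");

        // Output of the Project directly below that Filter (IDs #589 to #599 in the report).
        List<String> childOutput = Arrays.asList(
            "test_customer_ROW_NUM#589L", "test_customer_CustID#590",
            "test_customer_DOB#591", "test_customer_Gender#592",
            "test_customer_HouseholdID#593", "test_customer_Income#594",
            "test_customer_Initials#595", "test_customer_Occupation#596",
            "test_customer_Surname#597", "test_customer_Telephone#598L",
            "test_customer_Title#599");

        // Same column name, different ID, so the attribute is listed as missing.
        System.out.println(missingAttributes(referenced, childOutput));
    }
}
```

Running main prints [test_customer_CustID#569], matching the attribute named in the AnalysisException. Comparing full name#id strings rather than names alone is deliberate here, since the name test_customer_CustID does appear in the child's output.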