jalendhar Baddam created SPARK-21299:
----------------------------------------
Summary: except is throwing the following exception after performing dropDuplicates on the Dataset object
Key: SPARK-21299
URL: https://issues.apache.org/jira/browse/SPARK-21299
Project: Spark
Issue Type: Bug
Components: Java API
Affects Versions: 2.1.0
Environment: spark 2.1.0
Reporter: jalendhar Baddam

INFO: 2017-07-04 08:28:03 ERROR BdaProcessor:74 - Exception
INFO: org.apache.spark.sql.AnalysisException: resolved attribute(s) test_customer_CustID#569 missing from test_customer_ROW_NUM#589L,test_customer_CustID#590,test_customer_Telephone#598L,test_customer_HouseholdID#593,test_customer_Gender#592,test_customer_Title#599,test_customer_Surname#597,test_customer_Occupation#596,test_customer_DOB#591,test_customer_Initials#595,test_customer_Income#594 in operator !Filter (cast(test_customer_CustID#569 as double) > cast(1000 as double));;
INFO: Except
INFO: :- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, test_customer_HouseholdID#216, test_customer_Income#217, test_customer_Initials#218, test_customer_Occupation#219, test_customer_Surname#220, test_customer_Telephone#221L, test_customer_Title#222]
INFO: :  +- Sort [test_customer_ROW_NUM#212L ASC NULLS FIRST], true
INFO: :     +- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, test_customer_HouseholdID#216, test_customer_Income#217, test_customer_Initials#218, test_customer_Occupation#219, test_customer_Surname#220, test_customer_Telephone#221L, test_customer_Title#222]
INFO: :        +- SubqueryAlias 1922a657-80bd-41a5-8e1f-04a248263e47
INFO: :           +- Aggregate [test_customer_ROW_NUM#212L, test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, test_customer_HouseholdID#216, test_customer_Income#217, test_customer_Initials#218, test_customer_Occupation#219, test_customer_Surname#220, test_customer_Telephone#221L, test_customer_Title#222], [test_customer_ROW_NUM#212L, test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, test_customer_HouseholdID#216, test_customer_Income#217, test_customer_Initials#218, test_customer_Occupation#219, test_customer_Surname#220, test_customer_Telephone#221L, test_customer_Title#222]
INFO: :              +- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, test_customer_HouseholdID#216, test_customer_Income#217, test_customer_Initials#218, test_customer_Occupation#219, test_customer_Surname#220, test_customer_Telephone#221L, test_customer_Title#222]
INFO: :                 +- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, test_customer_HouseholdID#216, test_customer_Income#217, test_customer_Initials#218, test_customer_Occupation#219, test_customer_Surname#220, test_customer_Telephone#221L, test_customer_Title#222]
INFO: :                    +- Aggregate [test_customer_Gender#215], [first(test_customer_ROW_NUM#212L, false) AS test_customer_ROW_NUM#212L, first(test_customer_CustID#213, false) AS test_customer_CustID#213, first(test_customer_DOB#214, false) AS test_customer_DOB#214, test_customer_Gender#215, first(test_customer_HouseholdID#216, false) AS test_customer_HouseholdID#216, first(test_customer_Income#217, false) AS test_customer_Income#217, first(test_customer_Initials#218, false) AS test_customer_Initials#218, first(test_customer_Occupation#219, false) AS test_customer_Occupation#219, first(test_customer_Surname#220, false) AS test_customer_Surname#220, first(test_customer_Telephone#221L, false) AS test_customer_Telephone#221L, first(test_customer_Title#222, false) AS test_customer_Title#222]
INFO: :                       +- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215, test_customer_HouseholdID#216, test_customer_Income#217, test_customer_Initials#218, test_customer_Occupation#219, test_customer_Surname#220, test_customer_Telephone#221L, test_customer_Title#222]
INFO: :                          +- Filter (cast(test_customer_CustID#213 as double) > cast(1000 as double))
INFO: :                             +- Project [ROW_NUM#47L AS test_customer_ROW_NUM#212L, CustID#48 AS test_customer_CustID#213, DOB#49 AS test_customer_DOB#214, Gender#50 AS test_customer_Gender#215, HouseholdID#51 AS test_customer_HouseholdID#216, Income#52 AS test_customer_Income#217, Initials#53 AS test_customer_Initials#218, Occupation#54 AS test_customer_Occupation#219, Surname#55 AS test_customer_Surname#220, Telephone#56L AS test_customer_Telephone#221L, Title#57 AS test_customer_Title#222]
INFO: :                                +- SubqueryAlias customer
INFO: :                                   +- Relation[ROW_NUM#47L,CustID#48,DOB#49,Gender#50,HouseholdID#51,Income#52,Initials#53,Occupation#54,Surname#55,Telephone#56L,Title#57] parquet
INFO: +- Project [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577]
INFO:    +- GlobalLimit 0
INFO:       +- LocalLimit 0
INFO:          +- Sort [test_customer_ROW_NUM#568L ASC NULLS FIRST], true
INFO:             +- Project [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577]
INFO:                +- SubqueryAlias 1922a657-80bd-41a5-8e1f-04a248263e47
INFO:                   +- Aggregate [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577], [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577]
INFO:                      +- Project [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577]
INFO:                         +- Project [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577]
INFO:                            +- Project [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577]
INFO:                               +- Aggregate [test_customer_Gender#592], [first(test_customer_ROW_NUM#568L, false) AS test_customer_ROW_NUM#568L, first(test_customer_CustID#569, false) AS test_customer_CustID#569, first(test_customer_DOB#570, false) AS test_customer_DOB#570, test_customer_Gender#592, first(test_customer_HouseholdID#571, false) AS test_customer_HouseholdID#571, first(test_customer_Income#572, false) AS test_customer_Income#572, first(test_customer_Initials#573, false) AS test_customer_Initials#573, first(test_customer_Occupation#574, false) AS test_customer_Occupation#574, first(test_customer_Surname#575, false) AS test_customer_Surname#575, first(test_customer_Telephone#576L, false) AS test_customer_Telephone#576L, first(test_customer_Title#577, false) AS test_customer_Title#577]
INFO:                                  +- Project [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577]
INFO:                                     +- !Project [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592, test_customer_HouseholdID#571, test_customer_Income#572, test_customer_Initials#573, test_customer_Occupation#574, test_customer_Surname#575, test_customer_Telephone#576L, test_customer_Title#577]
INFO:                                        +- !Filter (cast(test_customer_CustID#569 as double) > cast(1000 as double))
INFO:                                           +- Project [ROW_NUM#47L AS test_customer_ROW_NUM#589L, CustID#48 AS test_customer_CustID#590, DOB#49 AS test_customer_DOB#591, Gender#50 AS test_customer_Gender#592, HouseholdID#51 AS test_customer_HouseholdID#593, Income#52 AS test_customer_Income#594, Initials#53 AS test_customer_Initials#595, Occupation#54 AS test_customer_Occupation#596, Surname#55 AS test_customer_Surname#597, Telephone#56L AS test_customer_Telephone#598L, Title#57 AS test_customer_Title#599]
INFO:                                              +- SubqueryAlias customer
INFO:                                                 +- Relation[ROW_NUM#47L,CustID#48,DOB#49,Gender#50,HouseholdID#51,Income#52,Initials#53,Occupation#54,Surname#55,Telephone#56L,Title#57] parquet
INFO:
INFO:     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:40)
INFO:     at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:57)
INFO:     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:337)
INFO:     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:67)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:128)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
INFO:     at scala.collection.immutable.List.foreach(List.scala:381)
INFO:     at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
INFO:     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:67)
INFO:     at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:57)
INFO:     at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:48)
INFO:     at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63)
INFO:     at org.apache.spark.sql.Dataset.withSetOperator(Dataset.scala:2834)
INFO:     at org.apache.spark.sql.Dataset.except(Dataset.scala:1652)

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
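
A note on reading the exception message in the log above: the analyzer appears to report the attributes that an operator references but that its child does not output. On the right-hand side of the Except, the failing !Filter still references test_customer_CustID#569, while the Project directly below it now produces test_customer_CustID#590. The snippet below is not Spark code. It is a minimal plain-Java sketch, with a hypothetical class name and helper (MissingAttributeSketch, missingInput), that assumes the check amounts to "references minus child output" and applies it to the attribute IDs copied from the log.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Spark.
class MissingAttributeSketch {

    // Returns the attributes an operator references that its child does not output.
    static Set<String> missingInput(List<String> references, List<String> childOutput) {
        Set<String> missing = new LinkedHashSet<>(references);
        missing.removeAll(childOutput);
        return missing;
    }

    public static void main(String[] args) {
        // Attribute referenced by the failing !Filter, copied from the log.
        List<String> filterReferences = Arrays.asList("test_customer_CustID#569");

        // Output of the Project directly below that Filter, copied from the log.
        List<String> childOutput = Arrays.asList(
                "test_customer_ROW_NUM#589L", "test_customer_CustID#590",
                "test_customer_DOB#591", "test_customer_Gender#592",
                "test_customer_HouseholdID#593", "test_customer_Income#594",
                "test_customer_Initials#595", "test_customer_Occupation#596",
                "test_customer_Surname#597", "test_customer_Telephone#598L",
                "test_customer_Title#599");

        // Prints [test_customer_CustID#569], the attribute named in the exception.
        System.out.println(missingInput(filterReferences, childOutput));
    }
}
```

Run against the left-hand side of the plan, where the Filter references test_customer_CustID#213 and the Project below it also outputs #213, the same helper returns an empty set, which is consistent with that branch carrying no "!" markers.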