Hi, I am exploring SparkSQL 1.1.0 and I have a problem with LEFT JOIN.
Here is the request:

    select * from customer left join profile on customer.account_id = profile.account_id

The schemas of the two tables are as follows:

    // Table: customer
    root
     |-- account_id: string (nullable = false)
     |-- birthday: string (nullable = true)
     |-- preferstore: string (nullable = true)
     |-- registstore: string (nullable = true)
     |-- gender: string (nullable = true)
     |-- city_name_en: string (nullable = true)
     |-- register_date: string (nullable = true)
     |-- zip: string (nullable = true)

    // Table: profile
    root
     |-- account_id: string (nullable = false)
     |-- card_type: string (nullable = true)
     |-- card_upgrade_time_black: string (nullable = true)
     |-- card_upgrade_time_gold: string (nullable = true)

However, I always get an exception:

    Exception in thread "main" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: *, tree:
    Project [*]
     Join LeftOuter, Some(('customer.account_id = 'profile.account_id))
      Subquery customer
       SparkLogicalPlan (ExistingRdd [account_id#0,birthday#1,preferstore#2,registstore#3,gender#4,city_name_en#5,register_date#6,zip#7], MappedRDD[5] at map at SQLFetcher.scala:43)
      Subquery profile
       SparkLogicalPlan (ExistingRdd [account_id#8,card_type#9,card_upgrade_time_black#10,card_upgrade_time_gold#11], MappedRDD[12] at map at SQLFetcher.scala:43)

I was not sure where the problem was, so I created two simple tables to isolate it.

    // table 1
    a b c
    4 8 9
    1 3 4
    3 4 5

    // table 2
    a b c
    1 2 3
    4 5 6

This time it works, so the problem might be in the data. I then sampled some lines of the input tables to create new ones. This also works. I am so confused. The problem appears to be in the data, but the error message is not enough to find it (unless I am missing something).

Here are some lines of the sampled tables:

    // Table: customer
    [50660,1975-06-05 00:00:00.000,13,12,male,ningboshi,2006-12-14 00:00:00.000,]
    [50666,1984-02-23 00:00:00.000,72,5,Female,beijingshi,2006-12-14 00:00:00.000,100086]
    [50680,1976-11-25 00:00:00.000,59,5,Female,beijingshi,2006-12-14 00:00:00.000,100022]
    [85,1971-03-27 00:00:00.000,2,2,Female,shanghaishi,2005-09-20 00:00:00.000,200336]

    // Table: profile
    [1144681,3,2010-02-18 00:00:00.000,2013-02-28 00:00:00.000]
    [50666,2,2010-10-31 00:00:00.000,]
    [3930657,1,,]
    [1056365,2,2009-12-29 00:00:00.000,]

Any help? =)

Hao

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-LEFT-JOIN-problem-tp16152.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.