Hi, thank you Liquan. I left out some information in my previous post.
I've solved the problem. I use the first line (the schema header) of the CSV file to generate the StructType and StructFields. However, the input file is UTF-8 Unicode (*with* BOM), so the first character of the file is #65279 (U+FEFF). As a result, the first field name has a leading #65279 character. In my query I used account_id, so SparkSQL could not find that field in the AST, where the name is actually #65279account_id.

The solution is to convert the input file to UTF-8 Unicode (*without* BOM), which removes the leading #65279. Everything works now.

Because #65279 is not printable, the bug was hard to find, and the error message made me think it was a SparkSQL problem. I really hope SparkSQL's exception messages can be made a little more explicit for developers.

Regards,
Hao

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-LEFT-JOIN-problem-tp16152p16277.html
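
P.S. For anyone who hits the same thing and cannot re-save the file without a BOM: a minimal sketch of dropping the leading U+FEFF from the header line before the field names are used to build a schema. This is plain Python rather than Spark code, it is not the code from my job, and the helper name header_fields is made up for illustration.

```python
import codecs

BOM = "\ufeff"  # U+FEFF, i.e. character #65279

def header_fields(header_line, sep=","):
    """Split a CSV header line into field names, dropping a leading BOM.

    Hypothetical helper for illustration only.
    """
    if header_line.startswith(BOM):
        header_line = header_line[len(BOM):]
    return [name.strip() for name in header_line.rstrip("\r\n").split(sep)]

# Simulate the first line of a UTF-8 file saved *with* a BOM.
raw = codecs.BOM_UTF8 + b"account_id,name\n"

# The plain 'utf-8' codec keeps the BOM as a character; the helper removes it.
print(header_fields(raw.decode("utf-8")))

# The 'utf-8-sig' codec strips the BOM while decoding, so no helper is needed.
print(raw.decode("utf-8-sig").startswith(BOM))
```

Either approach should leave the first field as account_id instead of #65279account_id. Decoding with 'utf-8-sig' is probably the simpler choice when you control how the header is read.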