A workaround is difficult. You should consider merging this PR <https://github.com/apache/spark/pull/10128> into your Spark build.
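If patching Spark is not an option, one workaround that is sometimes suggested (a sketch only, untested against your schema) is to evaluate the nondeterministic CASE WHEN in a subquery before the join. The analyzer allows rand() inside a Project, so computing the randomized key as a column first, then joining on that column, avoids the restriction. The alias `join_key` below is a name introduced here for illustration:

```sql
-- Sketch: compute the skew-breaking key in a subquery (a Project, where
-- nondeterministic expressions are allowed), then join on the result.
SELECT a.col1
FROM (
  SELECT col1,
         CASE WHEN col2 IS NULL
              THEN cast(rand(9) * 1000 - 9999999999 as string)
              ELSE col2
         END AS join_key     -- hypothetical alias for the randomized key
  FROM tbl1
) a
LEFT OUTER JOIN tbl2 b
  ON a.join_key = b.col3;
```

The NULL keys still get spread across random non-matching values, so the skew mitigation is preserved, but the join condition itself stays deterministic.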
"wangshuang [via Apache Spark Developers List]" <ml+s1001551n2195...@n3.nabble.com> wrote at 2017-07-13 18:43:

> I'm trying to execute Hive SQL on Spark SQL (also on the Spark Thrift Server). To mitigate data skew, we use CASE WHEN to handle NULL join keys. A simplified query:
>
> SELECT a.col1
> FROM tbl1 a
> LEFT OUTER JOIN tbl2 b
>   ON CASE WHEN a.col2 IS NULL
>           THEN cast(rand(9) * 1000 - 9999999999 as string)
>           ELSE a.col2
>      END = b.col3;
>
> But I get the error:
>
> == Physical Plan ==
> org.apache.spark.sql.AnalysisException: nondeterministic expressions are only allowed in Project, Filter, Aggregate or Window, found:
> (((CASE WHEN (a.`nav_tcdt` IS NULL) THEN CAST(((rand(9) * CAST(1000 AS DOUBLE)) - CAST(9999999999L AS DOUBLE)) AS STRING) ELSE a.`nav_tcdt` END = c.`site_categ_id`) AND (CAST(a.`nav_tcd` AS INT) = 9)) AND (c.`cur_flag` = 1))
> in operator Join LeftOuter, (((CASE WHEN isnull(nav_tcdt#25) THEN cast(((rand(9) * cast(1000 as double)) - cast(9999999999 as double)) as string) ELSE nav_tcdt#25 END = site_categ_id#80) && (cast(nav_tcd#26 as int) = 9)) && (cur_flag#77 = 1)) ;;
> GlobalLimit 10
> +- LocalLimit 10
>    +- Aggregate [date_id#7, CASE WHEN (cast(city_id#10 as string) IN (cast(19596 as string),cast(20134 as string),cast(10997 as string)) && nav_tcdt#25 RLIKE ^[0-9]+$) THEN city_id#10 ELSE nav_tpa_id#21 END], [date_id#7]
>       +- Filter (date_id#7 = 2017-07-12)
>          +- Join LeftOuter, (((CASE WHEN isnull(nav_tcdt#25) THEN cast(((rand(9) * cast(1000 as double)) - cast(9999999999 as double)) as string) ELSE nav_tcdt#25 END = site_categ_id#80) && (cast(nav_tcd#26 as int) = 9)) && (cur_flag#77 = 1))
>             :- SubqueryAlias a
>             :  +- SubqueryAlias tmp_lifan_trfc_tpa_hive
>             :     +- CatalogRelation `tmp`.`tmp_lifan_trfc_tpa_hive`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [date_id#7, chanl_id#8L, pltfm_id#9, city_id#10, sessn_id#11, gu_id#12, nav_refer_page_type_id#13, nav_refer_page_value#14, nav_refer_tpa#15, nav_refer_tpa_id#16, nav_refer_tpc#17, nav_refer_tpi#18, nav_page_type_id#19, nav_page_value#20, nav_tpa_id#21, nav_tpa#22, nav_tpc#23, nav_tpi#24, nav_tcdt#25, nav_tcd#26, nav_tci#27, nav_tce#28, detl_refer_page_type_id#29, detl_refer_page_value#30, ... 33 more fields]
>             +- SubqueryAlias c
>                +- SubqueryAlias dim_site_categ_ext
>                   +- CatalogRelation `dw`.`dim_site_categ_ext`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [site_categ_skid#64L, site_categ_type#65, site_categ_code#66, site_categ_name#67, site_categ_parnt_skid#68L, site_categ_kywrd#69, leaf_flg#70L, sort_seq#71L, site_categ_srch_name#72, vsbl_flg#73, delet_flag#74, etl_batch_id#75L, updt_time#76, cur_flag#77, bkgrnd_categ_skid#78L, bkgrnd_categ_id#79L, site_categ_id#80, site_categ_parnt_id#81]
>
> Does Spark SQL not support CASE WHEN syntax in a JOIN condition? Additionally, my Spark version is 2.2.0.
>
> Any help would be greatly appreciated.