[ 
https://issues.apache.org/jira/browse/SPARK-25961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680925#comment-16680925
 ] 

Hyukjin Kwon commented on SPARK-25961:
--------------------------------------

Questions should go to mailing list. That seems to be intentionally disallowed 
because of non-deterministic behaviour. 

> Random numbers are not supported when handling data skew
> --------------------------------------------------------
>
>                 Key: SPARK-25961
>                 URL: https://issues.apache.org/jira/browse/SPARK-25961
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.1
>         Environment: spark on yarn 2.3.1
>            Reporter: zengxl
>            Priority: Major
>
> My SQL query uses two tables to join. One table join key has null value. I 
> use random value instead of {{null}};however, it has error. The error as 
> follows:
> {code}
> Error in query: nondeterministic expressions are only allowed in
> Project, Filter, Aggregate or Window, found
> scan spark source code is 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis check sql, because the 
> number of random variables is uncertain, it is prohibited
> {code}
> {code}
> case o if o.expressions.exists(!_.deterministic) &&
>  !o.isInstanceOf[Project] && !o.isInstanceOf[Filter] &&
>  !o.isInstanceOf[Aggregate] && !o.isInstanceOf[Window] =>
>  // The rule above is used to check Aggregate operator.
>  failAnalysis(
>  s"""nondeterministic expressions are only allowed in
> |Project, Filter, Aggregate or Window, found:|
> |${o.expressions.map(_.sql).mkString(",")}|
> |in operator ${operator.simpleString}
>  """.stripMargin)|
> {code}
>  
> Is it possible to add Join to this code? It's not yet tested.And whether 
> there will be other effects
> {code}
> case o if o.expressions.exists(!_.deterministic) &&
>  !o.isInstanceOf[Project] && !o.isInstanceOf[Filter] &&
>  !o.isInstanceOf[Aggregate] && !o.isInstanceOf[Window] +{color:#d04437}&& 
> !o.isInstanceOf[Join]{color}+ =>
>  // The rule above is used to check Aggregate operator.
>  failAnalysis(
>  s"""nondeterministic expressions are only allowed in
> |Project, Filter, Aggregate or Window or Join, found:|
> |${o.expressions.map(_.sql).mkString(",")}|
> |in operator ${operator.simpleString}
>  """.stripMargin)|
> {code}
> This is my SQL query is as below:
> {code}
> SELECT T1.CUST_NO AS CUST_NO,
>        T3.CON_LAST_NAME AS CUST_NAME,
>        T3.CON_SEX_MF AS SEX_CODE,
>        T3.X_POSITION AS POST_LV_CODE
> FROM tmp.ICT_CUST_RANGE_INFO T1
> LEFT JOIN tmp.F_CUST_BASE_INFO_ALL T3 ON CASE
>                                              WHEN coalesce(T1.CUST_NO, '') 
> ='' THEN concat('cust_no', RAND())
>                                              ELSE T1.CUST_NO
>                                          END = T3.BECIF
> AND T3.DATE='20181105'
> WHERE T1.DATE='20181105'
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to