[ 
https://issues.apache.org/jira/browse/SPARK-14781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261028#comment-15261028
 ] 

Frederick Reiss commented on SPARK-14781:
-----------------------------------------

I'm not so sure about Q45. Here's the template for Q45:

{noformat}
[_LIMITA] select [_LIMITB] ca_zip, [GBOBC], sum(ws_sales_price)
 from web_sales, customer, customer_address, date_dim, item
 where ws_bill_customer_sk = c_customer_sk
        and c_current_addr_sk = ca_address_sk 
        and ws_item_sk = i_item_sk 
        and ( substr(ca_zip,1,5) in ('85669', '86197','88274','83405','86475',
                                     '85392', '85460', '80348', '81792')
              or 
              i_item_id in (select i_item_id
                             from item
                             where i_item_sk in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
                             )
            )
        and ws_sold_date_sk = d_date_sk
        and d_qoy = [QOY] and d_year = [YEAR]
 group by ca_zip, [GBOBC]
 order by ca_zip, [GBOBC]
 [_LIMITC];
{noformat}
This query does contain a subquery inside a disjunction
({{... or i_item_id in (select ...}}), but that subquery is not correlated.
What Q45 needs is for the subquery to be added to the list of noncorrelated
subqueries evaluated in {{SparkPlan.waitForSubqueries()}}, with a placeholder
for its results inserted into the plan.
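
To illustrate the idea (this is only a toy model in plain Python, not Spark
code; the function names and sample rows are made up for the example): the
noncorrelated subquery is evaluated once, and its result set is then used as a
placeholder value inside the disjunction while filtering the joined rows.

```python
# Toy model only: lists of dicts stand in for tables.
# None of these names are Spark APIs; the sample data is invented.
item = [
    {"i_item_sk": 2, "i_item_id": "A"},
    {"i_item_sk": 4, "i_item_id": "B"},
    {"i_item_sk": 3, "i_item_id": "C"},
]
joined_rows = [
    {"ca_zip": "85669-1234", "i_item_id": "B", "ws_sales_price": 10.0},
    {"ca_zip": "99999", "i_item_id": "A", "ws_sales_price": 5.0},
    {"ca_zip": "99999", "i_item_id": "B", "ws_sales_price": 7.0},
]


def eval_uncorrelated_subquery(item_rows, item_sks):
    """Model of: select i_item_id from item where i_item_sk in (...).

    It does not depend on the outer row, so it can run once up front.
    """
    return {r["i_item_id"] for r in item_rows if r["i_item_sk"] in item_sks}


def filter_rows(rows, zips, subquery_result):
    """Evaluate the disjunction per row, using the precomputed result set."""
    return [
        r for r in rows
        if r["ca_zip"][:5] in zips or r["i_item_id"] in subquery_result
    ]


# Step 1: evaluate the subquery once; this value plays the placeholder role.
placeholder = eval_uncorrelated_subquery(
    item, {2, 3, 5, 7, 11, 13, 17, 19, 23, 29}
)
# Step 2: the outer filter only reads the placeholder value.
kept = filter_rows(joined_rows, {"85669", "86197"}, placeholder)
print([r["ws_sales_price"] for r in kept])
```

Because the subquery result is the same for every outer row, the disjunction
itself poses no problem in this case.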

Q10 and Q35 have correlated EXISTS subqueries inside disjunctions.
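
The correlated case is different because each outer row needs its own
true/false answer. A rough sketch of the SemiPlus idea from the issue
description quoted below, again as a toy Python model with made-up data and a
hypothetical {{semi_plus}} helper (not an existing Spark join type or API):

```python
# Toy model only: semi_plus is a hypothetical helper, not a Spark API.
customers = [{"c_customer_sk": 1}, {"c_customer_sk": 2}, {"c_customer_sk": 3}]
web_sales = [{"ws_bill_customer_sk": 1}]
catalog_sales = [{"cs_ship_customer_sk": 3}]


def semi_plus(left, right, left_key, right_key, out_col):
    """Keep every left row and add a boolean column telling whether
    at least one right row matches on the given keys."""
    right_keys = {r[right_key] for r in right}
    return [{**row, out_col: row[left_key] in right_keys} for row in left]


# Each correlated EXISTS becomes one extra boolean column.
with_web = semi_plus(
    customers, web_sales, "c_customer_sk", "ws_bill_customer_sk", "exists_web"
)
with_both = semi_plus(
    with_web, catalog_sales, "c_customer_sk", "cs_ship_customer_sk",
    "exists_catalog",
)

# EXISTS(x1) OR EXISTS(x2) is now an ordinary predicate over two columns.
result = [
    r["c_customer_sk"] for r in with_both
    if r["exists_web"] or r["exists_catalog"]
]
print(result)
```

No left row is dropped by the join itself; filtering happens only when the
disjunction over the added columns is evaluated.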

> Support subquery in nested predicates
> -------------------------------------
>
>                 Key: SPARK-14781
>                 URL: https://issues.apache.org/jira/browse/SPARK-14781
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Davies Liu
>
> Right now, we do not support nested IN/EXISTS subqueries, for example
> EXISTS(x1) OR EXISTS(x2).
> In order to do that, we could use an internal-only join type, SemiPlus, which
> will output every row from the left side, plus an additional column holding
> the result of the join condition. Then we could replace the EXISTS() or IN()
> with the result column.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

