[jira] [Updated] (SPARK-18455) General support for subquery processing

Herman van Hovell (JIRA) Wed, 16 Nov 2016 07:59:25 -0800

     [ 
https://issues.apache.org/jira/browse/SPARK-18455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Herman van Hovell updated SPARK-18455:
--------------------------------------
    Target Version/s: 2.2.0
         Description: 
Subquery support was introduced in Spark 2.0. The initial implementation 
covers the most common subquery use cases, for instance the ones used in TPC 
queries.

Spark currently supports the following subqueries:
* Uncorrelated Scalar Subqueries. All cases are supported.
* Correlated Scalar Subqueries. We only allow subqueries that are aggregated 
and use equality predicates.
* Predicate Subqueries. IN or EXISTS type of queries. We allow most predicates, 
except when they are pulled from under an Aggregate or Window operator. In that 
case we only support equality predicates.

However, this does not cover the full range of possible subqueries. This, in 
part, has to do with the fact that we currently rewrite all correlated 
subqueries into a (LEFT/LEFT SEMI/LEFT ANTI) join.
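
The join rewrite mentioned above can be illustrated with a small sketch. This is 
not Spark code: plain Python lists stand in for relations, and the table names 
(customers, orders) and all function names are hypothetical. It only shows why a 
correlated EXISTS, NOT EXISTS, or aggregated scalar subquery with an equality 
predicate gives the same rows as a LEFT SEMI, LEFT ANTI, or LEFT (outer) join.

```python
# Illustrative only: plain-Python rows stand in for relations.
customers = [{"id": 1}, {"id": 2}, {"id": 3}]
orders = [
    {"cust": 1, "amount": 10},
    {"cust": 1, "amount": 30},
    {"cust": 3, "amount": 5},
]


def exists_nested(outer, inner):
    # WHERE EXISTS (SELECT 1 FROM inner WHERE inner.cust = outer.id),
    # evaluated once per outer row.
    return [o for o in outer if any(i["cust"] == o["id"] for i in inner)]


def left_semi_join(outer, inner):
    # Same result as a join: keep outer rows that have at least one match.
    keys = {i["cust"] for i in inner}
    return [o for o in outer if o["id"] in keys]


def left_anti_join(outer, inner):
    # NOT EXISTS counterpart: keep outer rows that have no match.
    keys = {i["cust"] for i in inner}
    return [o for o in outer if o["id"] not in keys]


def scalar_nested(outer, inner):
    # SELECT id, (SELECT sum(amount) FROM inner WHERE inner.cust = outer.id)
    result = []
    for o in outer:
        matches = [i["amount"] for i in inner if i["cust"] == o["id"]]
        # SUM over zero rows is NULL, modelled here as None.
        result.append((o["id"], sum(matches) if matches else None))
    return result


def scalar_as_left_join(outer, inner):
    # Aggregate the inner side by the correlation key first,
    # then left outer join the totals back to the outer rows.
    totals = {}
    for i in inner:
        totals[i["cust"]] = totals.get(i["cust"], 0) + i["amount"]
    return [(o["id"], totals.get(o["id"])) for o in outer]
```

The sketch relies on the correlation predicate being an equality, which is what 
lets the inner side be keyed (or grouped) by the correlated column before the 
join.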

We currently lack support for the following use cases:
* The use of predicate subqueries in a projection.
* The use of non-equality predicates below Aggregate and/or Window operators.
* The use of non-Aggregate subqueries for correlated scalar subqueries.
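
To make the three gaps above more concrete, here is one possible query shape for 
each. SQLite (via Python's standard sqlite3 module) is used only as a convenient 
stand-in engine that accepts these shapes; the tables t1 and t2 and their 
contents are hypothetical, and nothing here is Spark output.

```python
import sqlite3

# In-memory database with two small hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (k INTEGER, v INTEGER);
    CREATE TABLE t2 (k INTEGER, v INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO t2 VALUES (1, 5), (2, 25);
""")

# 1. Predicate subquery used in a projection (SELECT list)
#    rather than in a WHERE clause.
in_projection = conn.execute("""
    SELECT t1.k, EXISTS (SELECT 1 FROM t2 WHERE t2.k = t1.k)
    FROM t1
    ORDER BY t1.k
""").fetchall()

# 2. Non-equality correlated predicate (t2.k < t1.k) sitting
#    below an aggregate (max).
non_equality = conn.execute("""
    SELECT t1.k
    FROM t1
    WHERE t1.v > (SELECT max(t2.v) FROM t2 WHERE t2.k < t1.k)
    ORDER BY t1.k
""").fetchall()

# 3. Correlated scalar subquery without an aggregate. It yields a
#    single value per outer row here only because t2.k is unique.
non_aggregate = conn.execute("""
    SELECT t1.k, (SELECT t2.v FROM t2 WHERE t2.k = t1.k)
    FROM t1
    ORDER BY t1.k
""").fetchall()

conn.close()
```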

This JIRA aims to lift these current limitations in subquery processing.

  was:Subquery was introduced to Spark in 2.0. With the support of the 
Subquery, a few queries in the TPC benchmarks are able to run without any 
syntax error. However, a number of limitations are also enforced in the 
feature. This JIRA is opened to track the progress of lifting those limitations 
in processing Subquery.


> General support for subquery processing
> ---------------------------------------
>
>                 Key: SPARK-18455
>                 URL: https://issues.apache.org/jira/browse/SPARK-18455
>             Project: Spark
>          Issue Type: Story
>          Components: SQL
>            Reporter: Nattavut Sutyanyong



