[ 
https://issues.apache.org/jira/browse/SPARK-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated SPARK-4226:
------------------------------
    Description: 
I have a test table defined in Hive as follows:
{code:sql}
CREATE TABLE sparkbug (
  customerid INT,
  event STRING
) STORED AS PARQUET;
{code}
and inserted some sample data with ids 1, 2, and 3.
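For concreteness, the rows could be loaded with something like the following (a hypothetical example; the event values are made up, and INSERT ... VALUES requires Hive 0.14+ — any equivalent INSERT or LOAD reproduces the issue):
{code:sql}
INSERT INTO TABLE sparkbug VALUES
  (1, 'click'),
  (2, 'view'),
  (3, 'purchase');
{code}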

In a Spark shell, I then create a HiveContext and execute the following HQL to test out subquery predicates:
{code}
import org.apache.spark.sql.hive.HiveContext

val hc = new HiveContext(sc)  // sc is the SparkContext provided by the Spark shell
hc.hql("select customerid from sparkbug where customerid in (select customerid from sparkbug where customerid in (2,3))")
{code}
I get the following error:
{noformat}
java.lang.RuntimeException: Unsupported language features in query: select customerid from sparkbug where customerid in (select customerid from sparkbug where customerid in (2,3))
TOK_QUERY
  TOK_FROM
    TOK_TABREF
      TOK_TABNAME
        sparkbug
  TOK_INSERT
    TOK_DESTINATION
      TOK_DIR
        TOK_TMP_FILE
    TOK_SELECT
      TOK_SELEXPR
        TOK_TABLE_OR_COL
          customerid
    TOK_WHERE
      TOK_SUBQUERY_EXPR
        TOK_SUBQUERY_OP
          in
        TOK_QUERY
          TOK_FROM
            TOK_TABREF
              TOK_TABNAME
                sparkbug
          TOK_INSERT
            TOK_DESTINATION
              TOK_DIR
                TOK_TMP_FILE
            TOK_SELECT
              TOK_SELEXPR
                TOK_TABLE_OR_COL
                  customerid
            TOK_WHERE
              TOK_FUNCTION
                in
                TOK_TABLE_OR_COL
                  customerid
                2
                3
        TOK_TABLE_OR_COL
          customerid

scala.NotImplementedError: No parse rules for ASTNode type: 817, text: TOK_SUBQUERY_EXPR :
TOK_SUBQUERY_EXPR
  TOK_SUBQUERY_OP
    in
  TOK_QUERY
    TOK_FROM
      TOK_TABREF
        TOK_TABNAME
          sparkbug
    TOK_INSERT
      TOK_DESTINATION
        TOK_DIR
          TOK_TMP_FILE
      TOK_SELECT
        TOK_SELEXPR
          TOK_TABLE_OR_COL
            customerid
      TOK_WHERE
        TOK_FUNCTION
          in
          TOK_TABLE_OR_COL
            customerid
          2
          3
  TOK_TABLE_OR_COL
    customerid
" +
         
org.apache.spark.sql.hive.HiveQl$.nodeToExpr(HiveQl.scala:1098)
        
        at scala.sys.package$.error(package.scala:27)
        at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:252)
        at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:50)
        at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:49)
        at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
{noformat}
[This thread|http://apache-spark-user-list.1001560.n3.nabble.com/Subquery-in-having-clause-Spark-1-1-0-td17401.html] also brings up the lack of subquery support in SparkSQL. It would be nice to have subquery predicate support in a near-future release (1.3, maybe?).
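Until subquery predicates are parsed, a possible workaround (a sketch only, not verified against this table) is to express the IN predicate as a LEFT SEMI JOIN over a derived table, which the HiveQl translation in Spark SQL does handle:
{code:sql}
SELECT s.customerid
FROM sparkbug s
LEFT SEMI JOIN (
  SELECT customerid FROM sparkbug WHERE customerid IN (2, 3)
) t ON s.customerid = t.customerid;
{code}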

> SparkSQL - Add support for subqueries in predicates
> ---------------------------------------------------
>
>                 Key: SPARK-4226
>                 URL: https://issues.apache.org/jira/browse/SPARK-4226
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.2.0
>         Environment: Spark 1.2 snapshot
>            Reporter: Terry Siu
>


