[jira] [Resolved] (SPARK-46240) Add ExecutedPlanPrepRules to SparkSessionExtensions

2023-12-11 Thread jiang13021 (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiang13021 resolved SPARK-46240.

Resolution: Won't Do

As discussed in [https://github.com/apache/spark/pull/44254], a columnar rule is enough for this case.
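
For reference, a minimal sketch of that approach, assuming a hypothetical MyPrepRule: a ColumnarRule registered through SparkSessionExtensions.injectColumnar is applied by ApplyColumnarRulesAndInsertTransitions (see the preparations list quoted below), so its preColumnarTransitions hook can run a custom Rule[SparkPlan] while the executedPlan is prepared, even when AQE is not used.
{code:java}
import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.{ColumnarRule, SparkPlan}

// Hypothetical user rule: rewrite the physical plan here (e.g. push extra
// filters into an external data source scan).
object MyPrepRule extends Rule[SparkPlan] {
  override def apply(plan: SparkPlan): SparkPlan = plan
}

// Identity rule for the hook we do not need.
object NoopRule extends Rule[SparkPlan] {
  override def apply(plan: SparkPlan): SparkPlan = plan
}

class MyExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectColumnar(_ => new ColumnarRule {
      // Applied before columnar transitions are inserted, i.e. during
      // executedPlan preparation, with or without AQE.
      override def preColumnarTransitions: Rule[SparkPlan] = MyPrepRule
      override def postColumnarTransitions: Rule[SparkPlan] = NoopRule
    })
  }
}
{code}
The extension can then be registered via spark.sql.extensions or SparkSession.builder().withExtensions(new MyExtensions).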

> Add ExecutedPlanPrepRules to SparkSessionExtensions
> ---
>
> Key: SPARK-46240
> URL: https://issues.apache.org/jira/browse/SPARK-46240
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0, 3.3.0, 3.4.0
>Reporter: jiang13021
>Priority: Major
>  Labels: pull-request-available
>
> Some rules (Rule[SparkPlan]) are applied while preparing the executedPlan.
> However, users do not have the ability to add their own rules at this stage.
> {code:java}
> // org.apache.spark.sql.execution.QueryExecution#preparations
> private[execution] def preparations(
>     sparkSession: SparkSession,
>     adaptiveExecutionRule: Option[InsertAdaptiveSparkPlan] = None,
>     subquery: Boolean): Seq[Rule[SparkPlan]] = {
>   // `AdaptiveSparkPlanExec` is a leaf node. If inserted, all the following rules will be no-op
>   // as the original plan is hidden behind `AdaptiveSparkPlanExec`.
>   adaptiveExecutionRule.toSeq ++
>   Seq(
>     CoalesceBucketsInJoin,
>     PlanDynamicPruningFilters(sparkSession),
>     PlanSubqueries(sparkSession),
>     RemoveRedundantProjects,
>     EnsureRequirements(),
>     // `ReplaceHashWithSortAgg` needs to be added after `EnsureRequirements` to guarantee the
>     // sort order of each node is checked to be valid.
>     ReplaceHashWithSortAgg,
>     // `RemoveRedundantSorts` needs to be added after `EnsureRequirements` to guarantee the same
>     // number of partitions when instantiating PartitioningCollection.
>     RemoveRedundantSorts,
>     DisableUnnecessaryBucketedScan,
>     ApplyColumnarRulesAndInsertTransitions(
>       sparkSession.sessionState.columnarRules, outputsColumnar = false),
>     CollapseCodegenStages()) ++
>   (if (subquery) {
>     Nil
>   } else {
>     Seq(ReuseExchangeAndSubquery)
>   })
> }{code}
> We need the ability to add Rule[SparkPlan]s at this point because, currently,
> such rules can only be injected through AQE, which requires AQE to be enabled
> and the plan to meet the requirements for entering AdaptiveSparkPlanExec. This
> makes it difficult to implement certain extensions for simple SQL statements.
> For example, adding new data source filters for external data sources is
> challenging: modifying DataSourceStrategy directly makes it hard to stay in
> sync with future changes in the community, and customizing the Strategy makes
> it difficult to add functionality incrementally. AQE rules, in turn, do not
> take effect for the simplest 'SELECT * FROM ... WHERE ...' statements.
> Therefore, a customizable Rule[SparkPlan] is needed between sparkPlan and
> executedPlan.
> We could add an extension point called "ExecutedPlanPrepRule" to
> SparkSessionExtensions, which would allow users to add their own rules.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46240) Add ExecutedPlanPrepRules to SparkSessionExtensions

2023-12-05 Thread jiang13021 (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiang13021 updated SPARK-46240:
---
Description: 
Some rules (Rule[SparkPlan]) are applied while preparing the executedPlan.
However, users do not have the ability to add their own rules at this stage.
{code:java}
// org.apache.spark.sql.execution.QueryExecution#preparations
private[execution] def preparations(
    sparkSession: SparkSession,
    adaptiveExecutionRule: Option[InsertAdaptiveSparkPlan] = None,
    subquery: Boolean): Seq[Rule[SparkPlan]] = {
  // `AdaptiveSparkPlanExec` is a leaf node. If inserted, all the following rules will be no-op
  // as the original plan is hidden behind `AdaptiveSparkPlanExec`.
  adaptiveExecutionRule.toSeq ++
  Seq(
    CoalesceBucketsInJoin,
    PlanDynamicPruningFilters(sparkSession),
    PlanSubqueries(sparkSession),
    RemoveRedundantProjects,
    EnsureRequirements(),
    // `ReplaceHashWithSortAgg` needs to be added after `EnsureRequirements` to guarantee the
    // sort order of each node is checked to be valid.
    ReplaceHashWithSortAgg,
    // `RemoveRedundantSorts` needs to be added after `EnsureRequirements` to guarantee the same
    // number of partitions when instantiating PartitioningCollection.
    RemoveRedundantSorts,
    DisableUnnecessaryBucketedScan,
    ApplyColumnarRulesAndInsertTransitions(
      sparkSession.sessionState.columnarRules, outputsColumnar = false),
    CollapseCodegenStages()) ++
  (if (subquery) {
    Nil
  } else {
    Seq(ReuseExchangeAndSubquery)
  })
}{code}
We need the ability to add Rule[SparkPlan]s at this point because, currently,
such rules can only be injected through AQE, which requires AQE to be enabled
and the plan to meet the requirements for entering AdaptiveSparkPlanExec. This
makes it difficult to implement certain extensions for simple SQL statements.

For example, adding new data source filters for external data sources is
challenging: modifying DataSourceStrategy directly makes it hard to stay in
sync with future changes in the community, and customizing the Strategy makes
it difficult to add functionality incrementally. AQE rules, in turn, do not
take effect for the simplest 'SELECT * FROM ... WHERE ...' statements.
Therefore, a customizable Rule[SparkPlan] is needed between sparkPlan and
executedPlan.

We could add an extension point called "ExecutedPlanPrepRule" to
SparkSessionExtensions, which would allow users to add their own rules.
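
For illustration, a rough sketch of what the proposed extension point might look like. This is a hypothetical API: injectExecutedPlanPrepRule does not exist in SparkSessionExtensions today, and the rule body is only a placeholder.
{code:java}
import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.SparkPlan

// Hypothetical rule that would run between sparkPlan and executedPlan.
case class MyExecutedPlanPrepRule(session: SparkSession) extends Rule[SparkPlan] {
  override def apply(plan: SparkPlan): SparkPlan = plan  // rewrite the plan here
}

class MyExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    // Proposed (non-existent) extension point, mirroring the other inject* methods:
    // extensions.injectExecutedPlanPrepRule(session => MyExecutedPlanPrepRule(session))
  }
}
{code}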

  was:
Some rules (Rule[SparkPlan]) are applied while preparing the executedPlan.
However, users do not have the ability to add their own rules at this stage.
{code:java}
// org.apache.spark.sql.execution.QueryExecution#preparations
private[execution] def preparations(
    sparkSession: SparkSession,
    adaptiveExecutionRule: Option[InsertAdaptiveSparkPlan] = None,
    subquery: Boolean): Seq[Rule[SparkPlan]] = {
  // `AdaptiveSparkPlanExec` is a leaf node. If inserted, all the following rules will be no-op
  // as the original plan is hidden behind `AdaptiveSparkPlanExec`.
  adaptiveExecutionRule.toSeq ++
  Seq(
    CoalesceBucketsInJoin,
    PlanDynamicPruningFilters(sparkSession),
    PlanSubqueries(sparkSession),
    RemoveRedundantProjects,
    EnsureRequirements(),
    // `ReplaceHashWithSortAgg` needs to be added after `EnsureRequirements` to guarantee the
    // sort order of each node is checked to be valid.
    ReplaceHashWithSortAgg,
    // `RemoveRedundantSorts` needs to be added after `EnsureRequirements` to guarantee the same
    // number of partitions when instantiating PartitioningCollection.
    RemoveRedundantSorts,
    DisableUnnecessaryBucketedScan,
    ApplyColumnarRulesAndInsertTransitions(
      sparkSession.sessionState.columnarRules, outputsColumnar = false),
    CollapseCodegenStages()) ++
  (if (subquery) {
    Nil
  } else {
    Seq(ReuseExchangeAndSubquery)
  })
}{code}
We could add an extension point called "PrepExecutedPlanRule" to
SparkSessionExtensions, which would allow users to add their own rules.

Summary: Add ExecutedPlanPrepRules to SparkSessionExtensions  (was: Add 
PrepExecutedPlanRule to SparkSessionExtensions)

> Add ExecutedPlanPrepRules to SparkSessionExtensions
> ---
>
> Key: SPARK-46240
> URL: https://issues.apache.org/jira/browse/SPARK-46240
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0, 3.3.0, 3.4.0
>Reporter: jiang13021
>Priority: Major
>
> Some rules (Rule[SparkPlan]) are applied while preparing the executedPlan.
> However, users do not have the ability to add their own rules at this stage.
> {code:java}
> // org.apache.spark.sql.execution.QueryExecution#preparations  
> private[execution] def preparations(
> 

[jira] [Created] (SPARK-46240) Add PrepExecutedPlanRule to SparkSessionExtensions

2023-12-04 Thread jiang13021 (Jira)
jiang13021 created SPARK-46240:
--

 Summary: Add PrepExecutedPlanRule to SparkSessionExtensions
 Key: SPARK-46240
 URL: https://issues.apache.org/jira/browse/SPARK-46240
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.4.0, 3.3.0, 3.2.0
Reporter: jiang13021


Some rules (Rule[SparkPlan]) are applied while preparing the executedPlan.
However, users do not have the ability to add their own rules at this stage.
{code:java}
// org.apache.spark.sql.execution.QueryExecution#preparations
private[execution] def preparations(
    sparkSession: SparkSession,
    adaptiveExecutionRule: Option[InsertAdaptiveSparkPlan] = None,
    subquery: Boolean): Seq[Rule[SparkPlan]] = {
  // `AdaptiveSparkPlanExec` is a leaf node. If inserted, all the following rules will be no-op
  // as the original plan is hidden behind `AdaptiveSparkPlanExec`.
  adaptiveExecutionRule.toSeq ++
  Seq(
    CoalesceBucketsInJoin,
    PlanDynamicPruningFilters(sparkSession),
    PlanSubqueries(sparkSession),
    RemoveRedundantProjects,
    EnsureRequirements(),
    // `ReplaceHashWithSortAgg` needs to be added after `EnsureRequirements` to guarantee the
    // sort order of each node is checked to be valid.
    ReplaceHashWithSortAgg,
    // `RemoveRedundantSorts` needs to be added after `EnsureRequirements` to guarantee the same
    // number of partitions when instantiating PartitioningCollection.
    RemoveRedundantSorts,
    DisableUnnecessaryBucketedScan,
    ApplyColumnarRulesAndInsertTransitions(
      sparkSession.sessionState.columnarRules, outputsColumnar = false),
    CollapseCodegenStages()) ++
  (if (subquery) {
    Nil
  } else {
    Seq(ReuseExchangeAndSubquery)
  })
}{code}
We could add an extension point called "PrepExecutedPlanRule" to
SparkSessionExtensions, which would allow users to add their own rules.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46240) Add PrepExecutedPlanRule to SparkSessionExtensions

2023-12-04 Thread jiang13021 (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiang13021 updated SPARK-46240:
---
Description: 
Some rules (Rule[SparkPlan]) are applied while preparing the executedPlan.
However, users do not have the ability to add their own rules at this stage.
{code:java}
// org.apache.spark.sql.execution.QueryExecution#preparations
private[execution] def preparations(
    sparkSession: SparkSession,
    adaptiveExecutionRule: Option[InsertAdaptiveSparkPlan] = None,
    subquery: Boolean): Seq[Rule[SparkPlan]] = {
  // `AdaptiveSparkPlanExec` is a leaf node. If inserted, all the following rules will be no-op
  // as the original plan is hidden behind `AdaptiveSparkPlanExec`.
  adaptiveExecutionRule.toSeq ++
  Seq(
    CoalesceBucketsInJoin,
    PlanDynamicPruningFilters(sparkSession),
    PlanSubqueries(sparkSession),
    RemoveRedundantProjects,
    EnsureRequirements(),
    // `ReplaceHashWithSortAgg` needs to be added after `EnsureRequirements` to guarantee the
    // sort order of each node is checked to be valid.
    ReplaceHashWithSortAgg,
    // `RemoveRedundantSorts` needs to be added after `EnsureRequirements` to guarantee the same
    // number of partitions when instantiating PartitioningCollection.
    RemoveRedundantSorts,
    DisableUnnecessaryBucketedScan,
    ApplyColumnarRulesAndInsertTransitions(
      sparkSession.sessionState.columnarRules, outputsColumnar = false),
    CollapseCodegenStages()) ++
  (if (subquery) {
    Nil
  } else {
    Seq(ReuseExchangeAndSubquery)
  })
}{code}
We could add an extension point called "PrepExecutedPlanRule" to
SparkSessionExtensions, which would allow users to add their own rules.

  was:
Some rules (Rule[SparkPlan]) are applied while preparing the executedPlan.
However, users do not have the ability to add their own rules at this stage.
{code:java}
// org.apache.spark.sql.execution.QueryExecution#preparations
private[execution] def preparations(
    sparkSession: SparkSession,
    adaptiveExecutionRule: Option[InsertAdaptiveSparkPlan] = None,
    subquery: Boolean): Seq[Rule[SparkPlan]] = {
  // `AdaptiveSparkPlanExec` is a leaf node. If inserted, all the following rules will be no-op
  // as the original plan is hidden behind `AdaptiveSparkPlanExec`.
  adaptiveExecutionRule.toSeq ++
  Seq(
    CoalesceBucketsInJoin,
    PlanDynamicPruningFilters(sparkSession),
    PlanSubqueries(sparkSession),
    RemoveRedundantProjects,
    EnsureRequirements(),
    // `ReplaceHashWithSortAgg` needs to be added after `EnsureRequirements` to guarantee the
    // sort order of each node is checked to be valid.
    ReplaceHashWithSortAgg,
    // `RemoveRedundantSorts` needs to be added after `EnsureRequirements` to guarantee the same
    // number of partitions when instantiating PartitioningCollection.
    RemoveRedundantSorts,
    DisableUnnecessaryBucketedScan,
    ApplyColumnarRulesAndInsertTransitions(
      sparkSession.sessionState.columnarRules, outputsColumnar = false),
    CollapseCodegenStages()) ++
  (if (subquery) {
    Nil
  } else {
    Seq(ReuseExchangeAndSubquery)
  })
}{code}
We could add an extension point called "PrepExecutedPlanRule" to
SparkSessionExtensions, which would allow users to add their own rules.


> Add PrepExecutedPlanRule to SparkSessionExtensions
> --
>
> Key: SPARK-46240
> URL: https://issues.apache.org/jira/browse/SPARK-46240
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0, 3.3.0, 3.4.0
>Reporter: jiang13021
>Priority: Major
>
> Some rules (Rule[SparkPlan]) are applied while preparing the executedPlan.
> However, users do not have the ability to add their own rules at this stage.
> {code:java}
> // org.apache.spark.sql.execution.QueryExecution#preparations
> private[execution] def preparations(
>     sparkSession: SparkSession,
>     adaptiveExecutionRule: Option[InsertAdaptiveSparkPlan] = None,
>     subquery: Boolean): Seq[Rule[SparkPlan]] = {
>   // `AdaptiveSparkPlanExec` is a leaf node. If inserted, all the following rules will be no-op
>   // as the original plan is hidden behind `AdaptiveSparkPlanExec`.
>   adaptiveExecutionRule.toSeq ++
>   Seq(
>     CoalesceBucketsInJoin,
>     PlanDynamicPruningFilters(sparkSession),
>     PlanSubqueries(sparkSession),
>     RemoveRedundantProjects,
>     EnsureRequirements(),
>     // `ReplaceHashWithSortAgg` needs to be added after `EnsureRequirements` to guarantee the
>     // sort order of each node is checked to be valid.
>     ReplaceHashWithSortAgg,
>     // `RemoveRedundantSorts` needs to be added after `EnsureRequirements` to guarantee the same
>     // number of partitions when instantiating PartitioningCollection.
>     RemoveRedundantSorts,
>     DisableUnnecessaryBucketedScan,
> 

[jira] [Created] (SPARK-43218) Support "ESCAPE BY" in SparkScriptTransformationExec

2023-04-20 Thread jiang13021 (Jira)
jiang13021 created SPARK-43218:
--

 Summary: Support "ESCAPE BY" in SparkScriptTransformationExec
 Key: SPARK-43218
 URL: https://issues.apache.org/jira/browse/SPARK-43218
 Project: Spark
  Issue Type: Wish
  Components: SQL
Affects Versions: 3.4.0, 3.3.0, 3.2.0
Reporter: jiang13021


If I don't set `spark.sql.catalogImplementation=hive`, I can't use "SELECT
TRANSFORM" with "ESCAPE BY". Although HiveScriptTransform also doesn't
implement ESCAPE BY, I can use a row format SerDe to achieve this.

In fact, HiveScriptTransform doesn't need to connect to the Hive Metastore: I
can use reflection to forcibly invoke HiveScriptTransformationExec without a
Metastore connection, and it works properly. Maybe HiveScriptTransform could be
made more generic.
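
For context, this is roughly the kind of query involved. A sketch only: the table and column names are placeholders, and without spark.sql.catalogImplementation=hive it fails because SparkScriptTransformationExec does not support the escape clause.
{code:java}
// Placeholder table/columns; 'cat' simply echoes its input lines.
spark.sql("""
  SELECT TRANSFORM(id, name)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ESCAPED BY '\\'
    USING 'cat'
    AS (id STRING, name STRING)
  FROM mytable
""")
{code}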



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42552) Get ParseException when run sql: "SELECT 1 UNION SELECT 1;"

2023-03-03 Thread jiang13021 (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696395#comment-17696395
 ] 

jiang13021 commented on SPARK-42552:


The problem may be in this location: 
[https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala#L126]

When the `PredictionMode` is `SLL`, `AstBuilder` throws `ParseException` instead
of `ParseCancellationException`, so the parser never retries in `LL` mode. In
fact, if we use `LL` mode, the SQL can be parsed correctly.
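
For reference, a minimal, simplified sketch of the usual ANTLR two-stage pattern (the helper name here is illustrative; in ParseDriver the equivalent logic is inlined). The LL retry only happens when a ParseCancellationException escapes the SLL attempt, which is why an eagerly thrown ParseException prevents the fallback:
{code:java}
import org.antlr.v4.runtime.{CommonTokenStream, Parser}
import org.antlr.v4.runtime.atn.PredictionMode
import org.antlr.v4.runtime.misc.ParseCancellationException

// Try the fast SLL prediction mode first and fall back to full LL only if a
// ParseCancellationException is thrown during the first attempt.
def twoStageParse[P <: Parser, T](parser: P, tokenStream: CommonTokenStream)(toResult: P => T): T = {
  parser.getInterpreter.setPredictionMode(PredictionMode.SLL)
  try {
    toResult(parser)
  } catch {
    case _: ParseCancellationException =>
      tokenStream.seek(0)  // rewind the input
      parser.reset()
      parser.getInterpreter.setPredictionMode(PredictionMode.LL)
      toResult(parser)
  }
}
{code}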

> Get ParseException when run sql: "SELECT 1 UNION SELECT 1;"
> ---
>
> Key: SPARK-42552
> URL: https://issues.apache.org/jira/browse/SPARK-42552
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.3
> Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 
> 1.8.0_345)
> Spark version 3.2.3-SNAPSHOT
>Reporter: jiang13021
>Priority: Major
> Fix For: 3.2.3
>
>
> When I run sql
> {code:java}
> scala> spark.sql("SELECT 1 UNION SELECT 1;") {code}
> I get ParseException:
> {code:java}
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input 'SELECT' expecting {<EOF>, ';'}(line 1, pos 15)== SQL ==
> SELECT 1 UNION SELECT 1;
> ---^^^  at 
> org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:266)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:127)
>   at 
> org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:51)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:77)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:616)
>   at 
> org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:616)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
>   ... 47 elided
>  {code}
> If I run it with parentheses, it works well:
> {code:java}
> scala> spark.sql("(SELECT 1) UNION (SELECT 1);") 
> res4: org.apache.spark.sql.DataFrame = [1: int]{code}
> This should be a bug
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42552) Get ParseException when run sql: "SELECT 1 UNION SELECT 1;"

2023-03-03 Thread jiang13021 (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiang13021 updated SPARK-42552:
---
Priority: Major  (was: Minor)

> Get ParseException when run sql: "SELECT 1 UNION SELECT 1;"
> ---
>
> Key: SPARK-42552
> URL: https://issues.apache.org/jira/browse/SPARK-42552
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.3
> Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 
> 1.8.0_345)
> Spark version 3.2.3-SNAPSHOT
>Reporter: jiang13021
>Priority: Major
> Fix For: 3.2.3
>
>
> When I run sql
> {code:java}
> scala> spark.sql("SELECT 1 UNION SELECT 1;") {code}
> I get ParseException:
> {code:java}
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input 'SELECT' expecting {<EOF>, ';'}(line 1, pos 15)== SQL ==
> SELECT 1 UNION SELECT 1;
> ---^^^  at 
> org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:266)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:127)
>   at 
> org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:51)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:77)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:616)
>   at 
> org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:616)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
>   ... 47 elided
>  {code}
> If I run it with parentheses, it works well:
> {code:java}
> scala> spark.sql("(SELECT 1) UNION (SELECT 1);") 
> res4: org.apache.spark.sql.DataFrame = [1: int]{code}
> This should be a bug
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42553) NonReserved keyword "interval" can't be column name

2023-02-24 Thread jiang13021 (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiang13021 updated SPARK-42553:
---
Affects Version/s: 3.3.2
   3.3.1
   3.3.0

> NonReserved keyword "interval" can't be column name
> ---
>
> Key: SPARK-42553
> URL: https://issues.apache.org/jira/browse/SPARK-42553
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.3.1, 3.2.3, 3.3.2
> Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 
> 1.8.0_345)
> Spark version 3.2.3-SNAPSHOT
>Reporter: jiang13021
>Priority: Major
>
> INTERVAL is a non-reserved keyword in Spark. Non-reserved keywords have a
> special meaning in particular contexts and can be used as identifiers in
> other contexts. So by design, interval can be used as a column name.
> {code:java}
> scala> spark.sql("select interval from mytable")
> org.apache.spark.sql.catalyst.parser.ParseException:
> at least one time unit should be given for interval literal(line 1, pos 7)== 
> SQL ==
> select interval from mytable
> ---^^^  at 
> org.apache.spark.sql.errors.QueryParsingErrors$.invalidIntervalLiteralError(QueryParsingErrors.scala:196)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$parseIntervalLiteral$1(AstBuilder.scala:2481)
>   at 
> org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.parseIntervalLiteral(AstBuilder.scala:2466)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitInterval$1(AstBuilder.scala:2432)
>   at 
> org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:2431)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:57)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalContext.accept(SqlBaseParser.java:17308)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitIntervalLiteral(SqlBaseBaseVisitor.java:1581)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalLiteralContext.accept(SqlBaseParser.java:16929)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitConstantDefault(SqlBaseBaseVisitor.java:1511)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseParser$ConstantDefaultContext.accept(SqlBaseParser.java:15905)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitValueExpressionDefault(SqlBaseBaseVisitor.java:1392)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseParser$ValueExpressionDefaultContext.accept(SqlBaseParser.java:15298)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitPredicated$1(AstBuilder.scala:1548)
>   at 
> org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:1547)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:57)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseParser$PredicatedContext.accept(SqlBaseParser.java:14745)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitExpression(SqlBaseBaseVisitor.java:1343)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseParser$ExpressionContext.accept(SqlBaseParser.java:14606)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitNamedExpression$1(AstBuilder.scala:1434)
>   at 
> org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:1433)
>   at 
> org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:57)
>   at 
> org.apache.spark.sql.catalyst.parser.SqlBaseParser$NamedExpressionContext.accept(SqlBaseParser.java:14124)
>   at 
> 

[jira] [Created] (SPARK-42553) NonReserved keyword "interval" can't be column name

2023-02-24 Thread jiang13021 (Jira)
jiang13021 created SPARK-42553:
--

 Summary: NonReserved keyword "interval" can't be column name
 Key: SPARK-42553
 URL: https://issues.apache.org/jira/browse/SPARK-42553
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.2.3
 Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 
1.8.0_345)

Spark version 3.2.3-SNAPSHOT
Reporter: jiang13021


INTERVAL is a non-reserved keyword in Spark. Non-reserved keywords have a
special meaning in particular contexts and can be used as identifiers in other
contexts. So by design, interval can be used as a column name.
{code:java}
scala> spark.sql("select interval from mytable")
org.apache.spark.sql.catalyst.parser.ParseException:
at least one time unit should be given for interval literal(line 1, pos 7)== 
SQL ==
select interval from mytable
---^^^  at 
org.apache.spark.sql.errors.QueryParsingErrors$.invalidIntervalLiteralError(QueryParsingErrors.scala:196)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$parseIntervalLiteral$1(AstBuilder.scala:2481)
  at 
org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.parseIntervalLiteral(AstBuilder.scala:2466)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitInterval$1(AstBuilder.scala:2432)
  at 
org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:2431)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:57)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalContext.accept(SqlBaseParser.java:17308)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitIntervalLiteral(SqlBaseBaseVisitor.java:1581)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalLiteralContext.accept(SqlBaseParser.java:16929)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitConstantDefault(SqlBaseBaseVisitor.java:1511)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$ConstantDefaultContext.accept(SqlBaseParser.java:15905)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitValueExpressionDefault(SqlBaseBaseVisitor.java:1392)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$ValueExpressionDefaultContext.accept(SqlBaseParser.java:15298)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitPredicated$1(AstBuilder.scala:1548)
  at 
org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:1547)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:57)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$PredicatedContext.accept(SqlBaseParser.java:14745)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitExpression(SqlBaseBaseVisitor.java:1343)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$ExpressionContext.accept(SqlBaseParser.java:14606)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitNamedExpression$1(AstBuilder.scala:1434)
  at 
org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:1433)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:57)
  at 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$NamedExpressionContext.accept(SqlBaseParser.java:14124)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitNamedExpressionSeq$2(AstBuilder.scala:628)
  at scala.collection.immutable.List.map(List.scala:293)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpressionSeq(AstBuilder.scala:628)
  at 
org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$withSelectQuerySpecification$1(AstBuilder.scala:734)
  at 

[jira] [Updated] (SPARK-42552) Get ParseException when run sql: "SELECT 1 UNION SELECT 1;"

2023-02-24 Thread jiang13021 (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiang13021 updated SPARK-42552:
---
Summary: Get ParseException when run sql: "SELECT 1 UNION SELECT 1;"  (was: 
Got ParseException when run sql: "SELECT 1 UNION SELECT 1;")

> Get ParseException when run sql: "SELECT 1 UNION SELECT 1;"
> ---
>
> Key: SPARK-42552
> URL: https://issues.apache.org/jira/browse/SPARK-42552
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.3
> Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 
> 1.8.0_345)
> Spark version 3.2.3-SNAPSHOT
>Reporter: jiang13021
>Priority: Minor
> Fix For: 3.2.3
>
>
> When I run sql
> {code:java}
> scala> spark.sql("SELECT 1 UNION SELECT 1;") {code}
> I get ParseException:
> {code:java}
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input 'SELECT' expecting {<EOF>, ';'}(line 1, pos 15)== SQL ==
> SELECT 1 UNION SELECT 1;
> ---^^^  at 
> org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:266)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:127)
>   at 
> org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:51)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:77)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:616)
>   at 
> org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:616)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
>   ... 47 elided
>  {code}
> If I run it with parentheses, it works well:
> {code:java}
> scala> spark.sql("(SELECT 1) UNION (SELECT 1);") 
> res4: org.apache.spark.sql.DataFrame = [1: int]{code}
> This should be a bug
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42552) Got ParseException when run sql: "SELECT 1 UNION SELECT 1;"

2023-02-24 Thread jiang13021 (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiang13021 updated SPARK-42552:
---
Description: 
When I run sql
{code:java}
scala> spark.sql("SELECT 1 UNION SELECT 1;") {code}
I get ParseException:
{code:java}
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'SELECT' expecting {<EOF>, ';'}(line 1, pos 15)== SQL ==
SELECT 1 UNION SELECT 1;
---^^^  at 
org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:266)
  at 
org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:127)
  at 
org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:51)
  at 
org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:77)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:616)
  at 
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:616)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
  ... 47 elided
 {code}
If I run it with parentheses, it works well:
{code:java}
scala> spark.sql("(SELECT 1) UNION (SELECT 1);") 
res4: org.apache.spark.sql.DataFrame = [1: int]{code}
This should be a bug

 

 

  was:
When I run sql
{code:java}
scala> spark.sql("SELECT 1 UNION SELECT 1;") {code}
I get ParseException:
{code:java}
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'SELECT' expecting {<EOF>, ';'}(line 1, pos 15)== SQL ==
SELECT 1 UNION SELECT 1;
---^^^  at 
org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:266)
  at 
org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:127)
  at 
org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:51)
  at 
org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:77)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:616)
  at 
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:616)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
  ... 47 elided
 {code}
If I run it with parentheses, it works well:

 

 
{code:java}
scala> spark.sql("(SELECT 1) UNION (SELECT 1);") 
res4: org.apache.spark.sql.DataFrame = [1: int]{code}
This should be a bug

 

 


> Got ParseException when run sql: "SELECT 1 UNION SELECT 1;"
> ---
>
> Key: SPARK-42552
> URL: https://issues.apache.org/jira/browse/SPARK-42552
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.3
> Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 
> 1.8.0_345)
> Spark version 3.2.3-SNAPSHOT
>Reporter: jiang13021
>Priority: Minor
> Fix For: 3.2.3
>
>
> When I run sql
> {code:java}
> scala> spark.sql("SELECT 1 UNION SELECT 1;") {code}
> I get ParseException:
> {code:java}
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input 'SELECT' expecting {<EOF>, ';'}(line 1, pos 15)== SQL ==
> SELECT 1 UNION SELECT 1;
> ---^^^  at 
> org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:266)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:127)
>   at 
> org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:51)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:77)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:616)
>   at 
> org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:616)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
>   ... 47 elided
>  {code}
> If I run it with parentheses, it works well:
> {code:java}
> scala> spark.sql("(SELECT 1) UNION (SELECT 1);") 
> res4: org.apache.spark.sql.DataFrame = [1: int]{code}
> This should be a bug
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42552) Got ParseException when run sql: "SELECT 1 UNION SELECT 1;"

2023-02-24 Thread jiang13021 (Jira)
jiang13021 created SPARK-42552:
--

 Summary: Got ParseException when run sql: "SELECT 1 UNION SELECT 
1;"
 Key: SPARK-42552
 URL: https://issues.apache.org/jira/browse/SPARK-42552
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.2.3
 Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 
1.8.0_345)
Spark version 3.2.3-SNAPSHOT
Reporter: jiang13021
 Fix For: 3.2.3


When I run sql
{code:java}
scala> spark.sql("SELECT 1 UNION SELECT 1;") {code}
I get ParseException:
{code:java}
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'SELECT' expecting {<EOF>, ';'}(line 1, pos 15)== SQL ==
SELECT 1 UNION SELECT 1;
---^^^  at 
org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:266)
  at 
org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:127)
  at 
org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:51)
  at 
org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:77)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:616)
  at 
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:616)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
  ... 47 elided
 {code}
If I run it with parentheses, it works well:

 

 
{code:java}
scala> spark.sql("(SELECT 1) UNION (SELECT 1);") 
res4: org.apache.spark.sql.DataFrame = [1: int]{code}
This should be a bug

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org