[jira] [Created] (SPARK-49819) Disable CollapseProject for correlated subqueries in projection over aggregate correctly

2024-09-27 Thread Nick Young (Jira)
Nick Young created SPARK-49819:
--

 Summary: Disable CollapseProject for correlated subqueries in 
projection over aggregate correctly
 Key: SPARK-49819
 URL: https://issues.apache.org/jira/browse/SPARK-49819
 Project: Spark
  Issue Type: Bug
  Components: Optimizer
Affects Versions: 4.0.0
Reporter: Nick Young


CollapseProject should not collapse a projection into an aggregate if any 
correlated subquery is present. The current check only accounts for 
ScalarSubquery; other kinds of correlated subquery are not covered.
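
To illustrate the gap, here is a minimal toy model, not Spark's Catalyst code. 
Every name in it (Expr, ScalarSub, ExistsSub, CollapseGuard) is hypothetical, 
and an EXISTS-style subquery stands in for "some correlated subquery that is 
not a ScalarSubquery"; the issue does not say which kinds are affected.

```scala
// Toy expression tree. These classes are hypothetical stand-ins,
// not Spark's Catalyst expressions.
sealed trait Expr {
  def children: Seq[Expr] = Nil
  // True if this node or any descendant satisfies the predicate.
  def exists(p: Expr => Boolean): Boolean =
    p(this) || children.exists(_.exists(p))
}
case class Column(name: String) extends Expr
case class Add(left: Expr, right: Expr) extends Expr {
  override def children: Seq[Expr] = Seq(left, right)
}
// `correlated` marks a subquery that references the outer query.
case class ScalarSub(correlated: Boolean) extends Expr
case class ExistsSub(correlated: Boolean) extends Expr

object CollapseGuard {
  // Narrow check: only notices correlated scalar subqueries.
  def hasCorrelatedScalarSubquery(e: Expr): Boolean = e.exists {
    case ScalarSub(true) => true
    case _               => false
  }

  // Broad check: notices any correlated subquery in this toy model.
  def hasAnyCorrelatedSubquery(e: Expr): Boolean = e.exists {
    case ScalarSub(true) | ExistsSub(true) => true
    case _                                 => false
  }
}
```

For `Add(Column("a"), ExistsSub(correlated = true))`, the narrow check returns 
false while the broad check returns true, which is the kind of case the issue 
asks the rule to block.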



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-49699) Disable unsuitable Optimizer rules for streaming and side-effect subplans

2024-09-18 Thread Nick Young (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-49699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882771#comment-17882771
 ] 

Nick Young commented on SPARK-49699:


Assign to [~n-young-db] 

> Disable unsuitable Optimizer rules for streaming and side-effect subplans
> -
>
> Key: SPARK-49699
> URL: https://issues.apache.org/jira/browse/SPARK-49699
> Project: Spark
>  Issue Type: Story
>  Components: Optimizer
>Affects Versions: 4.0.0
>Reporter: Nick Young
>Priority: Major
>
> Various optimizer rules are unsuitable for the streaming or side-effect 
> settings. Disable them, and roll out the change carefully so as not to 
> break existing queries.






[jira] [Created] (SPARK-49699) Disable unsuitable Optimizer rules for streaming and side-effect subplans

2024-09-18 Thread Nick Young (Jira)
Nick Young created SPARK-49699:
--

 Summary: Disable unsuitable Optimizer rules for streaming and 
side-effect subplans
 Key: SPARK-49699
 URL: https://issues.apache.org/jira/browse/SPARK-49699
 Project: Spark
  Issue Type: Story
  Components: Optimizer
Affects Versions: 4.0.0
Reporter: Nick Young


Various optimizer rules are unsuitable for the streaming or side-effect 
settings. Disable them, and roll out the change carefully so as not to break 
existing queries.






[jira] [Created] (SPARK-48915) Add inequality (!=, <, <=, >, >=) predicates for correlation in GeneratedSubquerySuite

2024-07-16 Thread Nick Young (Jira)
Nick Young created SPARK-48915:
--

 Summary: Add inequality (!=, <, <=, >, >=) predicates for 
correlation in GeneratedSubquerySuite
 Key: SPARK-48915
 URL: https://issues.apache.org/jira/browse/SPARK-48915
 Project: Spark
  Issue Type: Improvement
  Components: Optimizer
Affects Versions: 4.0.0
Reporter: Nick Young


{{GeneratedSubquerySuite}} is a test suite that generates SQL with variations 
of subqueries. Currently, the supported operators are Joins, Set Operations, 
Aggregate (with/without group by) and Limit. Adding inequality (!=, <, <=, >, 
>=) predicates for correlation would extend coverage along one additional 
axis, and should be simple.
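
As a rough sketch of what the extra axis could look like, the snippet below 
builds one correlated-subquery query string per inequality operator. It is not 
taken from GeneratedSubquerySuite; the object name and the table and column 
names (outer_t, inner_t, a, b) are hypothetical.

```scala
object InequalityPredicates {
  // The inequality operators named in the issue summary.
  val operators: Seq[String] = Seq("!=", "<", "<=", ">", ">=")

  // One query per operator, each correlating the subquery to the outer
  // table through that operator. Table and column names are hypothetical.
  def queries: Seq[String] = operators.map { op =>
    s"SELECT a, (SELECT COUNT(*) FROM inner_t WHERE inner_t.b $op outer_t.a) FROM outer_t"
  }
}
```

Calling `InequalityPredicates.queries.foreach(println)` prints the five 
variants, which a generator could then combine with the existing operator 
axes.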






[jira] [Created] (SPARK-48656) ArrayIndexOutOfBoundsException in CartesianRDD getPartitions

2024-06-18 Thread Nick Young (Jira)
Nick Young created SPARK-48656:
--

 Summary: ArrayIndexOutOfBoundsException in CartesianRDD 
getPartitions
 Key: SPARK-48656
 URL: https://issues.apache.org/jira/browse/SPARK-48656
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Nick Young


```
val rdd1 = spark.sparkContext.parallelize(Seq(1, 2, 3), numSlices = 65536)
val rdd2 = spark.sparkContext.parallelize(Seq(1, 2, 3), numSlices = 65536)
rdd2.cartesian(rdd1).partitions
```

This throws `ArrayIndexOutOfBoundsException: 0` at CartesianRDD.scala:69 
because `s1.index * numPartitionsInRdd2 + s2.index` overflows and wraps to 0. 
We should provide a better error message indicating that the number of 
partitions overflows, so the failure is easier for the user to debug.
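
For reference, 65536 * 65536 = 2^32, which does not fit in a 32-bit Int and 
wraps to exactly 0. The sketch below shows the wrap-around and one possible 
way to detect it with `Math.multiplyExact`. The object, the method names and 
the message wording are hypothetical, not the actual CartesianRDD code or a 
proposed patch.

```scala
object PartitionCountCheck {
  // Plain Int multiplication wraps silently on overflow.
  def wrappedCount(n1: Int, n2: Int): Int = n1 * n2

  // Checked multiplication: Left(message) on overflow, Right(count) otherwise.
  // The message text is illustrative only.
  def checkedCount(n1: Int, n2: Int): Either[String, Int] =
    try Right(Math.multiplyExact(n1, n2))
    catch {
      case _: ArithmeticException =>
        Left(s"Number of partitions overflows Int: $n1 * $n2 = ${n1.toLong * n2.toLong}")
    }
}
```

With these definitions, `PartitionCountCheck.wrappedCount(65536, 65536)` 
evaluates to 0, while `PartitionCountCheck.checkedCount(65536, 65536)` returns 
a `Left` that names both operands and the true product, 4294967296.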






[jira] [Created] (SPARK-46693) Inject LocalLimitExec when matching OffsetAndLimit or LimitAndOffset

2024-01-11 Thread Nick Young (Jira)
Nick Young created SPARK-46693:
--

 Summary: Inject LocalLimitExec when matching OffsetAndLimit or 
LimitAndOffset
 Key: SPARK-46693
 URL: https://issues.apache.org/jira/browse/SPARK-46693
 Project: Spark
  Issue Type: Improvement
  Components: Optimizer
Affects Versions: 3.5.0
Reporter: Nick Young


For queries containing both a LIMIT and an OFFSET in a subquery, physical 
planning mistakenly drops the `LocalLimit` node planned in the optimizer 
stage; this manifests as larger-than-necessary shuffle sizes for 
`GlobalLimitExec`. The fix is to keep this node.
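
To see why the missing local limit inflates the shuffle, here is a toy model 
over in-memory sequences, not Spark's planner or execution code. All names are 
hypothetical, and the per-partition bound of limit + offset is an assumption 
of this sketch: it is the smallest bound that keeps the final result unchanged 
in this model.

```scala
object LocalLimitSketch {
  type Partition = Seq[Int]

  // Rows each partition sends to the single partition that applies the
  // global limit. With a local limit, each partition is truncated first.
  def shuffled(parts: Seq[Partition], localLimit: Option[Int]): Seq[Partition] =
    localLimit.fold(parts)(k => parts.map(_.take(k)))

  // Global step: concatenate, skip `offset` rows, keep `limit` rows.
  def globalLimit(parts: Seq[Partition], limit: Int, offset: Int): Seq[Int] =
    parts.flatten.slice(offset, offset + limit)
}
```

With two partitions of 100 rows each, LIMIT 3 and OFFSET 2, the version 
without a local limit moves all 200 rows, while a local limit of 5 moves 10 
rows; both produce the same three result rows in this model.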


