[ https://issues.apache.org/jira/browse/SPARK-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Reynold Xin updated SPARK-9703:
-------------------------------
Sprint: Spark 1.5 release
> EnsureRequirements should not add unnecessary shuffles when only ordering
> requirements are unsatisfied
> ------------------------------------------------------------------------------------------------------
>
> Key: SPARK-9703
> URL: https://issues.apache.org/jira/browse/SPARK-9703
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.3.0, 1.4.0, 1.5.0
> Reporter: Josh Rosen
> Assignee: Josh Rosen
>
> Consider SortMergeJoin (SMJ), which requires a sorted, clustered distribution
> of its input rows. Suppose both of SMJ's children produce unsorted output but
> each has only a single partition. In this case, we need to inject sort
> operators but should not need to inject exchanges. Unfortunately, it looks
> like Exchange unnecessarily repartitions the children using hash partitioning.
> We should update Exchange so that it does not repartition children when only
> the ordering requirements are unsatisfied.
> I'd like to fix this for Spark 1.5 since it makes certain types of unit tests
> easier to write.
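
The intended behavior can be sketched with a toy model. This is not Spark's planner code: `Plan`, `ensure_requirements`, and the other names below are hypothetical, and the sketch ignores that a real join also needs both children to have compatible partition counts. It only illustrates checking the distribution and ordering requirements independently, so that a single-partition child gets a sort but no exchange.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical stand-in for a physical plan node (not a Spark class)."""
    name: str
    num_partitions: int
    clustered_on: Optional[Tuple[str, ...]]  # hash-partitioning keys, if known
    sorted_on: Tuple[str, ...] = ()          # per-partition sort keys
    child: Optional["Plan"] = None


def satisfies_clustering(plan: Plan, keys: Tuple[str, ...]) -> bool:
    # A single partition trivially co-locates all rows that share a key.
    return plan.num_partitions == 1 or plan.clustered_on == keys


def satisfies_ordering(plan: Plan, keys: Tuple[str, ...]) -> bool:
    # The required keys must be a prefix of the existing sort order.
    return plan.sorted_on[:len(keys)] == keys


def ensure_requirements(child: Plan, keys: Tuple[str, ...],
                        num_shuffle_partitions: int = 200) -> Plan:
    # Shuffle only when the distribution requirement is unsatisfied.
    if not satisfies_clustering(child, keys):
        child = Plan("Exchange", num_shuffle_partitions, keys, (), child)
    # Sort only when the ordering requirement is unsatisfied.
    if not satisfies_ordering(child, keys):
        child = Plan("Sort", child.num_partitions, child.clustered_on,
                     keys, child)
    return child


def operators(plan: Optional[Plan]) -> list:
    # List operator names from the top of the plan down to the leaf.
    names = []
    while plan is not None:
        names.append(plan.name)
        plan = plan.child
    return names


single = Plan("Scan", 1, None)  # unsorted, single partition
multi = Plan("Scan", 4, None)   # unsorted, unknown partitioning

print(operators(ensure_requirements(single, ("k",))))  # ['Sort', 'Scan']
print(operators(ensure_requirements(multi, ("k",))))   # ['Sort', 'Exchange', 'Scan']
```

In this sketch the single-partition child receives only a sort, which is the case described in the issue; the multi-partition child still receives an exchange followed by a sort.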