[
https://issues.apache.org/jira/browse/CALCITE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18090382#comment-18090382
]
Julian Hyde edited comment on CALCITE-7608 at 6/20/26 6:44 PM:
---------------------------------------------------------------
There seems to me to be massive overlap between Correlate, Unnest and the
proposed SelectMany. Is it just a case of people not understanding what we
already have?
(By the way, if we move forward with this, it should be called {{ProjectMany}}.
While LINQ (and linq4j in Calcite) calls it "SelectMany", Calcite avoids the
word "select" in its operators because it means "project" to some people and
"filter" to others. In functional languages this same operator is called
{{flatMap}}.)
In relational algebras, you need to choose your core set of operators, and
whether you include ProjectMany takes you in one of two directions. LINQ has
ProjectMany (which it calls SelectMany) and expresses Project, Filter and Join
in terms of ProjectMany.
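As a rough illustration of that first direction, here is a minimal Java sketch built on {{java.util.stream}}. It is not Calcite or linq4j code, and the class and method names ({{FlatMapAlgebra}}, {{project}}, {{filter}}, {{join}}) are hypothetical. It writes Project, Filter and a nested-loop Join using nothing but {{flatMap}}:

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: Project, Filter and Join written only with flatMap. */
public class FlatMapAlgebra {
  /** Project: each input row yields exactly one output row. */
  public static <T, R> List<R> project(List<T> rows, Function<T, R> f) {
    return rows.stream()
        .<R>flatMap(t -> Stream.of(f.apply(t)))
        .collect(Collectors.toList());
  }

  /** Filter: each input row yields either itself or nothing. */
  public static <T> List<T> filter(List<T> rows, Predicate<T> p) {
    return rows.stream()
        .<T>flatMap(t -> p.test(t) ? Stream.of(t) : Stream.<T>empty())
        .collect(Collectors.toList());
  }

  /** Join: for each left row, scan the right rows and keep matching pairs. */
  public static <L, R, O> List<O> join(List<L> left, List<R> right,
      BiPredicate<L, R> condition, BiFunction<L, R, O> combine) {
    return left.stream()
        .<O>flatMap(l -> right.stream()
            .<O>flatMap(r -> condition.test(l, r)
                ? Stream.of(combine.apply(l, r))
                : Stream.<O>empty()))
        .collect(Collectors.toList());
  }
}
```

The join here is a plain nested loop; the sketch is about expressiveness only and says nothing about how a planner would choose a join algorithm.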
The other direction is what Thomas Neumann calls [dependent
joins|https://www.cs.cmu.edu/~15721-f24/papers/Story_of_Joins.pdf]. Calcite
calls these Correlate; Microsoft SQL Server leans heavily on lateral joins and [CROSS
APPLY|https://15799.courses.cs.cmu.edu/spring2025/papers/19-udfs/p432-ramachandra.pdf].
I recently added {{yieldAll}} to Morel, so the tradeoffs are still on my mind.
I chose to make it desugar to a [dependent join followed by a
project|https://github.com/hydromatic/morel/issues/257]. To understand my
current thinking, I recommend reading that thread.
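To make the "dependent join followed by a project" shape concrete, here is a small Java sketch. All names are hypothetical, and it models neither Calcite's Correlate nor Morel's actual implementation. A flatMap-style operator is computed in two steps: a dependent join that evaluates the inner expression once per outer row and emits (outer, inner) pairs, then an ordinary project over those pairs:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical sketch: ProjectMany expressed as dependent join + project. */
public class DependentJoinSketch {
  /** Dependent join: the inner input is a function of the current outer row. */
  public static <A, B> List<Map.Entry<A, B>> dependentJoin(
      List<A> outer, Function<A, List<B>> inner) {
    List<Map.Entry<A, B>> out = new ArrayList<>();
    for (A a : outer) {
      for (B b : inner.apply(a)) {
        // Each output row keeps the whole outer row next to the inner row.
        out.add(new SimpleImmutableEntry<>(a, b));
      }
    }
    return out;
  }

  /** Project: one output row per joined pair. */
  public static <A, B, R> List<R> project(
      List<Map.Entry<A, B>> pairs, BiFunction<A, B, R> f) {
    List<R> out = new ArrayList<>();
    for (Map.Entry<A, B> p : pairs) {
      out.add(f.apply(p.getKey(), p.getValue()));
    }
    return out;
  }

  /** ProjectMany (flatMap) as the composition of the two operators above. */
  public static <A, B, R> List<R> projectMany(
      List<A> outer, Function<A, List<B>> inner, BiFunction<A, B, R> f) {
    return project(dependentJoin(outer, inner), f);
  }
}
```

Because the dependent join keeps the full outer row in each pair, the final project can retain any outer fields it likes, and an outer row whose inner collection is empty produces no output.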
> Introduce a SelectMany operator
> -------------------------------
>
> Key: CALCITE-7608
> URL: https://issues.apache.org/jira/browse/CALCITE-7608
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.42.0
> Reporter: Mihai Budiu
> Assignee: Mihai Budiu
> Priority: Minor
> Labels: pull-request-available
>
> Today UNNEST is implemented using the Uncollect operator. We propose adding
> an alternative LogicalSelectMany operator, which generalizes Uncollect.
> (Note that the Enumerable API already has a SelectMany.) The main difference
> between Uncollect and SelectMany is that Uncollect unnests all the fields of
> its input relation, whereas LogicalSelectMany would unnest only SOME of the
> fields of its input relation, preserving the other ones in each output row.
> This distinction is very important, because:
> * LogicalSelectMany can be directly and efficiently implemented using the
> Enumerable SelectMany
> * UNNEST used in a cross-join is implemented using an Uncollect and a
> LogicalCorrelate. However, the same UNNEST can be represented using just one
> LogicalSelectMany node
> * Neither the old nor the new decorrelator can actually eliminate
> LogicalCorrelate nodes that are paired with Uncollect. Using
> LogicalSelectMany we can decorrelate many more plans.