[ 
https://issues.apache.org/jira/browse/CALCITE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16822511#comment-16822511
 ] 

Julian Hyde commented on CALCITE-2712:
--------------------------------------

Thanks for the refactorings, and for adding ProjectJoinJoinRemoveRule. I think 
there are still a couple of improvements to make your utility methods more 
general purpose:
* Rather than {{shift}}, could you use {{AggregateCall.transform}}? It handles 
all of the obscure stuff like filter and collation.
*  Could you make the {{canRemoveJoin}} more general purpose. Currently it is 
*almost* general purpose - its name says join but it is really just working in 
terms of field offset. Make them generate an {{ImmutableBitSet}} of fields 
used. Then the caller can decide whether those fields correspond to the left or 
right join inputs. See InputFinder and how is used by various utility methods.

> Add rule to remove null-generating side of a Join
> -------------------------------------------------
>
>                 Key: CALCITE-2712
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2712
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Chunwei Lei
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Add a rule to remove the null-generating side of a Join. Here is an example  
> where we eliminate the "many" side of a "one-to-many" join:
> {code:sql}
> # Example 1: one-to-many
> SELECT c.id, COUNT(DISTINCT o.productId)
> FROM Customers AS c
> LEFT JOIN SupportCalls AS s ON c.id = s.customerId
> LEFT JOIN Orders AS o ON c.id = o.customerId{code}
> We can remove {{SupportCalls}} and the join to it, so the query becomes
> {code:sql}
> SELECT c.id, COUNT(DISTINCT o.productId)
> FROM Customers AS c
> LEFT JOIN Orders AS o ON c.id = o.customerId{code}
> Necessary conditions are:
> # no columns from {{SupportCalls}} are used
> # the join is LEFT, so customers will not be eliminated if they have no 
> support calls,
> # there is an Aggregate on top, so we don't care if there are >1 support call 
> per customer.
> A simpler example of one-to-many:
> {code:sql}
> # Example 2: simpler one-to-many
> SELECT DISTINCT c.id
> FROM Customers AS c
> LEFT JOIN SupportCalls AS s ON c.id = s.customerId{code}
> An example of many-to-one, where we eliminate the "one" side ({{Orders}}):
> {code:sql}
> # Example 3: many-to-one
> SELECT c.id, p.color
> FROM LineItems AS i
> LEFT JOIN Orders AS o ON o.id = i.orderId
> LEFT JOIN Products AS p ON p.id = i.orderId{code}
> so that the query becomes
> {code:sql}
> SELECT c.id, p.color
> FROM LineItems AS i
> LEFT JOIN Products AS p ON p.id = i.orderId{code}
> Here, necessary side-conditions are:
> # no columns from {{Orders}} are used;
> # unique key on {{Orders.id}}.
> We do not require aggregation, because the primary key on {{Orders.id}} 
> ensures that {{Orders}} contributes at most one row.
> We can deal with similar cases like
> {code:sql}
> # Example 4: many-to-one, column aliasing required
> SELECT c.id, p.color
> FROM LineItems AS i
> LEFT JOIN Orders AS o ON o.id = i.orderId
> LEFT JOIN Products AS p ON p.id = o.id{code}
> if we use aliasing ({{o.id}} = {{i.orderId}}) and a foreign key that ensures 
> the existence of an record in {{Orders}}.
> For examples 1 and 2 (one-to-many), we would need to match {{Aggregate}} on 
> {{Join}}, therefore make a variant of {{AggregateJoinTransposeRule}}.
> For examples 3 and 4 (many-to-one or one-to-one)), we would match {{Project}} 
> on {{Join}}, therefore make a variant of {{ProjectJoinTransposeRule}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to