[ 
https://issues.apache.org/jira/browse/CALCITE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806083#comment-16806083
 ] 

Stamatis Zampetakis commented on CALCITE-2959:
----------------------------------------------

Postres and other major databases provide function indexes. The documentation 
of Postgres says that it is possible to define indexes on expressions involving 
on one or more columns of the table. I was a bit suprised to find out that the 
following is not possible in Postgres:

{code:sql}
CREATE INDEX ON person (address.city);
{code}

The idea of defining specific operators for struct types is nice but does not 
solve the problem of most complicated queries like the second example of 
[~rubenql].

I was thinking that when we create the EnhancedScan operator we could set the 
collation of the project on top to reflect that some columns are sorted. For 
instance:

{noformat}
Sort(1)  // --> Collation: [1]
  Project(0=$1, 1=$2.city) // --> Collation: [1]
    EnhancedScan(table=Person, sort=$2.city) // --> Collation: []
{noformat}

There are two problems with this approach.
# It appears that the project is doing the sort which is not the case.
# The collation of the project can easily get lost since rules tend to recreate 
operators and during creation the collation of the project will be derived from 
its input. 



> Support collation on struct fields
> ----------------------------------
>
>                 Key: CALCITE-2959
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2959
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Ruben Quesada Lopez
>            Priority: Major
>
> Currently, the class {{RelFieldCollation}} is used to define _"the ordering 
> of one field of a RelNode whose output is to be sorted"_. This representation 
> can hold only "simple" fields. In case of struct fields, a projection needs 
> to be applied in order to reference the struct field as a simple one. For 
> example, given this table:
> {code}
> CREATE TYPE Address AS (
>   street VARCHAR(20) NOT NULL, 
>   zipcode VARCHAR(20) NOT NULL,
>   city VARCHAR(20) NOT NULL);
> CREATE TABLE Person (
>   id VARCHAR(20) NOT NULL,
>   name VARCHAR(20) NOT NULL,
>   address Address NOT NULL);
> {code}
> With a SQL query such as: "{{SELECT p.name, p.address.city FROM Person p 
> ORDER BY p.address.city}}" the pseudo-plan generated would look like:
> {code}
> Sort(1)  // --> Collation: [1]
>   Project(0=$1, 1=$2.city)
>     Scan(table=Person)
> {code}
> However, what would happen if we had a specific Scan operator that would 
> guarantee us that the records would be scanned already ordered by 
> address.city? Something like:
> {code}
> EnhancedScan(table=Person, sort=$2.city)  // --> Collation???
> {code}
> The collation of such an operator cannot be represented with the current 
> Calcite capabilities (RelFieldCollation), because it would not be a "simple" 
> field, but a struct field, i.e. we would need a new collation abstraction to 
> represent it, e.g. [2.city] or [2.2]
> I would like to open the discuss to see if / how we could find a solution to 
> represent this case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to