Vladimir, You are right Phoenix integration with Calcite stalled halfway. See [1] to get some reasons.
[1] https://lists.apache.org/thread.html/0152a97bfebb85c74f10e26e94ab9cd416dec374abba7dc2e1af9d61%40%3Cdev.phoenix.apache.org%3E ср, 11 дек. 2019 г. в 17:11, Vladimir Ozerov <ppoze...@gmail.com>: > > Roman, > > What is the advantage of Phoenix approach then? BTW, it looks like Phoenix > integration with Calcite never made it to production, did it? > > вт, 10 дек. 2019 г. в 19:50, Roman Kondakov <kondako...@mail.ru.invalid>: > > > Hi Vladimir, > > > > from what I understand, Drill does not exploit collation of indexes. To > > be precise it does not exploit index collation in "natural" way where, > > say, we a have sorted TableScan and hence we do not create a new Sort. > > Instead of it Drill always create a Sort operator, but if TableScan can > > be replaced with an IndexScan, this Sort operator is removed by the > > dedicated rule. > > > > Lets consider initial an operator tree: > > > > Project > > Sort > > TableScan > > > > after applying rule DbScanToIndexScanPrule this tree will be converted to: > > > > Project > > Sort > > IndexScan > > > > and finally, after applying DbScanSortRemovalRule we have: > > > > Project > > IndexScan > > > > while for Phoenix approach we would have two equivalent subsets in our > > planner: > > > > Project > > Sort > > TableScan > > > > and > > > > Project > > IndexScan > > > > and most likely the last plan will be chosen as the best one. > > > > -- > > Kind Regards > > Roman Kondakov > > > > > > On 10.12.2019 17:19, Vladimir Ozerov wrote: > > > Hi Roman, > > > > > > Why do you think that Drill-style will not let you exploit collation? > > > Collation should be propagated from the index scan in the same way as in > > > other sorted operators, such as merge join or streaming aggregate. > > Provided > > > that you use converter-hack (or any alternative solution to trigger > > parent > > > re-analysis). > > > In other words, propagation of collation from Drill-style indexes should > > be > > > no different from other sorted operators. > > > > > > Regards, > > > Vladimir. > > > > > > вт, 10 дек. 2019 г. в 16:40, Zhenya Stanilovsky > > <arzamas...@mail.ru.invalid > > >> : > > > > > >> > > >> Roman just as fast remark, Phoenix builds their approach on > > >> already existing monolith HBase architecture, most cases it`s just a > > stub > > >> for someone who wants use secondary indexes with a base with no > > >> native support of it. Don`t think it`s good idea here. > > >> > > >>> > > >>> > > >>> ------- Forwarded message ------- > > >>> From: "Roman Kondakov" < kondako...@mail.ru.invalid > > > >>> To: dev@ignite.apache.org > > >>> Cc: > > >>> Subject: Adding support for Ignite secondary indexes to Apache Calcite > > >>> planner > > >>> Date: Tue, 10 Dec 2019 15:55:52 +0300 > > >>> > > >>> Hi all! > > >>> > > >>> As you may know there is an activity on integration of Apache Calcite > > >>> query optimizer into Ignite codebase is being carried out [1],[2]. > > >>> > > >>> One of a bunch of problems in this integration is the absence of > > >>> out-of-the-box support for secondary indexes in Apache Calcite. After > > >>> some research I came to conclusion that this problem has a couple of > > >>> workarounds. Let's name them > > >>> 1. Phoenix-style approach - representing secondary indexes as > > >>> materialized views which are natively supported by Calcite engine [3] > > >>> 2. Drill-style approach - pushing filters into the table scans and > > >>> choose appropriate index for lookups when possible [4] > > >>> > > >>> Both these approaches have advantages and disadvantages: > > >>> > > >>> Phoenix style pros: > > >>> - natural way of adding indexes as an alternative source of rows: index > > >>> can be considered as a kind of sorted materialized view. > > >>> - possibility of using index sortedness for stream aggregates, > > >>> deduplication (DISTINCT operator), merge joins, etc. > > >>> - ability to support other types of indexes (i.e. functional indexes). > > >>> > > >>> Phoenix style cons: > > >>> - polluting optimizer's search space extra table scans hence increasing > > >>> the planning time. > > >>> > > >>> Drill style pros: > > >>> - easier to implement (although it's questionable). > > >>> - search space is not inflated. > > >>> > > >>> Drill style cons: > > >>> - missed opportunity to exploit sortedness. > > >>> > > >>> There is a good discussion about using both approaches can be found in > > >> [5]. > > >>> > > >>> I made a small sketch [6] in order to demonstrate the applicability of > > >>> the Phoenix approach to Ignite. Key design concepts are: > > >>> 1. On creating indexes are registered as tables in Calcite schema. This > > >>> step is needed for internal Calcite's routines. > > >>> 2. On planner initialization we register these indexes as materialized > > >>> views in Calcite's optimizer using VolcanoPlanner#addMaterialization > > >>> method. > > >>> 3. Right before the query execution Calcite selects all materialized > > >>> views (indexes) which can be potentially used in query. > > >>> 4. During the query optimization indexes are registered by planner as > > >>> usual TableScans and hence can be chosen by optimizer if they have > > lower > > >>> cost. > > >>> > > >>> This sketch shows the ability to exploit index sortedness only. So the > > >>> future work in this direction should be focused on using indexes for > > >>> fast index lookups. At first glance FilterableTable and > > >>> FilterTableScanRule are good points to start. We can push Filter into > > >>> the TableScan and then use FilterableTable for fast index lookups > > >>> avoiding reading the whole index on TableScan step and then filtering > > >>> its output on the Filter step. > > >>> > > >>> What do you think? > > >>> > > >>> > > >>> > > >>> [1] > > >>> > > >> > > http://apache-ignite-developers.2346864.n4.nabble.com/New-SQL-execution-engine-tt43724.html#none > > >>> [2] > > >>> > > >> > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine > > >>> [3] https://issues.apache.org/jira/browse/PHOENIX-2047 > > >>> [4] https://issues.apache.org/jira/browse/DRILL-6381 > > >>> [5] https://issues.apache.org/jira/browse/DRILL-3929 > > >>> [6] https://github.com/apache/ignite/pull/7115 > > >> > > >> > > >> > > >> > > > > > -- Best regards, Ivan Pavlukhin