Вадим,

Yes, we are short of documentation at the moment.

> There is source code for adapters with processing aggregates, for example,
> Mongodb,

That is a different topic. Adapters such as Mongo/JDBC transform all the
RelNodes to their own convention, and then translate the RelNodes to their
own query dialect. They do not push down filter/projection/aggregate into a
scan; instead, they implement the whole RelNode tree themselves.

Вадим Ахмедов <akhmedov.va...@gmail.com> wrote on Wed, Jun 22, 2022 at 15:09:

> Hi Benchao,
>
> Thank you very much for your reply. Unfortunately, I did not find what you
> wrote in the Calcite documentation. It seems to be very sketchy. There is
> source code for adapters with processing aggregates, for example, Mongodb,
> but understanding it thoroughly takes a lot of time, which is often in
> short supply. If there were examples in the documentation explaining the
> minimum implementation of pushdown projections, the minimum implementation
> of pushdown filters, and the same for aggregates, it would be very helpful.
>
> On Sat, Jun 18, 2022 at 11:50, Benchao Li <libenc...@apache.org> wrote:
>
> > Hi Вадим,
> >
> > I'd like to share how the projections and filters are pushed down
> > in the first place.
> >
> > 1. Firstly, we should have a RelNode which can do projections and
> >    filters, and in Calcite, this is done by BindableTableScan[1].
> > 2. Then we need a rule to match a Filter/Project on top of a Scan,
> >    and push the filters into the Scan, and in Calcite this is done
> >    by FilterTableScanRule[2] and ProjectTableScanRule[3].
> > 3. Finally, we should translate the Scan with filters and/or projections
> >    to an executable form. This may be different for different projections
> >    because they have their own physical representations. In Calcite,
> >    BindableTableScan will be transformed to TableScanNode[4], which
> >    will further push filters and projections into
> >    ProjectableFilterableTable[5].
> >
> > Hence, to extend Calcite to push aggregations into the Scan, you need
> > the same process. You need a physical Scan node which can do aggregations,
> > and a rule to match an Aggregate on top of a Scan to push it down. Then
> > you also need to implement the corresponding physical logic.
> >
> > If you want the Scan node to do all the projection/filter/aggregation
> > pushdown, you need to be careful to deal with the mix of them, because
> > generally they are not pushed down in one go, e.g. you may push an
> > aggregation into a Scan into which filters have already been pushed.
> >
> > Hope this helps~
> >
> > [1] https://github.com/apache/calcite/blob/de41df4d117041fbee042e07f70e6043f1fe626d/core/src/main/java/org/apache/calcite/interpreter/Bindables.java#L207
> > [2] https://github.com/apache/calcite/blob/de41df4d117041fbee042e07f70e6043f1fe626d/core/src/main/java/org/apache/calcite/rel/rules/FilterTableScanRule.java#L57
> > [3] https://github.com/apache/calcite/blob/de41df4d117041fbee042e07f70e6043f1fe626d/core/src/main/java/org/apache/calcite/rel/rules/ProjectTableScanRule.java#L57
> > [4] https://github.com/apache/calcite/blob/de41df4d117041fbee042e07f70e6043f1fe626d/core/src/main/java/org/apache/calcite/interpreter/TableScanNode.java#L63
> > [5] https://github.com/apache/calcite/blob/de41df4d117041fbee042e07f70e6043f1fe626d/core/src/main/java/org/apache/calcite/schema/ProjectableFilterableTable.java#L38
> >
> > Вадим Ахмедов <akhmedov.va...@gmail.com> wrote on Fri, Jun 17, 2022 at 16:59:
> >
> > > Hi!
> > >
> > > I'm modifying a driver based on Apache Calcite that works with AWS S3
> > > storage using SQL queries. The interaction with S3 storage uses the S3
> > > Select dialect, which is very similar to SQL. The driver uses
> > > ProjectableFilterableTable to scan CSV data loaded from AWS. The filters,
> > > as a list of RexNodes, are used in the scan method to transform SQL
> > > queries into AWS S3 Select queries. Thus projects and filters are pushed
> > > down into the requests to the S3 storage.
> > >
> > > Now I need to modify the driver in such a way that aggregate functions
> > > are pushed down as well.
> > >
> > > Calcite documentation has a hint:
> > > "If you want more control, you should write a planner rule. This will
> > > allow you to push down expressions, to make a cost-based decision about
> > > whether to push down processing, and push down more complex operations
> > > such as join, aggregation, and sort."
> > >
> > > I really need advice on how I can push down the aggregate functions with
> > > minimal modification of the driver source code. I have to ignore the
> > > aggregate functions in SQL somehow and push them into queries in S3
> > > Select so that the aggregation occurs on the S3 side and not in memory.
> > >
> > > If I try to replace ProjectableFilterableTable with TranslatableTable,
> > > the code will become 10 times more complicated.
> > >
> > > Maybe there is some simpler way to push down the aggregates?
> > >
> > > If TranslatableTable is the only way to solve this problem, what
> > > minimalistic example can I use for this?
> > >
> > > Driver source code:
> > > https://github.com/amannm/lake-driver
> > >
> > > Thanks,
> > > Vadim A.
> >
> > --
> >
> > Best,
> > Benchao Li

--

Best,
Benchao Li
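
For readers who want to see the shape of the "scan node plus rule" process described in this thread, here is a minimal, self-contained sketch. It is a toy model and does not use the Calcite API: the classes Scan, Filter and Aggregate and the methods pushFilter, pushAggregate, optimize and toSql are all hypothetical names invented for illustration. In a real adapter, their roles would be played by a custom scan RelNode and planner rules. The sketch only handles a single aggregate call without grouping, and the table name "S3Object s" is just an example string.

```java
import java.util.ArrayList;
import java.util.List;

class AggregatePushdownSketch {
    /** Toy plan node; hypothetical, not a Calcite RelNode. */
    interface Node {}

    /** Scan that carries whatever has been pushed into it so far. */
    static final class Scan implements Node {
        final String table;
        final List<String> filters; // pushed-down predicates, ANDed together
        final String aggregate;     // pushed-down aggregate call, or null
        Scan(String table, List<String> filters, String aggregate) {
            this.table = table;
            this.filters = List.copyOf(filters);
            this.aggregate = aggregate;
        }
    }

    static final class Filter implements Node {
        final String condition;
        final Node input;
        Filter(String condition, Node input) {
            this.condition = condition;
            this.input = input;
        }
    }

    static final class Aggregate implements Node {
        final String call;
        final Node input;
        Aggregate(String call, Node input) {
            this.call = call;
            this.input = input;
        }
    }

    /** Rule: Filter on top of Scan becomes a Scan with one more predicate. */
    static Node pushFilter(Filter f) {
        if (!(f.input instanceof Scan)) return f;
        Scan s = (Scan) f.input;
        // A filter above an already aggregated scan filters aggregated rows,
        // so it must not be merged into the scan's WHERE clause.
        if (s.aggregate != null) return f;
        List<String> filters = new ArrayList<>(s.filters);
        filters.add(f.condition);
        return new Scan(s.table, filters, null);
    }

    /** Rule: Aggregate on top of Scan becomes a Scan that also aggregates. */
    static Node pushAggregate(Aggregate a) {
        if (!(a.input instanceof Scan)) return a;
        Scan s = (Scan) a.input;
        if (s.aggregate != null) return a; // only one aggregate level is pushed
        // Filters pushed earlier are kept: the aggregate runs over filtered rows.
        return new Scan(s.table, s.filters, a.call);
    }

    /** Rewrites inputs first, then tries the matching rule on the node itself. */
    static Node optimize(Node n) {
        if (n instanceof Filter) {
            Filter f = (Filter) n;
            return pushFilter(new Filter(f.condition, optimize(f.input)));
        }
        if (n instanceof Aggregate) {
            Aggregate a = (Aggregate) n;
            return pushAggregate(new Aggregate(a.call, optimize(a.input)));
        }
        return n;
    }

    /** Renders the scan, with everything pushed into it, as one query string. */
    static String toSql(Scan s) {
        String select = s.aggregate != null ? s.aggregate : "*";
        String where = s.filters.isEmpty()
            ? "" : " WHERE " + String.join(" AND ", s.filters);
        return "SELECT " + select + " FROM " + s.table + where;
    }

    public static void main(String[] args) {
        Node plan = new Aggregate("COUNT(*)",
            new Filter("s.price > 10", new Scan("S3Object s", List.of(), null)));
        Node optimized = optimize(plan);
        // prints: SELECT COUNT(*) FROM S3Object s WHERE s.price > 10
        System.out.println(toSql((Scan) optimized));
    }
}
```

The two guards on `s.aggregate != null` are where the "mix" of pushdowns mentioned above shows up: an aggregate may be pushed into a scan that already holds filters, but a filter sitting above an aggregated scan is left in place because it applies to the aggregated result.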