Re: Technical Queries Regarding Pushing Down Joins & Unions To TableScans, Conventions

Benchao Li Sat, 13 Aug 2022 20:28:26 -0700

Pranav,

I have reposted your mail to the dev mailing list. (It's recommended to reply
to the dev mailing list rather than to someone's email address.)


> So my question is better phrased as: "Is there a planner rule that
> converts a join on 2 Table scans to some kind of table?" Are any examples
> available on this? I tried looking for examples on the same; however, I
> was able to observe pushdowns, but not what I just mentioned.

I don't know of any existing rule that does this for now.
You can add your own rule to achieve this; it won't be complex. You may
need to do more work to abstract out an interface that accepts the 'Join
Pushdown'.
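
To make the idea concrete, here is a toy sketch in plain Java with no Calcite
dependency. It only illustrates the shape of such a rule: match a join whose
two inputs are scans on the same source, and replace it with a single leaf
that the source evaluates itself. Every name in it (Plan, Scan, Join,
SourceQuery, pushDownJoin) is a hypothetical stand-in, not a Calcite class. A
real rule would instead extend Calcite's RelOptRule (or RelRule) and operate
on RelNodes, and the generated SQL string is an assumption for illustration.

```java
/** Toy model of a join-pushdown planner rule. NOT Calcite's API. */
public class JoinPushdownSketch {

    /** A node in a toy relational plan (hypothetical stand-in for a RelNode). */
    public interface Plan {}

    /** Leaf node: scans one table that lives in a named source. */
    public static final class Scan implements Plan {
        public final String source;
        public final String table;
        public Scan(String source, String table) {
            this.source = source;
            this.table = table;
        }
        @Override public String toString() {
            return "Scan(" + source + "." + table + ")";
        }
    }

    /** Inner join of two inputs on a textual condition. */
    public static final class Join implements Plan {
        public final Plan left;
        public final Plan right;
        public final String condition;
        public Join(Plan left, Plan right, String condition) {
            this.left = left;
            this.right = right;
            this.condition = condition;
        }
        @Override public String toString() {
            return "Join(" + left + ", " + right + ", " + condition + ")";
        }
    }

    /** Hypothetical leaf: a whole join that the source evaluates by itself. */
    public static final class SourceQuery implements Plan {
        public final String source;
        public final String sql;
        public SourceQuery(String source, String sql) {
            this.source = source;
            this.sql = sql;
        }
        @Override public String toString() {
            return "SourceQuery(" + source + ": " + sql + ")";
        }
    }

    /**
     * The "rule": Join(Scan, Scan) on the same source becomes one SourceQuery.
     * Any other plan is returned unchanged, so the join stays in the engine.
     */
    public static Plan pushDownJoin(Plan plan) {
        if (!(plan instanceof Join)) {
            return plan;
        }
        Join join = (Join) plan;
        if (!(join.left instanceof Scan) || !(join.right instanceof Scan)) {
            return plan;
        }
        Scan l = (Scan) join.left;
        Scan r = (Scan) join.right;
        if (!l.source.equals(r.source)) {
            return plan; // different sources: nothing to push down
        }
        String sql = "SELECT * FROM " + l.table + " JOIN " + r.table
            + " ON " + join.condition;
        return new SourceQuery(l.source, sql);
    }

    public static void main(String[] args) {
        String cond = "users.id = specialdata.userid";
        Plan same = new Join(new Scan("mysqlDB", "users"),
            new Scan("mysqlDB", "specialdata"), cond);
        Plan mixed = new Join(new Scan("cassandraDB", "users"),
            new Scan("mysqlDB", "specialdata"), cond);
        System.out.println(pushDownJoin(same));  // rewritten to a SourceQuery
        System.out.println(pushDownJoin(mixed)); // left as a Join
    }
}
```

In this sketch the join across two different sources is deliberately left
untouched; in Calcite that case is handled by converters between conventions
rather than by a pushdown rule.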



Pranav Deshpande <deshpande.v.pra...@gmail.com> wrote on Sat, Aug 13, 2022 at 01:21:

> Hi Benchao,
> Thank you very much for your excellent suggestions!
>
> I believe that I have a better understanding about my problem now. I saw
> the pushdowns and also the usage of multiple conventions in the same plan,
> and was able to implement and play around with the same.
>
> I went through talk [4] as well.
>
> So my question is better phrased as: "Is there a planner rule that
> converts a join on 2 Table scans to some kind of table?" Are any examples
> available on this? I tried looking for examples on the same; however, I was
> able to observe pushdowns, but not what I just mentioned.
>
> This is similar to the ProjectTableScan rule in some sense: it converts a
> project on a TableScan to a scan of a ProjectableFilterableTable.
>
> Thanks & Regards,
> Pranav
>
> On Fri, Aug 5, 2022 at 11:57 PM Benchao Li <libenc...@apache.org> wrote:
>
>> Pranav,
>>
>> You can refer to the Calcite adapter implementations, such as the JDBC
>> Adapter[1] and the MongoDB Adapter[2].
>> Their implementations push down as many operations (RelNodes) to the
>> adapter as possible, and the remaining RelNodes are implemented
>> using the Enumerable Convention.
>>
>> We have a Converter[3] concept which makes this possible. The converter
>> node allows multiple Conventions in a single query. This also answers your
>> second question.
>> There is a talk[4] about this; it's very helpful for understanding this
>> concept.
>>
>> [1]
>> https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcToEnumerableConverter.java
>> [2]
>> https://github.com/apache/calcite/blob/main/mongodb/src/main/java/org/apache/calcite/adapter/mongodb/MongoToEnumerableConverter.java
>> [3]
>> https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rel/convert/Converter.java
>> [4]
>> https://calcite.apache.org/community/#fast-federated-sql-with-apache-calcite
>>
>> Julian Hyde <jhyde.apa...@gmail.com> wrote on Sat, Aug 6, 2022 at 02:06:
>>
>>> Pranav,
>>>
>>> Please subscribe to this list. You have asked several questions,
>>> received replies, not acknowledged those replies, and asked further
>>> questions. Also, since you are not subscribed, each email you post has to
>>> go through manual moderation.
>>>
>>> Julian
>>>
>>> > On Aug 5, 2022, at 9:38 AM, Pranav Deshpande <deshpande.v.pra...@gmail.com> wrote:
>>> >
>>> > Dear Apache Calcite Team,
>>> > I have 2 questions.
>>> >
>>> > ---------------------------------
>>> > 1.
>>> >
>>> > There are plenty of examples on how to push down projects and filters into
>>> > the leaf nodes (tablescans).
>>> >
>>> > However, I could not find any examples to push down joins to TableScans (or
>>> > joins+filters+projects etc.) [this is helpful for data federation I think].
>>> >
>>> > On the mailing list, many folks are suggesting that I use Drill. However,
>>> > the purpose of my exercise is to gain knowledge about DBMS and Query
>>> > processing etc.
>>> >
>>> > I tried debugging open source engines that use Calcite (Drill, Druid, Trino
>>> > etc.) but was completely lost.
>>> >
>>> > Any examples/pointers/guidance around the same would be appreciated.
>>> > For example, pushing down a join with a filter to a DBMS (consider JDBC,
>>> > MySQL etc.)
>>> >
>>> > -------------------------------------
>>> > 2.
>>> >
>>> > The 2nd question I have is regarding conventions and different DBMS. The
>>> > cluster has a method to replace the trait convention (Bindable, JDBC etc.),
>>> > and then we optimize and get the physical plan.
>>> >
>>> > But imagine I have both the MySQL JDBC convention and a Cassandra
>>> > convention and some user is trying to query both tables.
>>> >
>>> > Something like "SELECT users.username, specialdata.country from
>>> > cassandraDB.user join mysqlDB.specialdata ON users.id = specialdata.userid"
>>> >
>>> > Now, how will calcite do the optimization here? The planner is not
>>> > accepting 2 different conventions.
>>> >
>>> > Thanks & Regards,
>>> > Pranav
>>>
>>>
>>
>> --
>>
>> Best,
>> Benchao Li
>>
>

-- 

Best,
Benchao Li
