Re: Issues in exposing data via TableFunction vs TableMacro

2019-09-12 Thread Gabriel Reid
Hi Julian,

On Tue, Sep 10, 2019 at 5:19 PM Julian Feinauer <
j.feina...@pragmaticminds.de> wrote:

>
> when going through the code I just had another idea.
> Currently a TableFunction is executed as EnumerableTableFunctionScan, which
> gets generated from a LogicalTableFunctionScan by the rule
> EnumerableTableFunctionScanRule.
> What if you just remove that rule and add a custom rule of your own which
> translates it into a TableScan of your choosing?
>
>

That is indeed something I had thought about.

In reference to your earlier question, yes, there are parameters needed for
the TableFunction, so I think that the DrillTable approach wouldn't work in
my case.

I didn't see any easy way to alter rules outside of RelNode.register, but I
assume that that is indeed possible, so I'll look further into this
approach. Thanks for the advice.

- Gabriel







Re: Issues in exposing data via TableFunction vs TableMacro

2019-09-10 Thread Julian Hyde
I always thought of a table function as a lightweight relational operator. You 
could write your own UNION function, for instance. But the moment you want it 
to start participating in algebraic rewrites - if you want to push filters 
through it, for instance - then you had better make it into a relational 
operator (that is, a sub-class of RelNode). I think you’re at that point. 

A table macro is powerful, but it has already been expanded before the planner
rules start to be applied, so it can’t help here.



Re: Issues in exposing data via TableFunction vs TableMacro

2019-09-10 Thread Julian Feinauer
Hey,

when going through the code I just had another idea.
Currently a TableFunction is executed as EnumerableTableFunctionScan, which gets
generated from a LogicalTableFunctionScan by the rule
EnumerableTableFunctionScanRule.
What if you just remove that rule and add a custom rule of your own which
translates it into a TableScan of your choosing?
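
The rule swap described above can be sketched roughly as follows. This is a
toy model only: Planner, Node and the rule names are hypothetical stand-ins
written for illustration, not Calcite classes. In Calcite itself the
RelOptPlanner interface has addRule and removeRule methods, which is
presumably where such a swap would be done.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Toy model only: Planner and Node are hypothetical, not Calcite classes. */
public class RuleSwapSketch {
  /** A plan node, reduced to its kind and the table function it calls. */
  static final class Node {
    final String kind;
    final String function;
    Node(String kind, String function) {
      this.kind = kind;
      this.function = function;
    }
  }

  /** A planner that converts a node with the first registered rule that matches. */
  static final class Planner {
    private final Map<String, Function<Node, Optional<Node>>> rules = new LinkedHashMap<>();

    void addRule(String name, Function<Node, Optional<Node>> rule) {
      rules.put(name, rule);
    }

    boolean removeRule(String name) {
      return rules.remove(name) != null;
    }

    Node convert(Node logical) {
      for (Function<Node, Optional<Node>> rule : rules.values()) {
        Optional<Node> result = rule.apply(logical);
        if (result.isPresent()) {
          return result.get();
        }
      }
      throw new IllegalStateException("no rule converts " + logical.kind);
    }
  }

  /** Builds a rule that rewrites a logical table function scan to the given kind. */
  static Function<Node, Optional<Node>> convertTableFunctionScanTo(String targetKind) {
    return node -> {
      if ("LogicalTableFunctionScan".equals(node.kind)) {
        return Optional.of(new Node(targetKind, node.function));
      }
      return Optional.empty();
    };
  }

  public static void main(String[] args) {
    Planner planner = new Planner();
    planner.addRule("default", convertTableFunctionScanTo("EnumerableTableFunctionScan"));

    // Swap the default conversion for a custom one ("CustomTableScan" is made up).
    planner.removeRule("default");
    planner.addRule("custom", convertTableFunctionScanTo("CustomTableScan"));

    Node physical = planner.convert(new Node("LogicalTableFunctionScan", "my_function"));
    System.out.println(physical.kind + "(" + physical.function + ")");
  }
}
```

The point of the sketch is only that the logical node stays the same and the
choice of physical node is decided by whichever rule is registered.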

Julian


Re: Issues in exposing data via TableFunction vs TableMacro

2019-09-10 Thread Julian Feinauer
Hi Gabriel,

that's an interesting question for me too.
Do you need parameters for those "dynamic tables"?
If not, you could do something similar to what Drill does: implement a
Schema which always returns "true" when asked for a Table, and return a Table
implementation of your own that you can hook into later to add the
functionality you actually need. Because you control that custom Table type,
you can also use it during optimization.
It may help to look at the DrillTable class in [1].
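
A minimal sketch of that idea, as a toy model: DynamicSchema and DynamicTable
are hypothetical names invented here, not Calcite or Drill types. It assumes
that the table name alone is enough to identify the data, which is exactly
the "no parameters" precondition asked about above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: these types are hypothetical, not Calcite or Drill classes. */
public class DynamicSchemaSketch {
  /** Stand-in for a custom table type that planner rules could later match on. */
  static final class DynamicTable {
    final String name;
    DynamicTable(String name) {
      this.name = name;
    }
  }

  /** A schema without a catalog: every requested name resolves to a table. */
  static final class DynamicSchema {
    private final Map<String, DynamicTable> cache = new ConcurrentHashMap<>();

    /** Never returns null; a table is created on first request and reused afterwards. */
    DynamicTable getTable(String name) {
      return cache.computeIfAbsent(name, DynamicTable::new);
    }
  }

  public static void main(String[] args) {
    DynamicSchema schema = new DynamicSchema();
    DynamicTable first = schema.getTable("sensor_42");
    DynamicTable second = schema.getTable("sensor_42");
    System.out.println(first.name + " reused=" + (first == second));
  }
}
```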

On a side note, I am trying to figure out what would be necessary to make
TableFunction also work with TranslatableTable.
Would you mind opening an issue in Jira for that?

Julian

[1] 
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillTable.java
 

On 10.09.19, 03:25, "Gabriel Reid" wrote:

Hi,

I'm currently using a combination of TableFunctions and TableMacros to
expose various dynamic (relatively unstructured) data sources via Calcite.
The underlying data sources are such that data can only be retrieved by
first specifying what you want (i.e. there is no catalog of all data that
is available).

The issue that I'm running into comes when I want to implement custom
planner rules for the underlying functionality. As far as I can see, it's
not possible to register planner rules based on a TableFunctionImpl,
because a TableFunctionImpl only exposes a ScannableTable, so there's no
chance to hook into RelNode.register.

On the other hand, implementing a TableMacro does allow returning a
TranslatableTable, which in turn allows intercepting the call to
RelNode.register to register rules. However, TableMacros require that
all parameters are literals (otherwise query validation fails in Calcite),
and I'm providing timestamps via TIMESTAMPADD() and CURRENT_TIMESTAMP()
calls, which therefore doesn't work for TableMacros.

I'm wondering if I'm missing some built-in functionality which would make
it possible to have a dynamic table function/macro that can also be
manipulated via custom planner rules.

Options (which may or may not exist) that I can think of are:
* something that would/could visit all macro parameters ahead of time and
resolve things like CURRENT_TIMESTAMP() to a literal, before further query
validation occurs
* register rules somewhere outside of RelNode.register (e.g. when the
schema is first created)
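
To illustrate the first option, here is a rough sketch of folding macro
parameters down to literals before validation. It is a toy model only: Expr,
Literal, CurrentTimestamp and TimestampAdd are hypothetical types made up for
this example and are not Calcite classes, and the "query start" instant is
simply passed in by the caller.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these types are Calcite classes. */
public class FoldMacroArgs {
  /** A macro argument expression. */
  interface Expr {}

  /** A timestamp literal. */
  static final class Literal implements Expr {
    final Instant value;
    Literal(Instant value) {
      this.value = value;
    }
  }

  /** CURRENT_TIMESTAMP(), resolved against a fixed instant supplied by the caller. */
  static final class CurrentTimestamp implements Expr {}

  /** TIMESTAMPADD(unit, amount, operand); units up to DAYS work with Instant. */
  static final class TimestampAdd implements Expr {
    final ChronoUnit unit;
    final long amount;
    final Expr operand;
    TimestampAdd(ChronoUnit unit, long amount, Expr operand) {
      this.unit = unit;
      this.amount = amount;
      this.operand = operand;
    }
  }

  /** Recursively reduces an expression to a literal. */
  static Literal fold(Expr e, Instant now) {
    if (e instanceof Literal) {
      return (Literal) e;
    }
    if (e instanceof CurrentTimestamp) {
      return new Literal(now);
    }
    if (e instanceof TimestampAdd) {
      TimestampAdd add = (TimestampAdd) e;
      Instant base = fold(add.operand, now).value;
      return new Literal(base.plus(add.amount, add.unit));
    }
    throw new IllegalArgumentException("cannot fold " + e.getClass().getSimpleName());
  }

  /** Folds every argument against the same instant, so all of them agree on "now". */
  static List<Literal> foldAll(List<Expr> args, Instant now) {
    List<Literal> literals = new ArrayList<>();
    for (Expr arg : args) {
      literals.add(fold(arg, now));
    }
    return literals;
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2019-09-10T03:25:00Z");
    // Models TIMESTAMPADD(HOUR, -1, CURRENT_TIMESTAMP()).
    Expr oneHourAgo = new TimestampAdd(ChronoUnit.HOURS, -1, new CurrentTimestamp());
    System.out.println(fold(oneHourAgo, now).value);
  }
}
```

Folding all arguments against one shared instant keeps several
CURRENT_TIMESTAMP() calls within the same query consistent with each other.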

Are there any currently-working options in Calcite that can help me do what
I'm trying to do? If there aren't, and I were to add such a thing to
Calcite, are there any suggestions as to what the most appropriate approach
would be (either one of the two options I listed above, or something else)?

Thanks,

Gabriel