Re: [DISCUSS] On-demand traitset request

2019-10-18 Thread Xiening Dai
Thanks for sharing. I like the way you model this problem, Jinfeng.

There’s one minor issue with your example. Let’s say R and S don’t have any
sorting properties at all. In your case, we would end up adding enforcers for
LHS and RHS to get collation (a, b, c). Then we would need another enforcer to
get collation (b, c). This is a suboptimal plan, as we could have used (b, c, a)
for the join.

I think in step #2, the join operator would need to take the agg trait
requirement into account. Then it would have two options:

1) require an *exact/super* match of (b, c, a) or (c, b, a); this guarantees
that the join output delivers the collation the agg needs.
2) require a permutation match of (a, b, c); in that case, an enforcer might be
needed for the aggregation.

Eventually the cost model decides who is the winner.

There’s a fundamental difference between your model and Haisheng’s proposal. In 
Haisheng’s case, a rel node not only looks at its parent’s requirement, but 
also tries to get the potential traits its input could deliver. It would try to 
align them to eliminate unnecessary alternatives.

In the above example, assuming R is sorted on (b, c, a) and S on (a, b, c), to
implement option 1) we would generate two alternatives:

MergeJoin(b, c, a)
  TableScan R
  Sort(b, c, a)
    TableScan S

MergeJoin(c, b, a)
  Sort(c, b, a)
    TableScan R
  Sort(c, b, a)
    TableScan S

But if we look at the input traits and have the insight that R already delivers
(b, c, a), we could decide to require (b, c, a) only and avoid generating the
2nd plan, which is definitely worse, and so reduce the search space.
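
As a minimal sketch of the pruning described above (an illustration only, not
Calcite code; the function names and the tuple representation of a collation
are assumptions made for this example, and it assumes every aggregate key is
also a join key), the join could first enumerate the orderings that also serve
the aggregate, then keep only those an input already delivers:

```python
from itertools import permutations


def candidate_join_collations(join_keys, agg_keys):
    """Orderings of join_keys whose leading columns are a permutation of agg_keys."""
    rest = [k for k in join_keys if k not in agg_keys]
    return [head + tail
            for head in permutations(agg_keys)
            for tail in permutations(rest)]


def prune_by_inputs(candidates, input_collations):
    """Keep only candidates some input already delivers; fall back to all of them."""
    delivered = [c for c in candidates if c in input_collations]
    return delivered or candidates


# Join keys (a, b, c); the aggregate groups on (c, b).
candidates = candidate_join_collations(("a", "b", "c"), ("c", "b"))

# R delivers (b, c, a) and S delivers (a, b, c), as in the example above.
requests = prune_by_inputs(candidates, [("b", "c", "a"), ("a", "b", "c")])
```

Here `candidates` holds the two option-1 orderings, and `requests` keeps only
(b, c, a), so the second MergeJoin alternative is never generated. If neither
input delivers a candidate, both orderings stay in play and the cost model
decides.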



Re: Re: [DISCUSS] On-demand traitset request

2019-10-18 Thread Jinfeng Ni
A little bit of history. In Drill, when we first implemented the
Distribution trait's definition, we allowed both exact match and
partial match in the satisfy() method. This works fine for single-input
operators such as aggregation; however, it leads to incorrect plans for join
queries, e.g. LHS shuffled on (a, b) and RHS shuffled on (a). At that
time, we removed partial match and used exact match only. Yet this
change leads to unnecessary additional exchanges. To mitigate this
problem, in the join physical operator, for a join key (a, b, c), we
enumerate different distribution requests, yet this leads to more space
to explore and significantly increases planning time (which is probably
what Haisheng also experienced). When I look back, I feel that what we
missed is probably the "coordination" step in the join operator, because
if we relax the requirement of satisfy(), then for multi-input operators
we have to enforce some "coordination" to make sure the multiple inputs'
traits can work together properly.
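
To make the failure mode concrete, here is a small sketch (not Drill's actual
code; `partial_match` and `coordinated` are hypothetical helpers, and a hash
distribution is modelled simply as a tuple of key columns). Each side passes a
relaxed per-input check, yet the two sides are not shuffled on the same keys:

```python
def partial_match(delivered, required):
    """Relaxed check: hashing on a non-empty subset of the required keys."""
    return len(delivered) > 0 and set(delivered) <= set(required)


def coordinated(lhs, rhs):
    """Join-level check: both inputs must be shuffled on the same key list."""
    return tuple(lhs) == tuple(rhs)


join_keys = ("a", "b", "c")
lhs = ("a", "b")   # LHS shuffled on (a, b)
rhs = ("a",)       # RHS shuffled on (a)

each_side_ok = partial_match(lhs, join_keys) and partial_match(rhs, join_keys)
join_ok = each_side_ok and coordinated(lhs, rhs)
```

`each_side_ok` is true while `join_ok` is false: without the second check the
planner would accept a plan in which matching rows can land on different nodes.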




Re: Re: [DISCUSS] On-demand traitset request

2019-10-18 Thread Jinfeng Ni
This is an interesting topic. Thanks for bringing up this issue.

My understanding of the Volcano planner is that it works in a top-down search
mode (the parent asks for a certain trait of its child), while the trait
propagates in a bottom-up way, as Stamatis explained.

IMHO, the issue comes down to the definition of RelTrait: how to
determine if a trait A could satisfy a request asking for trait B,
that is, how the RelTrait.satisfies() method is implemented.

Let's first clarify the different situations, using collation as an example.
1) The collation is requested by the query's outermost ORDER BY clause.
   - The generated plan has to have an "exact match", i.e. the same collation
     (same column sequence), or a "super match".
     exact match:   (a, b) satisfies (a, b)
     super match:   (a, b, c) satisfies (a, b)

2) The collation is requested by an operator with a single input, such as
   sort-based Aggregation.
   - In such a case, a "permutation match" is sufficient.
     For instance, for Aggregation (b, c), an input with collation (c, b)
     could satisfy the requirement.
     permutation match:  (b, c) satisfies (c, b). (c, b) satisfies (c, b)
     permutation match:  (b, c, a) satisfies (c, b). (c, b, a) satisfies (c, b)

3) The collation is requested by an operator with >= 2 inputs, such as
   sort-based MergeJoin.
   - A permutation match is sufficient for each input.
   - MergeJoin has to do coordination after the inputs' traits propagate
     upwards. In other words, it has to ensure that both inputs' permutation
     matches are actually the same sequence. Otherwise, an enforcer could be
     inserted upon each input, and the planner generates two plans and lets
     the cost decide.

For the first case, this is how today's RelCollation's satisfy()
method is implemented.
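
The three policies above can be sketched as plain predicates. This is only an
illustration over tuples of column names (the function names are assumptions,
not Calcite API, and it ignores ASC/DESC and null ordering and assumes no
column repeats within a collation):

```python
def exact_match(delivered, required):
    """Same columns in the same sequence."""
    return tuple(delivered) == tuple(required)


def super_match(delivered, required):
    """The required collation is a leading prefix of the delivered one."""
    n = len(required)
    return len(delivered) >= n and tuple(delivered[:n]) == tuple(required)


def permutation_match(delivered, required):
    """The leading columns of delivered are some permutation of required."""
    n = len(required)
    return len(delivered) >= n and set(delivered[:n]) == set(required)
```

Note that in this sketch `super_match` also accepts an exact match, and
`permutation_match` accepts anything `super_match` does, so the policies get
strictly more permissive from case 1 to case 2.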

For the second / third cases, using Haisheng's example:

SELECT DISTINCT c, b FROM
  ( SELECT R.c c, S.b b FROM R, S
    WHERE R.a=S.a and R.b=S.b and R.c=S.c) t;

Aggregate . (c, b)
+--- MergeJoin . (a, b, c)
     |--- TableScan on R
     +--- TableScan on S

Here are the steps that might take place in the planner:

1) Aggregate requests a permutation match of collation (c, b).
2) MergeJoin requests a permutation match of (a, b, c) on both of its inputs.
3) R responds with collation (c, b, a), which satisfies MergeJoin's LHS requirement.
4) S responds with collation (b, c, a), which satisfies MergeJoin's RHS requirement.
5) MergeJoin coordinates LHS and RHS, and generates two possible plans:
   MJ1:   Insert a sort of (c, b, a) on RHS.  This MJ operator now has
          collation (c, b, a).
   MJ2:   Insert a sort of (b, c, a) on LHS.  This MJ operator now has
          collation (b, c, a).
6) MJ1 and MJ2 could both satisfy the permutation match request in step
   1, leading to two possible plans:
   Agg1:  with input of MJ1
   Agg2:  with input of MJ2
7) The planner chooses the best plan based on the cost of Agg1 and Agg2.
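
The steps above can be sketched roughly as follows. This is a toy model, not
planner code: `coordinate` and `choose` are hypothetical names, collations are
tuples of column names, and the per-side sort costs are made-up numbers used
only to show the cost-based choice in step 7.

```python
def permutation_match(delivered, required):
    n = len(required)
    return len(delivered) >= n and set(delivered[:n]) == set(required)


def coordinate(lhs, rhs, join_keys):
    """Step 5: return (join_output_collation, sides_needing_a_sort) alternatives."""
    n = len(join_keys)
    lhs_head, rhs_head = tuple(lhs[:n]), tuple(rhs[:n])
    lhs_ok = permutation_match(lhs, join_keys)
    rhs_ok = permutation_match(rhs, join_keys)
    if lhs_ok and rhs_ok and lhs_head == rhs_head:
        return [(lhs_head, [])]               # already aligned, no enforcer
    plans = []
    if lhs_ok:
        plans.append((lhs_head, ["RHS"]))     # MJ1: sort RHS to follow LHS
    if rhs_ok:
        plans.append((rhs_head, ["LHS"]))     # MJ2: sort LHS to follow RHS
    if not plans:
        plans.append((tuple(join_keys), ["LHS", "RHS"]))
    return plans


def choose(plans, agg_keys, sort_cost):
    """Steps 6-7: keep plans that serve the aggregate, pick the cheapest."""
    viable = [p for p in plans if permutation_match(p[0], agg_keys)]
    return min(viable, key=lambda p: sum(sort_cost[s] for s in p[1]))


plans = coordinate(("c", "b", "a"), ("b", "c", "a"), ("a", "b", "c"))
# Assumed costs: sorting the LHS is more expensive than sorting the RHS.
best = choose(plans, ("c", "b"), {"LHS": 100, "RHS": 40})
```

With these assumed costs the sketch picks MJ1 (sort on RHS). A fuller version
would also keep plans that fail the aggregate's request and charge them for an
extra enforcer rather than dropping them.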

I should point out that the enforcer sort inserted in step 5 could help
remove a redundant sort in its input, if the input's collation is
obtained from a sort, by invoking Calcite's SortRemoveRule.

The above only considers the column sequence. DESC/ASC and NULLS
FIRST/LAST will add more complexity, but we can probably use a similar idea.

In summary, we need to:
  1) redefine the collation trait's satisfy() policy: exact match, super
     match, permutation match;
  2) have different physical operators apply different trait matching
     policies, depending on the operator's number of inputs and algorithm
     implementation.






Re: Re: [DISCUSS] On-demand traitset request

2019-10-18 Thread Haisheng Yuan
Hi Stamatis,

Thanks for your comment. I think my example didn't make it clear.

When a logical operator is created, it doesn't have any physical
property, and it shouldn't have. When a physical operator is created,
e.g. in Enumerable convention, it only creates an intuitive traitset
for itself, and requests the corresponding ones from its children.

For operators such as Join, Aggregate, and Window, which may deliver
multiple different traitsets, when the parent operator is created and
requests its traitset, it might be good to know what the possible
traitsets are that the child operator can deliver. e.g.

SELECT DISTINCT c, b FROM
  ( SELECT R.c c, S.b b FROM R, S
    WHERE R.a=S.a and R.b=S.b and R.c=S.c) t;

Suppose R is ordered by (c, b, a), and S is ordered by (b, c, a).
Here is the logical plan:
Aggregate
+--- InnerJoin
     |--- TableScan on R
     +--- TableScan on S

When we create a physical merge join for the inner join, it may just
have a collation sorted on a,b,c. Then the aggregate on top of the join will
request another sort on c,b, and thus we miss the best plan. What we
can do is request all the order combinations, which is n!, like
how the Values operator does. But that is too much.

If we can provide an approach that can minimize the possible traitsets
that the child operator may deliver, we can reduce the chance of missing
good plans. For the above query, the Aggregate operator can derive the
possible traitsets that its child operator join can deliver, in which case
the possible traitsets of the join are:
1. collation on (a,b,c) based on join condition,
2. collation on (c,b,a) based on left child,
3. collation on (b,c,a) based on right child.
So we can request Aggregate sorted by (c,b) and Join sorted by (c,b,a).
The number of traitset requests and plan alternatives can be reduced.
The DerivedTraitSets can be used to derive the possible traitsets from
Join, and pass through Project, Filter etc...
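
A rough sketch of that derivation, for illustration only: the helper names
below are hypothetical and are not the proposed DerivedTraitSets metadata
itself, and collations are modelled as tuples of column names without
direction or null ordering.

```python
from math import factorial


def derive_join_collations(join_keys, left, right):
    """Collations a merge join could deliver: the join-condition order plus
    any child ordering whose leading columns are a permutation of the keys."""
    n = len(join_keys)
    derived = [tuple(join_keys)]
    for child in (left, right):
        head = tuple(child[:n])
        if len(head) == n and set(head) == set(join_keys) and head not in derived:
            derived.append(head)
    return derived


def pick_for_parent(derived, parent_keys):
    """Keep the derived collations that start with the parent's requested order."""
    k = len(parent_keys)
    return [c for c in derived if c[:k] == tuple(parent_keys)]


# R is ordered by (c, b, a), S by (b, c, a); join keys are (a, b, c).
derived = derive_join_collations(("a", "b", "c"), ("c", "b", "a"), ("b", "c", "a"))
exhaustive = factorial(3)                     # n! orderings if we requested them all
chosen = pick_for_parent(derived, ("c", "b"))
```

For this query the join derives three collations instead of the 3! = 6
exhaustive orderings, and only (c, b, a) starts with the aggregate's (c, b).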

This is just an example for a non-distributed system; for a distributed system,
it can save much more by considering the possible distributions delivered
by child operators.

One thing that concerns me is that it relies heavily on the traitset system of the
underlying physical system. For example, Enumerable doesn't consider distribution,
because it is a single-node system, but Hive/Flink are distributed systems.
- Haisheng

--
From: Stamatis Zampetakis
Date: 2019-10-18 14:53:41
To:
Subject: Re: [DISCUSS] On-demand traitset request

Hi Haisheng,

This is an interesting topic, but somehow in my mind I thought that this
mechanism was already in place.

When an operator (logical or physical) is created, its traitset is
determined in a bottom-up fashion using the create static factory method
present in almost all operators. In my mind this is in some sense the
applicability function mentioned in [1].

Now during optimization we proceed in a top-down manner and we request
certain traitsets from the operators. If they happen to already contain
the requested traits, nothing needs to be done.

In your example, when we are about to create the sort-merge join we can
check what traitsets are present in the inputs and, if possible, request
those. Can you elaborate a bit more on why we need a new type of metadata?

Anyway, if we cannot do it at the moment, it makes sense to complete the
missing bits, since what you are describing was already mentioned in the
original design of the Volcano optimizer [1].

"If a move to be pursued is the exploration of a normal query processing
algorithm such as merge-join, its cost is calculated by the algorithm's
cost function. The algorithm's applicability function determines the
physical property vectors for the algorithm's inputs, and their costs and
optimal plans are found by invoking FindBestPlan for the inputs. For some
binary operators, the actual physical properties of the inputs are not as
important as the consistency of physical properties among the inputs. For
example, for a sort-based implementation of intersection, i.e., an
algorithm very similar to merge-join, any sort order of the two inputs will
suffice as long as the two inputs are sorted in the same way. Similarly,
for a parallel join, any partitioning of join inputs across multiple
processing nodes is acceptable if both inputs are partitioned using
compatible partitioning rules. For these cases, the search engine permits
the optimizer implementor to specify a number of physical property vectors
to be tried. For example, for the intersection of two inputs R and S with
attributes A, B, and C where R is sorted on (A,B,C) and S is sorted on
(B,A,C), both these sort orders can be specified by the optimizer
implementor and will be optimized by the generated optimizer, while other
possible sort orders, e.g., (C,B,A), will be ignored." [1]

Best,
Stamatis

[1]
https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/Volcano-graefe.pdf

On Fri, Oct 18, 2019 at 4:56 AM Haisheng Yuan 

Re: [DISCUSS] On-demand traitset request

2019-10-18 Thread Julian Hyde
To clarify: the purpose of this API would be to give the search engine
more high-level guidance as to the goals it should focus on. The performance
issues described in https://issues.apache.org/jira/browse/CALCITE-2970
seem to be due to the planner "trying everything", and the solution
might be to add a bit more high-level structure.


Re: [DISCUSS] On-demand traitset request

2019-10-18 Thread Julian Hyde
Excellent, very important discussion. This has been a major missing
feature for a long time. Let's be sure to get to a conclusion and
implement something.

From the Volcano paper:

  "the search engine permits the optimizer implementor to specify
  a number of physical property vectors to be tried"

How would we achieve this? Would we add an API to RelOptRule? If so,
what would that API look like?

> On Fri, Oct 18, 2019 at 4:56 AM Haisheng Yuan 
> wrote:
>
> > TL;DR
> > Both top-down physical TraitSet request and bottom-up TraitSet
> > derivation have their strengths and weaknesses. We propose
> > on-demand TraitSet request to combine the two, to reduce
> > the number of plan alternatives that are generated, especially
> > in distributed systems.
> >
> > e.g.
> > select * from foo join bar on f1=b1 and f2=b2 and f3=b3;
> >
> > In a non-distributed system, we can generate a sort merge join,
> > requesting foo sorted by f1,f2,f3 and bar sorted by b1,b2,b3.
> > But if foo happens to be sorted by f3,f2,f1, we may miss the
> > chance of making use of the delivered ordering of foo, because
> > if we require bar to be sorted by b3,b2,b1, we don't need to
> > sort foo anymore. There are so many choices, n!, not even
> > considering asc/desc and null direction. We can't request all
> > the possible traitsets in a top-down way, and can't derive all the
> > possible traitsets in a bottom-up way either.
> >
> > We propose on-demand traitset request by adding a new type
> > of metadata, DerivedTraitSets, to the built-in metadata system.
> >
> > List deriveTraitSets(RelNode, RelMetadataQuery)
> >
> > In this metadata, every operator returns several possible traitsets
> > that may be derived from this operator.
> >
> > Using the above query as an example, the tablescan on foo should
> > return a traitset with collation on f3, f2, f1.
> >
> > In physical implementation rules, e.g. the SortMergeJoinRule,
> > it gets the possible traitsets from both child operators, uses the join
> > keys to eliminate useless traitsets, keeps the useful traitsets,
> > and requests the corresponding traitset on the other child.
> >
> > This relies on the feature of AbstractConverter, which is turned
> > off by default, due to a performance issue [1].
> >
> > Thoughts?
> >
> > [1] https://issues.apache.org/jira/browse/CALCITE-2970
> >
> > Haisheng
> >
> >
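
The SortMergeJoinRule step in the quoted proposal (take what one child
delivers, filter it by the join keys, and request the corresponding ordering
on the other child) might look roughly like this. It is a sketch under
assumptions: `request_for_other_child` is a hypothetical helper, not part of
Calcite, and collations are plain tuples of column names.

```python
def request_for_other_child(delivered, key_pairs):
    """Map a delivered collation on one side to the matching request on the other.

    key_pairs lists the equi-join pairs as (this_side_column, other_side_column).
    Returns None when the delivered ordering is not useful for the join.
    """
    mapping = dict(key_pairs)
    n = len(mapping)
    head = tuple(delivered[:n])
    if len(head) < n or set(head) != set(mapping):
        return None                      # useless traitset: eliminated
    return tuple(mapping[col] for col in head)


# select * from foo join bar on f1=b1 and f2=b2 and f3=b3;
pairs = [("f1", "b1"), ("f2", "b2"), ("f3", "b3")]

# foo is delivered sorted by (f3, f2, f1), so bar is asked for (b3, b2, b1).
bar_request = request_for_other_child(("f3", "f2", "f1"), pairs)
```

In this example only bar needs a sort, and foo's existing ordering is reused
as is.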


Re: Apache Calcite meetup group

2019-10-18 Thread Jesus Camacho Rodriguez
It seems someone else (Denis Magda) paid the fees in the meantime.

-Jesús

On Fri, Oct 18, 2019 at 1:32 AM Danny Chan  wrote:

> Thanks, Jesús, for taking this over!
>
> Best,
> Danny Chan
> On Oct 18, 2019, 2:00 PM +0800, dev@calcite.apache.org wrote:
> >
> > Jesús
>


[jira] [Created] (CALCITE-3430) JDBC adapter generates extra alias for VALUES when used in join

2019-10-18 Thread Jess Balint (Jira)
Jess Balint created CALCITE-3430:


             Summary: JDBC adapter generates extra alias for VALUES when used in join
                 Key: CALCITE-3430
                 URL: https://issues.apache.org/jira/browse/CALCITE-3430
             Project: Calcite
          Issue Type: Bug
          Components: jdbc-adapter
    Affects Versions: 1.21.0
            Reporter: Jess Balint


For Postgres and other DBs which support this, the generated SQL is {{(values
(1, 'a'), (2, 'bb')) as t(x, y)}}. When it's used in a join, the SqlImplementor
adds a unique alias and winds up rendering it as {{(values (1, 'a'), (2, 'bb')) as
t(x, y) as t0}}. Perhaps it just needs to be wrapped in parens, or we could create a
unique alias in RelToSqlConverter and avoid generating another one in the
{{result()}} method.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3429) Refinement for implementation of sql type factory

2019-10-18 Thread Wang Yanlin (Jira)
Wang Yanlin created CALCITE-3429:


             Summary: Refinement for implementation of sql type factory
                 Key: CALCITE-3429
                 URL: https://issues.apache.org/jira/browse/CALCITE-3429
             Project: Calcite
          Issue Type: Improvement
            Reporter: Wang Yanlin


1. The SqlTypeName of a data type created with a subclass of Map should be
SqlTypeName.MAP, but currently it is not.
2. The SqlTypeName of a data type created with a subclass of List or Array should
be SqlTypeName.ARRAY, but currently it is not.
3. SqlTypeName.MAP is not a basic type, and should not be used with createSqlType,
just like SqlTypeName.ARRAY.





Re: CassandraAdapter (Add Type) and WHERE statement.

2019-10-18 Thread Yanna elina
I think I am the one who did not understand all the subtleties.
I thought that STREAM worked more like an in-memory relational database,
but I missed something. Thanks for your help :)

On Thu, Oct 17, 2019 at 15:53, Michael Mior wrote:

> Perhaps I'm missing something, but I don't see why this would be any
> more efficient. Selecting all data is also not an efficient operation
> in Cassandra. Using ALLOW FILTERING will likely be more efficient
> since it's basically the same as doing a table scan, but it avoids
> returning data which would later be filtered by Calcite anyway.
> --
> Michael Mior
> mm...@apache.org
>
> On Thu, Oct 17, 2019 at 09:13, Yanna elina wrote:
> >
> > Thanks for the reply, Michael.
> >
> > Yes, I understood this from the documentation: for example, with a "WHERE"
> > statement Calcite forces "ALLOW FILTERING",
> > and this can be expensive.
> >
> > I think there may be an interesting approach using STREAM.
> >
> > For example, maintain a regular update between a Cassandra TABLE and a
> > STREAM TABLE.
> >
> > CASSANDRA_TABLE_A .(SELECT * FROM TABLE_A) > STREAM_TABLE_A .
> > SELECT STREAM * FROM STREAM_TABLE_A WHERE username = 'JmuhsAaMdw'
> >
> > I guess it will be more efficient to apply the WHERE directly on the
> > STREAM than through the cassandra_adapter using "allow filtering".
> > A synchronization strategy can be set up between the Cassandra table and
> > the STREAM table.
> > What is your opinion about this approach?
> > Thanks!
> > Yana
> >
> >
> > On Wed, Oct 16, 2019 at 17:08, Michael Mior wrote:
> >
> > > You're right that there are several types which are not supported by
> > > the Cassandra adapter. We would happily accept pull requests to add
> > > support for new types.
> > >
> > > You're also correct that Cassandra cannot efficiently execute queries
> > > which do not specify the partition key. Calcite will not make those
> > > queries more efficient, but it can make it easier to execute queries
> > > that CQL does not directly support. Ultimately data is still stored
> > > based on the partition key, so if your query does not specify a
> > > partition key, Calcite will still need to issue an expensive
> > > cross-partition query to Cassandra.
> > > --
> > > Michael Mior
> > > mm...@apache.org
> > >
> > > On Wed, Oct 16, 2019 at 07:57, Yanna elina wrote:
> > > >
> > > > Hi guys,
> > > >
> > > > I am studying the benefits that a Cassandra-Calcite adapter can bring,
> > > > such as the possibility of joins.
> > > >
> > > > The problem is that the set of types defined in
> > > > CassandraSchema.getRelDataType(..) is very limited; some important
> > > > types are missing: boolean, array, etc.
> > > >
> > > > I thought of inheriting from the class CassandraSchema to override this
> > > > method and add more types, but this method is used inside CassandraTable
> > > > too.
> > > >
> > > > I would like to avoid fully rewriting this adapter :)
> > > >
> > > > Do you have suggestions?
> > > >
> > > > My second question is: Cassandra is not optimized for a WHERE on a key
> > > > that is not defined as a cluster/partition key. I was wondering whether
> > > > Calcite could play a role without this mechanism to improve performance.
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > Yanna
> > >
>


[jira] [Created] (CALCITE-3428) Refine RelMdColumnUniqueness for Filter by considering constant columns

2019-10-18 Thread jin xing (Jira)
jin xing created CALCITE-3428:
-

 Summary: Refine RelMdColumnUniqueness for Filter by considering 
constant columns
 Key: CALCITE-3428
 URL: https://issues.apache.org/jira/browse/CALCITE-3428
 Project: Calcite
  Issue Type: Improvement
Reporter: jin xing
Assignee: jin xing






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSSION] Extension of Metadata Query

2019-10-18 Thread XING JIN
+1 on Danny's comment.
If we use MetadataFactory to customize and RelMetadataQuery for
convenience, that will confuse users.

On Fri, Oct 18, 2019 at 12:33 PM, Danny Chan wrote:

> That is the point: we should supply a way to extend RelMetadataQuery
> conveniently in Calcite, because in most RelOptRules the user would
> use code like:
>
> RelOptRuleCall.getMetadataQuery
>
> to get an RMQ, instead of using AbstractRelNode.metadata() to fetch a
> MetadataFactory.
>
> We should at least unify the metadata query entrance/interfaces, or it
> would cause a lot of confusion.
>
> Best,
> Danny Chan
> On Oct 18, 2019, 12:23 AM +0800, Seliverstov Igor wrote:
> > At least in our project (Apache Ignite) we use
> > AbstractRelNode.metadata().
> >
> > But that is because there is no way to put our metadata type into
> > RelMetadataQuery without changes in Calcite.
> >
> > Regards,
> > Igor
> >
> > On Thu, Oct 17, 2019 at 19:16, Xiening Dai wrote:
> >
> > > MetadataFactory is still useful. It provides a way to access Metadata
> > > directly. If someone creates a new type of Metadata class, it can be
> > > accessed through AbstractRelNode.metadata(). This way you don’t need to
> > > update the RelMetadataQuery interface to include the getter for this new
> > > metadata. Although I don’t see this pattern being used often, I do think
> > > it is still useful and shouldn’t be removed.
> > >
> > >
> > > For your second point, I think you would still need a way to keep
> > > RelMetadataQuery object during a rule call. If you choose to create new
> > > instance, you will have to pass it around while applying the rule. That
> > > actually complicates things a lot.
> > >
> > >
> > > > On Oct 17, 2019, at 12:49 AM, XING JIN wrote:
> > > >
> > > > 1. RelMetadataQuery covers the functionality of MetadataFactory, so why
> > > > should we keep and maintain both of them? Shall we just deprecate
> > > > MetadataFactory? I see MetadataFactory is rarely used in the current
> > > > code. I also think MetadataFactory is not a good place to offer
> > > > customized metadata, as users will be confused about the difference
> > > > between RelMetadataQuery and MetadataFactory.
> > > >
> > > > > Customized RelMetadataQuery with code generated meta handler for
> > > > > customized metadata, also can provide convenient way to get metadata.
> > > > It makes sense to me.
> > > >
> > > > 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall
> > > > we deprecate RelOptCluster#getMetadataQuery? If a user wants to get the
> > > > metadata without a RelOptCall, he/she will need to create a new
> > > > instance of RelMetadataQuery.
> > > >
> > > > On Thu, Oct 17, 2019 at 2:27 AM, Xiening Dai wrote:
> > > >
> > > > > I have seen both patterns in the current code base. In most places, for
> > > > > example SubQueryRemoveRule, AggregateUnionTransposeRule,
> > > > > SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used.
> > > > > And there are a few other places where a new RelMetadataQuery instance
> > > > > is created, which Haisheng attempts to fix.
> > > > >
> > > > > Currently RelOptCluster.invalidateMetadataQuery() is called at the end
> > > > > of RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
> > > > > guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
> > > > >
> > > > >
> > > > > > On Oct 16, 2019, at 1:53 AM, Danny Chan wrote:
> > > > > >
> > > > > > This is the reason I was struggling for the discussion.
> > > > > >
> > > > > > Best,
> > > > > > Danny Chan
> > > > > > On Oct 16, 2019, 11:23 AM +0800, dev@calcite.apache.org wrote:
> > > > > > >
> > > > > > > RelMetadataQuery
> > > > >
> > > > >
> > >
> > >
>


Re: Apache Calcite meetup group

2019-10-18 Thread Danny Chan
Thanks, Jesús, for taking this over!

Best,
Danny Chan
On Oct 18, 2019, 2:00 PM +0800, dev@calcite.apache.org wrote:
>
> Jesús


Re: [DISCUSS] Support Sql Hint for Calcite

2019-10-18 Thread Danny Chan
Thanks Julian. For the current patch, I chose option 1 because it can be applied 
to both Hep and Volcano. I made these latest changes:

• Add a new interface RelOptRuleCall.transformTo(RelNode, Map, BiFunction) to 
make the hints copy strategy overridable
• Cache the hint strategies into RelOptCluster so that user can query the 
strategies during rule planning
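
As a rough illustration of the first change, here is a minimal sketch of an overridable hint-copy strategy passed as a BiFunction. The generic parameters of the BiFunction are an assumption (original node, new node, resulting node), and Node, COPY_IF_EMPTY and the hint name are hypothetical; this is not Calcite's RelNode or RelOptRuleCall.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical illustration of a pluggable hint-copy strategy; not Calcite code. */
final class HintCopySketch {
  /** Hypothetical plan node that carries hints; stands in for a RelNode. */
  static final class Node {
    final String name;
    final List<String> hints;

    Node(String name, List<String> hints) {
      this.name = name;
      this.hints = hints;
    }
  }

  /** Assumed default: the new node inherits hints only if it has none of its own. */
  static final BiFunction<Node, Node, Node> COPY_IF_EMPTY =
      (original, created) -> created.hints.isEmpty()
          ? new Node(created.name, original.hints)
          : created;

  /** Stand-in for a transformTo variant whose copy strategy the caller can override. */
  static Node transformTo(Node original, Node created,
      BiFunction<Node, Node, Node> handler) {
    return handler.apply(original, created);
  }

  public static void main(String[] args) {
    Node logical = new Node("LogicalJoin", Arrays.asList("USE_HASH_JOIN"));
    Node physical = new Node("HashJoin", Collections.<String>emptyList());
    System.out.println(transformTo(logical, physical, COPY_IF_EMPTY).hints);
  }
}
```

A rule that wants different behaviour would pass its own lambda instead of COPY_IF_EMPTY, for example one that never copies hints.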


Another reason I didn’t choose option 2 is that a RelNode’s parent node may also 
be derived from a rule match, so I would have to look up recursively to find the 
real original node for the hints.

Best,
Danny Chan
On Oct 18, 2019, 4:55 AM +0800, Julian Hyde wrote:
> I wonder whether it is possible to add some kind of “action handler” to the 
> planner engine, called, for example, when a rule has fired and is registering 
> the RelNode created by the rule. People can write their own action handlers 
> to copy hints around. Since the action handlers are the user’s code, they can 
> iterate faster to find a hint-propagation strategy that works in practice.
>
> Another idea is to use VolcanoPlanner.Provenance[1]. A RelNode can find its 
> ancestor RelNodes, and the rules that fired to create it. So it can grab 
> hints from those ancestors. It does not need to copy those hints onto itself.
>
> Julian
>
> [1] 
> https://calcite.apache.org/apidocs/org/apache/calcite/plan/volcano/VolcanoPlanner.Provenance.html
>  
> 
>
> > On Oct 16, 2019, at 8:38 PM, Haisheng Yuan  wrote:
> >
> > Julian,
> > Your concern is very valid, and that is also our main concern.
> > I was thinking whether we can put hint into the MEMO group, so that both 
> > logical and physical expression in the same group can share the same hint, 
> > without copying the hint explicitly. But for newly generated expression 
> > that doesn't belong to the original group, we still need to copy hints. 
> > What's worse, in HepPlanner, there is no such concept, so we may still need to 
> > copy hints explicitly in planner rules if we want to keep the hint, which 
> > is burdensome.
> >
> > - Haisheng
> >
> > --
> > From: Danny Chan
> > Date: 2019-10-16 14:54:46
> > To:
> > Subject: Re: [DISCUSS] Support Sql Hint for Calcite
> >
> > Thanks for the clarification.
> >
> > I understand your worry. Yes, the effort/memory would be wasted or 
> > meaningless if hints are not used. This is just what a hint does: it is a 
> > “hint” and non-mandatory, but we should give users the chance to see 
> > them; it is the user who decides whether to use the hints and how to use 
> > them. For big queries I have no confidence that we cover the corner cases. 
> > So can we mark this feature as experimental and use it for simple queries 
> > (no decorrelation) first?
> >
> > For “reversible”: during the implementation, I tried to make the 
> > modifications non-invasive to the current code. That is why I put all 
> > the interfaces for hints into one class named RelWithHint. Unlike 
> > traits, I didn’t force users to pass the hints into the RelNode 
> > constructor. I think it is not a big job if we want to remove the API.
> >
> > Best,
> > Danny Chan
> > On Oct 16, 2019, 11:14 AM +0800, Julian Hyde wrote:
> > > By “skeptical” I mean that I think we can come up with a mechanism to 
> > > copy hints when applying planner rules, but even when we have implemented 
> > > that mechanism there will be many cases where people want a hint and that 
> > > hint is not copied to the RelNode where it is needed, and many other 
> > > cases where we spend the effort/memory of copying the hint to a RelNode 
> > > and the hint is not used.
> > >
> > > By “reversible” I mean if we come up with an API that does not work, how 
> > > do we change or remove that API without people complaining?
> > >
> > > Julian
> > >
> > >
> > > > On Oct 15, 2019, at 7:11 PM, Danny Chan  wrote:
> > > >
> > > > Thanks Julian
> > > >
> > > > > I am skeptical that RelWithHint will work for large queries.
> > > >
> > > > For “skeptical” do you mean how to transfer the hints during rule 
> > > > planning ? I’m also not that confident yet.
> > > >
> > > > > How do we introduce it in a reversible way
> > > > Do you mean transforming the RelWithHint back into the SqlHint? I didn’t 
> > > > implement it in the current patch, but I think we have the ability to do 
> > > > that because we have an inheritPath for each RelWithHint: we can collect 
> > > > all the hints together and merge them into the SqlHints, then propagate 
> > > > these SqlHints to the SqlNodes.
> > > >
> > > > > What are the other options?
> > > > Do you mean the way to transfer hints during planning ? I have no other 
> > > > options yet.
> > > >
> > > > Best,
> > > > Danny Chan
> > > > On Oct 16, 2019, 8:03 AM +0800, dev@calcite.apache.org wrote:
> > > > >
> > > > > I am skeptical that RelWithHint will work for large queries.
> > >
> >
>


Re: Apply to be registered in JIRA as a contributor

2019-10-18 Thread Francis Chuang

I've added your account to the Contributor role in Jira.

Francis

On 18/10/2019 6:28 pm, Wang Yanlin wrote:

Hi, community,


Following the directions on the Calcite development page, 
https://calcite.apache.org/develop/



If you are going to take on the issue right away assign it to yourself. To 
assign issues to yourself you have to be registered in JIRA as a contributor.
In order to do that, send an email to the developers list and provide your JIRA 
username.



I want to be registered in JIRA as a contributor so that I can assign issues to 
myself.
My JIRA username is:  yanlin-Lynn, and my full name is: Wang Yanlin


--

Best,
Wang Yanlin



Apply to be registered in JIRA as a contributor

2019-10-18 Thread Wang Yanlin
Hi, community,


Following the directions on the Calcite development page, 
https://calcite.apache.org/develop/


> If you are going to take on the issue right away assign it to yourself. To 
> assign issues to yourself you have to be registered in JIRA as a contributor. 
> In order to do that, send an email to the developers list and provide your 
> JIRA username.


I want to be registered in JIRA as a contributor so that I can assign issues to 
myself.
My JIRA username is:  yanlin-Lynn, and my full name is: Wang Yanlin


--

Best,
Wang Yanlin

Re: [DISCUSS] On-demand traitset request

2019-10-18 Thread Stamatis Zampetakis
Hi Haisheng,

This is an interesting topic but somehow in my mind I thought that this
mechanism is already in place.

When an operator (logical or physical) is created its traitset is
determined in bottom-up fashion using the create
static factory method present in almost all operators. In my mind this is
in some sense the applicability function
mentioned in [1].

Now during optimization we proceed in top-down manner and we request
certain traitsets from the operators.
If they happen to already contain the requested traits, nothing needs
to be done.

In your example when we are about to create the sort-merge join we can
check what traitsets are present in the inputs
and if possible request those. Can you elaborate a bit more on why we need
a new type of metadata?

Anyway if we cannot do it at the moment it makes sense to complete the
missing bits since what you are describing
was already mentioned in the original design of the Volcano optimizer [1].

"If a move to be pursued is the exploration of a normal query processing
algorithm such as merge-join, its cost is calculated by the algorithm's
cost function. The algorithm's applicability function determines the
physical property vectors for the algorithms inputs, and their costs and
optimal plans are found by invoking FindBestPlan for the inputs. For some
binary operators, the actual physical properties of the inputs are not as
important as the consistency of physical properties among the inputs. For
example, for a sort-based implementation of intersection, i.e., an
algorithm very similar to merge-join, any sort order of the two inputs will
suffice as long as the two inputs are sorted in the same way. Similarly,
for a parallel join, any partitioning of join inputs across multiple
processing nodes is acceptable if both inputs are partitioned using
compatible partitioning rules. For these cases, the search engine permits
the optimizer implementor to specify a number of physical property vectors
to be tried. For example, for the intersection of two inputs R and S with
attributes A, B, and C where R is sorted on (A,B,C) and S is sorted on
(B,A,C), both these sort orders can be specified by the optimizer
implementor and will be optimized by the generated optimizer, while other
possible sort orders, e.g., (C,B,A), will be ignored. " [1]

Best,
Stamatis

[1]
https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/Volcano-graefe.pdf

On Fri, Oct 18, 2019 at 4:56 AM Haisheng Yuan 
wrote:

> TL;DR
> Both top-down physical TraitSet request and bottom-up TraitSet
> derivation have their strengths and weaknesses. We propose an
> on-demand TraitSet request that combines the two, to reduce
> the number of plan alternatives that are generated, especially
> in distributed systems.
>
> e.g.
> select * from foo join bar on f1=b1 and f2=b2 and f3=b3;
>
> In a non-distributed system, we can generate a sort merge join,
> requesting foo sorted by f1,f2,f3 and bar sorted by b1,b2,b3.
> But if foo happens to be sorted by f3,f2,f1, we may miss the
> chance of making use of the delivered ordering of foo, because
> if we require bar to be sorted by b3,b2,b1, we don't need to
> sort foo anymore. There are so many choices, n!, not even
> considering asc/desc and null direction. We can't request all
> the possible traitsets in a top-down way, and can't derive all the
> possible traitsets in a bottom-up way either.
>
> We propose on-demand traitset request by adding a new type
> of metadata DerivedTraitSets into the built-in metadata system.
>
> List deriveTraitSets(RelNode, RelMetadataQuery)
>
> In this metadata, every operator returns several possible traitsets
> that may be derived from this operator.
>
> Using the above query as an example, the tablescan on foo should
> return a traitset with collation on f3, f2, f1.
>
> In physical implementation rules, e.g. the SortMergeJoinRule,
> it gets the possible traitsets from both child operators, uses the join
> keys to eliminate useless traitsets, keeps the useful traitsets,
> and requests the corresponding traitset on the other child.
>
> This relies on the feature of AbstractConverter, which is turned
> off by default, due to performance issue [1].
>
> Thoughts?
>
> [1] https://issues.apache.org/jira/browse/CALCITE-2970
>
> Haisheng
>
>
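
As a rough illustration of the matching step in the proposal quoted above, here is a minimal sketch for the foo/bar example: given the ordering one join input already delivers, it derives the ordering to request from the other input. Collations are reduced to plain lists of column names (direction and null ordering are ignored), and OnDemandCollationSketch and requestForOtherSide are hypothetical names; none of this is Calcite API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration of on-demand collation matching; not Calcite code. */
final class OnDemandCollationSketch {
  /**
   * If the delivered ordering starts with a permutation of this side's join
   * keys, returns the same permutation expressed in the other side's keys.
   * thisSideKeys.get(i) is assumed to be joined with otherSideKeys.get(i).
   */
  static Optional<List<String>> requestForOtherSide(
      List<String> delivered, List<String> thisSideKeys, List<String> otherSideKeys) {
    int n = thisSideKeys.size();
    if (delivered.size() < n || otherSideKeys.size() != n) {
      return Optional.empty();
    }
    List<String> prefix = delivered.subList(0, n);
    if (!new HashSet<>(prefix).equals(new HashSet<>(thisSideKeys))) {
      return Optional.empty();  // delivered ordering is useless for this join
    }
    List<String> request = new ArrayList<>();
    for (String key : prefix) {
      // Translate each delivered key to its join partner on the other side.
      request.add(otherSideKeys.get(thisSideKeys.indexOf(key)));
    }
    return Optional.of(request);
  }

  public static void main(String[] args) {
    // foo is already sorted by f3,f2,f1; which ordering should bar provide?
    System.out.println(requestForOtherSide(
        Arrays.asList("f3", "f2", "f1"),
        Arrays.asList("f1", "f2", "f3"),
        Arrays.asList("b1", "b2", "b3")));
  }
}
```

Only the one ordering that foo actually delivers is translated into a request on bar, so none of the other n! permutations has to be enumerated.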


Re: Apache Calcite meetup group

2019-10-18 Thread Jesus Camacho Rodriguez
Absolutely!

Thanks,
Jesús

On Thu, Oct 17, 2019 at 3:46 PM Julian Hyde  wrote:

> I would be delighted if you would do that - thank you!
>
> If you are organizing a meet up, please consult with this list. There may
> be people who would be willing to speak and have something interesting to
> say.
>
> Julian
>
> > On Oct 16, 2019, at 3:14 PM, Jesus Camacho Rodriguez <jcama...@apache.org> wrote:
> >
> > Hi Julian,
> >
> > I have just seen your message. Although we have other ways to communicate,
> > I believe it may be valuable to keep the group even if a meetup has not
> > happened for a while (we may organize some meetups in the future, those
> > interested in Calcite may be subscribed to group to attend talks around the
> > project even if they do not follow the project closely through mailing
> > list, etc.). I would be happy to pay the fees to keep it. I have just
> > checked and I think I could simply go ahead and pay, but let me know if I
> > need to do anything else.
> >
> > Thanks,
> > Jesús
> >
> >
> > On Wed, Oct 16, 2019 at 11:21 AM Julian Hyde  wrote:
> >
> >> If you’re a member of the Apache Calcite meetup group[1], you probably
> >> just received an email saying that the group is shutting down. I set it up
> >> a few years ago, but I never find time to organize meetups, so I decided to
> >> stop paying the annual fee to meetup.com.
> >>
> >> I’m not particularly sad that it’s closing down, given that it has been
> >> inactive, and we as a community seem to find other ways to talk to each
> >> other. But if someone in the community would like to organize some meetups
> >> and is prepared to pay the fees, I’m happy to hand over the reins.
> >>
> >> Julian
> >>
> >> [1] https://www.meetup.com/Apache-Calcite/
>
>