Re: Assigning jira issue

2019-08-08 Thread Francis Chuang

Hi Xing Jin,

I've added you to the contributor role.

Francis

On 9/08/2019 12:52 pm, XING JIN wrote:

Hi Francis,
I have the same request with Sahith.
Could you please add me to the contributor list as well ?
My user name is -- jin xing
Thanks a lot !

Francis Chuang  于2019年8月9日周五 上午5:58写道:


Hey Sahith,

I've added you to the contributor role and assigned the issue to you.

Francis

On 8/08/2019 11:54 pm, sahith.re...@gmail.com wrote:

Hello,

I created issue CALCITE-3237,

https://issues.apache.org/jira/browse/CALCITE-3237, and I submitted a PR
for it as well. Can this issue be assigned to me? My jira username is
snallapa. Thank you!


Thanks,

Sahith







Re: Assigning jira issue

2019-08-08 Thread XING JIN
Hi Francis,
I have the same request with Sahith.
Could you please add me to the contributor list as well ?
My user name is -- jin xing
Thanks a lot !

Francis Chuang  于2019年8月9日周五 上午5:58写道:

> Hey Sahith,
>
> I've added you to the contributor role and assigned the issue to you.
>
> Francis
>
> On 8/08/2019 11:54 pm, sahith.re...@gmail.com wrote:
> > Hello,
> >
> > I created issue CALCITE-3237,
> https://issues.apache.org/jira/browse/CALCITE-3237, and I submitted a PR
> for it as well. Can this issue be assigned to me? My jira username is
> snallapa. Thank you!
> >
> > Thanks,
> >
> > Sahith
> >
>


Re: Assigning jira issue

2019-08-08 Thread Francis Chuang

Hey Sahith,

I've added you to the contributor role and assigned the issue to you.

Francis

On 8/08/2019 11:54 pm, sahith.re...@gmail.com wrote:

Hello,

I created issue CALCITE-3237, 
https://issues.apache.org/jira/browse/CALCITE-3237, and I submitted a PR for it 
as well. Can this issue be assigned to me? My jira username is snallapa. Thank 
you!

Thanks,

Sahith



Re: Optimization of join between a small table and a large table

2019-08-08 Thread Julian Hyde
When you implement an equi-join as a nested loops join, the right-hand side 
always has a filter combining the variable set by the left-hand side and the 
join column on the right-hand side.

You will need to do a full scan of the right-hand table every time, unless your 
table is organized so that you can access a subset that matches the filter.

In your example

 select emp.name
   from emp join dept on emp.deptid = dept.id
   where dept.name = ‘Sales'

The simple plan would be

 Correlate (v := deptno)
   Filter name = ’Sales'
 Scan dept
   Filter deptno = $v
 Scan emp

If “emp” was physically ordered by deptno, the plan could be optimized to

 Correlate (v := deptno)
   Filter name = ’Sales'
 Scan dept
   RangeScan emp deptno = $v



> On Aug 7, 2019, at 11:41 PM, Gabriel Reid  wrote:
> 
> Hi Stamatis,
> 
> Thank you so much, this is exactly the kind of info I was looking for! I
> think I can figure out how to go forward based on this, thanks again.
> 
> - Gabriel
> 
> On Thu, Aug 8, 2019 at 8:19 AM Stamatis Zampetakis 
> wrote:
> 
>> Hi Gabriel,
>> 
>> What you want indeed seems to be a nested loop join; there has been a
>> relevant discussion in the dev list [1].
>> You may also find relevant the ongoing discussion on CALCITE-2979 [2].
>> 
>> Best,
>> Stamatis
>> 
>> [1]
>> 
>> https://lists.apache.org/thread.html/d9f95683e66009872a53e7e617295158b98746b550d2bf68230b3096@%3Cdev.calcite.apache.org%3E
>> 
>> [2] https://issues.apache.org/jira/browse/CALCITE-2979
>> 
>> On Wed, Aug 7, 2019 at 2:56 PM Gabriel Reid 
>> wrote:
>> 
>>> Hi,
>>> 
>>> I'm currently working on a custom Calcite adapter, and I've got a
>> situation
>>> where I want to join a small table with a large table, basically just
>> using
>>> the small table to filter the large table. Conceptually, I'm talking
>> about
>>> something that is equivalent to the following query:
>>> 
>>>select emp.name
>>>from emp join dept on emp.deptid = dept.id
>>>where dept.name = 'Sales'
>>> 
>>> I've got converter rules which to push down filtering on all tables,
>> which
>>> make a big difference in performance. The above query current results in
>> an
>>> EnumerableHashJoin over a filtered scan over the 'dept' table, and an
>>> unfiltered scan over the 'emp' table.
>>> 
>>> What I would like to accomplish is that this is converted into a (I
>> think)
>>> a nested loop join between 'dept' and emp, so that the filtered scan is
>>> done once over 'dept', and then a filtered scan is done for each entry of
>>> the 'dept' table using the 'id' value from that entry as a filter value
>> on
>>> the scan on the 'emp' table.
>>> 
>>> I've been able to get a nested loop join to be used, but I haven't
>> managed
>>> to have the 'id' values from the 'dept' table to be used to filter the
>>> 'emp' table. Instead, the full 'emp' table is scanned for each iteration
>> of
>>> the 'dept' table.
>>> 
>>> And now my questions: are my hopes/expectations here realistic, and/or is
>>> there a much better/easier way of accomplishing the same thing? Is there
>>> prior art somewhere within the Calcite code base or elsewhere? Or should
>>> this just be working by default?
>>> 
>>> I would assume that this isn't that unusual of a situation, which is why
>> I
>>> was expecting that there would already be something like this somewhere
>> (or
>>> I'm doing the wrong thing), but I haven't managed to find any clear
>>> pointers in any one direction yet.
>>> 
>>> Thanks in advance for any advice!
>>> 
>>> - Gabriel
>>> 
>> 



Assigning jira issue

2019-08-08 Thread sahith . reddy
Hello,

I created issue CALCITE-3237, 
https://issues.apache.org/jira/browse/CALCITE-3237, and I submitted a PR for it 
as well. Can this issue be assigned to me? My jira username is snallapa. Thank 
you!

Thanks,

Sahith 

Filterable table

2019-08-08 Thread Lekshmi
Hi,
  Can anyone explain to me what is the advantage of Filterable Table? How
can we efficiently make use of that in a Federated system?


Thanks and Regards

Lekshmi B.G
Email: lekshmib...@gmail.com


[jira] [Created] (CALCITE-3237) ExpressionWriter not updating indent correctly

2019-08-08 Thread Sahith Nallapareddy (JIRA)
Sahith Nallapareddy created CALCITE-3237:


 Summary: ExpressionWriter not updating indent correctly
 Key: CALCITE-3237
 URL: https://issues.apache.org/jira/browse/CALCITE-3237
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.21.0
Reporter: Sahith Nallapareddy


ExpressionWriter does not seem to be updating the indent correctly as I have 
ran into errors like this

 
{noformat}
java.lang.IndexOutOfBoundsException: Index: 20, Size: 20

at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.begin(ExpressionWriter.java:78)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.begin(ExpressionWriter.java:101)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.list(ExpressionWriter.java:161)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.NewExpression.accept(NewExpression.java:60)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.TernaryExpression.accept(TernaryExpression.java:61)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.GotoStatement.accept0(GotoStatement.java:84)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.Statement.accept(Statement.java:32)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.BlockStatement.accept0(BlockStatement.java:75)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.append(ExpressionWriter.java:129)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ConditionalStatement.accept0(ConditionalStatement.java:62)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.Statement.accept(Statement.java:32)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.BlockStatement.accept0(BlockStatement.java:75)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.append(ExpressionWriter.java:129)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.MethodDeclaration.accept(MethodDeclaration.java:73)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.list(ExpressionWriter.java:167)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.NewExpression.accept(NewExpression.java:62)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.TernaryExpression.accept(TernaryExpression.java:61)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.GotoStatement.accept0(GotoStatement.java:84)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.Statement.accept(Statement.java:32)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.BlockStatement.accept0(BlockStatement.java:75)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.append(ExpressionWriter.java:129)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.MethodDeclaration.accept(MethodDeclaration.java:73)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.list(ExpressionWriter.java:167)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.NewExpression.accept(NewExpression.java:62)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.TernaryExpression.accept(TernaryExpression.java:61)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.GotoStatement.accept0(GotoStatement.java:84)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.Statement.accept(Statement.java:32)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.BlockStatement.accept0(BlockStatement.java:75)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.append(ExpressionWriter.java:129)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ConditionalStatement.accept0(ConditionalStatement.java:62)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.Statement.accept(Statement.java:32)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.BlockStatement.accept0(BlockStatement.java:75)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.append(ExpressionWriter.java:129)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.MethodDeclaration.accept(MethodDeclaration.java:73)
at 
org.apache.beam.repackaged.sql.org.apache.calcite.linq4j.tree.ExpressionWriter.list(ExpressionWriter.java:167)
at 

Re: Optimization of join between a small table and a large table

2019-08-08 Thread Gabriel Reid
Hi Stamatis,

Thank you so much, this is exactly the kind of info I was looking for! I
think I can figure out how to go forward based on this, thanks again.

- Gabriel

On Thu, Aug 8, 2019 at 8:19 AM Stamatis Zampetakis 
wrote:

> Hi Gabriel,
>
> What you want indeed seems to be a nested loop join; there has been a
> relevant discussion in the dev list [1].
> You may also find relevant the ongoing discussion on CALCITE-2979 [2].
>
> Best,
> Stamatis
>
> [1]
>
> https://lists.apache.org/thread.html/d9f95683e66009872a53e7e617295158b98746b550d2bf68230b3096@%3Cdev.calcite.apache.org%3E
>
> [2] https://issues.apache.org/jira/browse/CALCITE-2979
>
> On Wed, Aug 7, 2019 at 2:56 PM Gabriel Reid 
> wrote:
>
> > Hi,
> >
> > I'm currently working on a custom Calcite adapter, and I've got a
> situation
> > where I want to join a small table with a large table, basically just
> using
> > the small table to filter the large table. Conceptually, I'm talking
> about
> > something that is equivalent to the following query:
> >
> > select emp.name
> > from emp join dept on emp.deptid = dept.id
> > where dept.name = 'Sales'
> >
> > I've got converter rules which to push down filtering on all tables,
> which
> > make a big difference in performance. The above query current results in
> an
> > EnumerableHashJoin over a filtered scan over the 'dept' table, and an
> > unfiltered scan over the 'emp' table.
> >
> > What I would like to accomplish is that this is converted into a (I
> think)
> > a nested loop join between 'dept' and emp, so that the filtered scan is
> > done once over 'dept', and then a filtered scan is done for each entry of
> > the 'dept' table using the 'id' value from that entry as a filter value
> on
> > the scan on the 'emp' table.
> >
> > I've been able to get a nested loop join to be used, but I haven't
> managed
> > to have the 'id' values from the 'dept' table to be used to filter the
> > 'emp' table. Instead, the full 'emp' table is scanned for each iteration
> of
> > the 'dept' table.
> >
> > And now my questions: are my hopes/expectations here realistic, and/or is
> > there a much better/easier way of accomplishing the same thing? Is there
> > prior art somewhere within the Calcite code base or elsewhere? Or should
> > this just be working by default?
> >
> > I would assume that this isn't that unusual of a situation, which is why
> I
> > was expecting that there would already be something like this somewhere
> (or
> > I'm doing the wrong thing), but I haven't managed to find any clear
> > pointers in any one direction yet.
> >
> > Thanks in advance for any advice!
> >
> > - Gabriel
> >
>


Re: Optimization of join between a small table and a large table

2019-08-08 Thread Stamatis Zampetakis
Hi Gabriel,

What you want indeed seems to be a nested loop join; there has been a
relevant discussion in the dev list [1].
You may also find relevant the ongoing discussion on CALCITE-2979 [2].

Best,
Stamatis

[1]
https://lists.apache.org/thread.html/d9f95683e66009872a53e7e617295158b98746b550d2bf68230b3096@%3Cdev.calcite.apache.org%3E

[2] https://issues.apache.org/jira/browse/CALCITE-2979

On Wed, Aug 7, 2019 at 2:56 PM Gabriel Reid  wrote:

> Hi,
>
> I'm currently working on a custom Calcite adapter, and I've got a situation
> where I want to join a small table with a large table, basically just using
> the small table to filter the large table. Conceptually, I'm talking about
> something that is equivalent to the following query:
>
> select emp.name
> from emp join dept on emp.deptid = dept.id
> where dept.name = 'Sales'
>
> I've got converter rules which to push down filtering on all tables, which
> make a big difference in performance. The above query current results in an
> EnumerableHashJoin over a filtered scan over the 'dept' table, and an
> unfiltered scan over the 'emp' table.
>
> What I would like to accomplish is that this is converted into a (I think)
> a nested loop join between 'dept' and emp, so that the filtered scan is
> done once over 'dept', and then a filtered scan is done for each entry of
> the 'dept' table using the 'id' value from that entry as a filter value on
> the scan on the 'emp' table.
>
> I've been able to get a nested loop join to be used, but I haven't managed
> to have the 'id' values from the 'dept' table to be used to filter the
> 'emp' table. Instead, the full 'emp' table is scanned for each iteration of
> the 'dept' table.
>
> And now my questions: are my hopes/expectations here realistic, and/or is
> there a much better/easier way of accomplishing the same thing? Is there
> prior art somewhere within the Calcite code base or elsewhere? Or should
> this just be working by default?
>
> I would assume that this isn't that unusual of a situation, which is why I
> was expecting that there would already be something like this somewhere (or
> I'm doing the wrong thing), but I haven't managed to find any clear
> pointers in any one direction yet.
>
> Thanks in advance for any advice!
>
> - Gabriel
>