Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic

2019-10-08 Thread Julian Hyde
There’s not very much difference between supporting IN in RexCalls vs. 
supporting $HARD_IN in RexCalls. The distinction is important to me because I 
think it’s important to have a “core” set of operators, and I don’t want to add 
IN to that core.  Stamatis feels differently, and I respect that.

One thing that makes me think that IN does not belong in “core” is that there 
is a generalization: range sets. With IN we can represent “x = 1 or x = 3 or x = 
5”, but with range sets we can also represent “x = 1 or x between 3 and 5 or x > 
100”. We make heavy use of range sets in DateRangeRules.

Julian
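
[Editor's sketch] To make the IN versus range-set comparison concrete, here is a minimal illustration in plain Java. The Range class and the helper methods are hypothetical stand-ins written for this note; they are not Calcite classes and not the representation used by DateRangeRules. The sketch only shows that an IN list is the special case of a range set in which every range is a single point.

```java
import java.util.Arrays;
import java.util.List;

final class RangeSetSketch {
  /** Hypothetical interval over ints; a null bound means "unbounded". */
  static final class Range {
    final Integer lower;
    final boolean lowerInclusive;
    final Integer upper;
    final boolean upperInclusive;

    Range(Integer lower, boolean lowerInclusive,
        Integer upper, boolean upperInclusive) {
      this.lower = lower;
      this.lowerInclusive = lowerInclusive;
      this.upper = upper;
      this.upperInclusive = upperInclusive;
    }

    /** x = v */
    static Range point(int v) {
      return new Range(v, true, v, true);
    }

    /** x between lo and hi (both ends included) */
    static Range closed(int lo, int hi) {
      return new Range(lo, true, hi, true);
    }

    /** x > lo */
    static Range greaterThan(int lo) {
      return new Range(lo, false, null, false);
    }

    boolean contains(int x) {
      if (lower != null && (lowerInclusive ? x < lower : x <= lower)) {
        return false;
      }
      if (upper != null && (upperInclusive ? x > upper : x >= upper)) {
        return false;
      }
      return true;
    }
  }

  /** A range set matches if any of its ranges matches. */
  static boolean contains(List<Range> rangeSet, int x) {
    for (Range r : rangeSet) {
      if (r.contains(x)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // "x = 1 or x = 3 or x = 5": an IN list is a range set of points.
    List<Range> in =
        Arrays.asList(Range.point(1), Range.point(3), Range.point(5));
    // "x = 1 or x between 3 and 5 or x > 100": not expressible as an IN list.
    List<Range> ranges = Arrays.asList(
        Range.point(1), Range.closed(3, 5), Range.greaterThan(100));

    System.out.println(contains(in, 4));       // false
    System.out.println(contains(ranges, 4));   // true
    System.out.println(contains(ranges, 100)); // false
    System.out.println(contains(ranges, 101)); // true
  }
}
```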


> On Oct 8, 2019, at 2:47 PM, Xiening Dai  wrote:
> 
> In my opinion, we will need both: support for the IN operation (either through an 
> operator or an internal function) and support for building a balanced 
> tree. It’s always good to be resilient and capable of handling edge cases. 
> The IN support might require more work. Haisheng’s proposal is a practical 
> solution to the current issue.
> 
>> On Oct 8, 2019, at 11:06 AM, Haisheng Yuan  wrote:
>> 
>> Adding an IN RexNode only partially solves the problem, as it still masks 
>> the underlying issue. The fundamental reason for the stack overflow lies in 
>> the left-deep binary tree. For queries that have tens of thousands of OR 
>> conditions that are not equality tests, which is not uncommon in our case, e.g.
>> (a like '...') or (b like '...') or (c like '..')
>> there will still be a stack overflow. 
>> 
>> - Haisheng
>> 
>> --
>> From: Stamatis Zampetakis
>> Date: 2019-10-08 15:09:01
>> To:
>> Subject: Re: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic
>> 
>> It might be better to add a proper IN operator in RexCalls instead of
>> something internal that does more or less the same thing.
>> It is true that this adds more paths in the code and thus requires some
>> additional dev and further support but I think it is worth it.
>> Many people so far expressed an interest to work on various cases involving
>> an IN operator so it might not be long before
>> we have full support for the IN operator.
>> 
>> SqlToRelConverter can still decide to expand or not based on some criterion
>> or property.
>> 
>> 
>> On Tue, Oct 8, 2019 at 3:37 AM Julian Hyde  wrote:
>> 
>>> A SqlCall to $HARD_IN will (by SqlToRelConverter) become a RexCall to
>>> $HARD_IN, and then (by RelToSqlConverter) become a SqlCall to
>>> $HARD_IN. $HARD_IN(x, v1, v2) would become (by SqlWriter) the SQL "x
>>> IN (v1, v2)".
>>> 
>>> At any point in this lifecycle, you could intercept and simplify.
>>> 
>>> On Mon, Oct 7, 2019 at 2:34 PM Haisheng Yuan 
>>> wrote:
 
>>>> Will the filter condition with the “$HARD_IN” internal function be able to
>>> be pushed down and be recognized by the source SQL system, like Peter
>>> mentioned?
>>>>
>>>> If not, we have to translate the internal function back to IN during
>>> the Rel2Sql phase. Otherwise, the data read from the source table can be much
>>> larger.
>>>>
>>>> - Haisheng
>>>>
>>>> --
>>>> From: Julian Hyde
>>>> Date: 2019-10-08 04:53:11
>>>> To: dev
>>>> Subject: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic
>>>>
>>>> In
>>> https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16946209
>>> <
>>> https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16946209>
>>> I floated the idea of a “$HARD_IN” internal function that has the same
>>> semantics as IN but is not expanded to ‘… = OR … = …’.
>>>>
>>>> I think it would be a useful tool, if used judiciously.
>>>>
>>>> Julian
>>>>
>>>>
>>>>> On Oct 4, 2019, at 7:08 PM, Haisheng Yuan
>>> wrote:
>>>>>
>>>>> As a workaround, you can modify your SqlRexConvertlet to create a RexCall
>>> with a balanced binary tree, e.g. (a=1 or a=2) or (a=3 or a=4), instead of a
>>> flat RexCall with multiple operands, e.g. a=1 or a=2 or a=3 or a=4.
>>>>> Because every OR RexCall has exactly 2 operands, it won't be transformed
>>> into a SqlCall with a left-deep tree.
>>>>>
>>>>> Let me know whether it works for you.
>>>>>
>>>>> - Haisheng
>>>>>
>>>>> --
>>>>> From: Haisheng Yuan
>>>>> Date: 2019-10-05 07:37:04
>>>>> To: Peter Wicks (pwicks); dev@calcite.apache.org<
>>> dev@calcite.apache.org>
>>>>> Subject: Re: RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion
>>> Logic
>>>>>
>>>>> If you want to push the filter down to the source SQL system, then
>>> transforming to a join won't help you either.
>>>>>
>>>>> The reason for the stack overflow with large ORs is the left-deep binary
>>> tree; we need to change it to a balanced binary tree to reduce the depth of
>>> the call.
>>>>>
>>>>> I will open a pull request later.
>>>>>
>>>>> - 

Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic

2019-10-08 Thread Xiening Dai
In my opinion, we will need both: support for the IN operation (either through an 
operator or an internal function) and support for building a balanced 
tree. It’s always good to be resilient and capable of handling edge cases. The 
IN support might require more work. Haisheng’s proposal is a practical solution 
to the current issue.

> On Oct 8, 2019, at 11:06 AM, Haisheng Yuan  wrote:
> 
> Adding an IN RexNode only partially solves the problem, as it still masks 
> the underlying issue. The fundamental reason for the stack overflow lies in 
> the left-deep binary tree. For queries that have tens of thousands of OR 
> conditions that are not equality tests, which is not uncommon in our case, e.g.
> (a like '...') or (b like '...') or (c like '..')
> there will still be a stack overflow. 
> 
> - Haisheng
> 
> --
> From: Stamatis Zampetakis
> Date: 2019-10-08 15:09:01
> To:
> Subject: Re: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic
> 
> It might be better to add a proper IN operator in RexCalls instead of
> something internal that does more or less the same thing.
> It is true that this adds more paths in the code and thus requires some
> additional dev and further support but I think it is worth it.
> Many people so far expressed an interest to work on various cases involving
> an IN operator so it might not be long before
> we have full support for the IN operator.
> 
> SqlToRelConverter can still decide to expand or not based on some criterion
> or property.
> 
> 
> On Tue, Oct 8, 2019 at 3:37 AM Julian Hyde  wrote:
> 
>> A SqlCall to $HARD_IN will (by SqlToRelConverter) become a RexCall to
>> $HARD_IN, and then (by RelToSqlConverter) become a SqlCall to
>> $HARD_IN. $HARD_IN(x, v1, v2) would become (by SqlWriter) the SQL "x
>> IN (v1, v2)".
>> 
>> At any point in this lifecycle, you could intercept and simplify.
>> 
>> On Mon, Oct 7, 2019 at 2:34 PM Haisheng Yuan 
>> wrote:
>>> 
>>> Will the filter condition with the “$HARD_IN” internal function be able to
>> be pushed down and be recognized by the source SQL system, like Peter
>> mentioned?
>>> 
>>> If not, we have to translate the internal function back to IN during
>> the Rel2Sql phase. Otherwise, the data read from the source table can be much
>> larger.
>>> 
>>> - Haisheng
>>> 
>>> --
>>> From: Julian Hyde
>>> Date: 2019-10-08 04:53:11
>>> To: dev
>>> Subject: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic
>>> 
>>> In
>> https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16946209
>> <
>> https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16946209>
>> I floated the idea of a “$HARD_IN” internal function that has the same
>> semantics as IN but is not expanded to ‘… = OR … = …’.
>>> 
>>> I think it would be a useful tool, if used judiciously.
>>> 
>>> Julian
>>> 
>>> 
>>>> On Oct 4, 2019, at 7:08 PM, Haisheng Yuan
>> wrote:
>>>>
>>>> As a workaround, you can modify your SqlRexConvertlet to create a RexCall
>> with a balanced binary tree, e.g. (a=1 or a=2) or (a=3 or a=4), instead of a
>> flat RexCall with multiple operands, e.g. a=1 or a=2 or a=3 or a=4.
>>>> Because every OR RexCall has exactly 2 operands, it won't be transformed
>> into a SqlCall with a left-deep tree.
>>>>
>>>> Let me know whether it works for you.
>>>>
>>>> - Haisheng
>>>>
>>>> --
>>>> From: Haisheng Yuan
>>>> Date: 2019-10-05 07:37:04
>>>> To: Peter Wicks (pwicks); dev@calcite.apache.org<
>> dev@calcite.apache.org>
>>>> Subject: Re: RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion
>> Logic
>>>>
>>>> If you want to push the filter down to the source SQL system, then
>> transforming to a join won't help you either.
>>>>
>>>> The reason for the stack overflow with large ORs is the left-deep binary
>> tree; we need to change it to a balanced binary tree to reduce the depth of
>> the call.
>>>>
>>>> I will open a pull request later.
>>>>
>>>> - Haisheng
>>>>
>>>> --
>>>> From: Peter Wicks (pwicks)
>>>> Date: 2019-10-04 21:32:25
>>>> To: dev@calcite.apache.org
>>>> Subject: RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion
>> Logic
>>>>
>>>> Zoltan,
>>>>
>>>> Thanks for the suggestion. I actually tried doing a UDF first, and it
>> was also successful, sorry for not sharing those details earlier.
>>>> The problem with the UDF is that the predicates are not pushed down to
>> the source SQL system (by design), and this can result in a 100x increase
>> in the amount of data returned from the database. This data will be
>> correctly filtered by the UDF, but returning 100x the data makes it a lot
>> slower. So I was trying to push it down to the 

Re: Ignite community is building Calcite-based prototype

2019-10-08 Thread Denis Magda
(Looping in the dev lists as suggested by Julian.)

Julian, Calcite community, let me know what you think about this option.
We're planning to host an Apache Ignite meetup with our group [1]
on November 13th or 14th in the San Francisco Bay Area. The desired topic
is the new Ignite SQL engine based on Calcite. Alex Goncharuk will be
representing the Ignite community doing the presentation.

Is there anybody from the Calcite community (Julian or mister X) who can
join the meetup and do a Calcite-intro talk? Alex will be presenting next.
We would also invite the members of the Calcite Bay Area Group [2]. The Ignite
community will take care of all organizational hurdles.

[1] https://www.meetup.com/Bay-Area-In-Memory-Computing/
[2] https://www.meetup.com/Apache-Calcite/

-
Denis


On Mon, Oct 7, 2019 at 1:07 PM Julian Hyde  wrote:

> I’m just about to let that meetup group lapse. I have never had the time &
> energy to organize meetups.
>
> Maybe we could do an online meeting, that the community could attend?
> (Subject to the limits set by Zoom or Hangouts or whatever.) Feel free to
> suggest something on the dev list.
>
> Julian
>
>
> On Oct 3, 2019, at 4:16 PM, Denis Magda  wrote:
>
> Julian,
>
> I just found out that you run an Apache Calcite meetup in Palo Alto:
> https://www.meetup.com/Apache-Calcite/
>
> What do you think about scheduling a meetup in mid-November? I'm
> based in Silicon Valley and Alex Goncharuk (one of Ignite veterans and
> architects) is visiting that month. Alex and I can talk about the reasons
> why Calcite was selected by Ignite community, covering our architecture of
> today and how it's planned to be changed with Calcite. Plus, the group can
> give valuable feedback.
>
> -
> Denis
>
>
> -- Forwarded message -
> From: Denis Magda 
> Date: Wed, Oct 2, 2019 at 3:32 PM
> Subject: Re: Ignite community is building Calcite-based prototype
> To: dev 
> Cc: dev , dev 
>
>
> Julian,
>
> Thanks a lot for the references and guidance! I do believe that from now
> on our community guys will become frequent visitors of yours ;)
>
> -
> Denis
>
>
> On Wed, Oct 2, 2019 at 12:40 PM Julian Hyde  wrote:
>
>> Denis,
>>
>> I’ve been a follower and admirer of Ignite for several years, so I am
>> delighted that you are considering Calcite.
>>
>> Ask questions on the dev list, log JIRA cases if you find them, and we’ll
>> do our best to help.
>>
>> I’d like to bring to your attention RelBuilder. Some people want to go
>> from SQL text to executable plan, but others want to drop in at the
>> relational algebra level, and RelBuilder is a convenient interface for the
>> latter.
>>
>> (The other) Julian
>>
>>
>> > On Oct 1, 2019, at 3:43 PM, Denis Magda  wrote:
>> >
>> > Hi Julian,
>> >
>> > Nice to e-meet you and thanks for being ready to help! Hopefully, the
>> > Ignite community will be able to contribute valuable changes back to
>> > Calcite as part of this activity - "pay good for good" :)
>> >
>> > You are right that distributed computing, massive-parallel processing,
>> and
>> > calculations/querying at scale is what Ignite is targeted for. However,
>> > while Drill is designed for analytics and IoTDB is for time-series,
>> Ignite
>> > is primarily used for OLTP with an increasing number of real-time
>> analytics
>> > use cases (no adhoc).
>> >
>> > Let's stay in touch!
>> >
>> > -
>> > Denis
>> >
>> >
>> > On Tue, Oct 1, 2019 at 6:42 AM Julian Feinauer <
>> j.feina...@pragmaticminds.de>
>> > wrote:
>> >
>> >> Hi Igor,
>> >>
>> >> I agree that it should be rather similar to what Drill did as
>> distributed
>> >> computing also is a big concern for Ignite, I guess, right?
>> >>
>> >> Julian
>> >>
>> >> On 01.10.19, 15:06, "Seliverstov Igor" wrote:
>> >>
>> >>Guys,
>> >>
>> >>The better link:
>> >>
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine
>> >> <
>> >>
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-37:+New+query+execution+engine
>> >>>
>> >>
>> >>Almost everything you may see by the link is the same as Drill guys
>> >> already did, the difference is in details but the idea is the same.
>> >>
>> >>Of course we’ll face many issues during development and I'll
>> appreciate
>> >> if some of you assist us.
>> >>
>> >>Regards,
>> >>Igor
>> >>
>> >>> On Oct 1, 2019, at 12:32, Julian Feinauer <
>> >> j.feina...@pragmaticminds.de> wrote:
>> >>>
>> >>> Hi Denis,
>> >>>
>> >>> Nice to hear from you and the ignite team... that sounds like an
>> >> excellent idea. I liked the idea of Ignite since I heard about it (I
>> think
>> >> when it became TLP back then). So I would be happy to help you if you
>> have
>> >> specific questions... I‘m currently working on a related topic, namely
>> >> integrate calcite as SQL Layer into Apache IoTDB .
>> >>>
>> >>> Best
>> >>> Julian
>> >>>
>> >>> Get Outlook for iOS
>> >>> 
>> >>> From: Denis Magda
>> >>> Sent: 

Re: [DISCUSS] Make Enumerable operators responsive to interrupts

2019-10-08 Thread Julian Hyde
Is there a possibility that data structures will be corrupted, if a thread is 
interrupted in the middle of an operation?

Supposing that we allow resume, is it possible to safely resume after an 
interrupt?

Supposing that we do not allow resume, and instead call close on the root 
Enumerable, is it possible to guarantee each Enumerator cleans up after itself?

Is there a period during the lifecycle of a tree of Enumerable objects (e.g. 
initialization, tear down) where we do not allow interrupts?

How would we test this?

Julian
 

> On Oct 8, 2019, at 10:48 AM, Haisheng Yuan  wrote:
> 
> Makes sense and seems quite reasonable.
> 
> - Haisheng
> 
> --
> From: Stamatis Zampetakis
> Date: 2019-10-08 18:04:17
> To:
> Subject: [DISCUSS] Make Enumerable operators responsive to interrupts
> 
> Hello,
> 
> There are many use-cases which require stopping/cancelling the execution of
> a query for various reasons. Currently, this can be done by launching the
> query in a separate thread and then setting
> DataContext.Variable.CANCEL_FLAG [1] accordingly.
> 
> However, if the thread executing the query gets interrupted through the usual
> Thread.interrupt() mechanism the query execution will not stop since the
> operators are not responsive to interruption.
> 
> How do you feel about making Enumerable operators responsive to interrupts?
> 
> Best,
> Stamatis
> 
> [1]
> https://github.com/apache/calcite/blob/3f54108b7dcd4d2b89fc42faab145e2f82883791/core/src/main/java/org/apache/calcite/DataContext.java#L87
> 



Re: Community and News pages "not secure", according to Chrome

2019-10-08 Thread Michael Mior
Your PR is a superset of the changes I made. Links to insecure pages
will not cause the problem that Julian experienced, only resources
which are requested when rendering the page (images, CSS, JS, etc.). I
still think we should fix any HTTPS links. Have you verified that all
the sites whose links you replaced actually support HTTPS? Just want
to make sure we don't end up with anything broken, but I think we
should merge your PR as well. Thanks!
--
Michael Mior
mm...@uwaterloo.ca
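
[Editor's sketch] The distinction drawn above, that only resources fetched while rendering the page trigger the warning while plain links do not, can be illustrated with a small checker. This is a rough regex-based approximation written for this note and is not the tooling used for the Calcite site; the class and method names are hypothetical, the example.org URL is a placeholder, and a real check would use an HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MixedContentSketch {
  // Matches http:// URLs in the src attribute of <img>/<script> tags and in
  // the href attribute of <link> tags. Anchors (<a href=...>) are ignored on
  // purpose: the browser does not fetch them while rendering the page.
  private static final Pattern INSECURE_RESOURCE = Pattern.compile(
      "<(?:(?:img|script)\\b[^>]*?\\bsrc|link\\b[^>]*?\\bhref)"
          + "\\s*=\\s*[\"'](http://[^\"']+)[\"']",
      Pattern.CASE_INSENSITIVE);

  /** Returns the insecure URLs that the page would load while rendering. */
  static List<String> insecureResources(String html) {
    List<String> urls = new ArrayList<>();
    Matcher m = INSECURE_RESOURCE.matcher(html);
    while (m.find()) {
      urls.add(m.group(1));
    }
    return urls;
  }

  public static void main(String[] args) {
    String html = "<a href=\"http://example.org/\">a plain link</a>"
        + "<img src=\"http://github.com/zabetak.png\">";
    // Only the image is reported; the anchor is not.
    System.out.println(insecureResources(html));
    // prints [http://github.com/zabetak.png]
  }
}
```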

On Tue, Oct 8, 2019 at 13:16, Stamatis Zampetakis wrote:
>
> I filed CALCITE-3391 and its respective PR this morning.
>
> @Michael: If your changes are sufficient then feel free to close the
> respective JIRA/PR.
>
> Best,
> Stamatis
>
> On Tue, Oct 8, 2019, 6:49 PM Michael Mior  wrote:
>
> > Pushed and this seems to have resolved the issue for me.
> > --
> > Michael Mior
> > mm...@apache.org
> >
> > On Tue, Oct 8, 2019 at 12:39, Michael Mior wrote:
> > >
> > > Most of the warnings I see are like the below. Switching this to HTTPS
> > > should solve most of the issues. I'll make the change.
> > >
> > > http://github.com/zabetak.png
> > >
> > > --
> > > Michael Mior
> > > mm...@apache.org
> > >
> > > On Mon, Oct 7, 2019 at 15:31, Julian Hyde wrote:
> > > >
> > > > When I open the https://calcite.apache.org/news/ or
> > > > https://calcite.apache.org/community/ pages in Chrome, I get an (i) to
> > > > the left of the URL bar with the message “Your connection to this site is
> > > > not fully secure” and a warning on the right side of the URL bar with
> > > > “Cookies blocked”.
> > > >
> > > > Other pages on the site seem to be OK - they display a padlock rather
> > > > than an (i) at the left of the URL, indicating that the site is secure.
> > > >
> > > > Any idea what is wrong with the Community and News pages?
> > > >
> > > > Julian
> > > >
> >


Re: Re: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic

2019-10-08 Thread Haisheng Yuan
Adding an IN RexNode only partially solves the problem, as it still masks the 
underlying issue. The fundamental reason for the stack overflow lies in the 
left-deep binary tree. For queries that have tens of thousands of OR conditions 
that are not equality tests, which is not uncommon in our case, e.g.
(a like '...') or (b like '...') or (c like '..')
there will still be a stack overflow. 

- Haisheng
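
[Editor's sketch] The depth argument can be made concrete with a small stand-alone example. The Expr, Leaf and Or classes below are hypothetical stand-ins, not Calcite's RexNode or RexCall, and this is not the code of the pull request mentioned in the thread. The sketch assumes a non-empty operand list and only shows how chaining N operands left-deep gives a tree of depth N, while splitting the list in halves keeps the depth near log2(N), for any kind of operand (equality, LIKE, and so on).

```java
import java.util.ArrayList;
import java.util.List;

final class BalancedOrSketch {
  /** Hypothetical stand-in for a boolean expression node. */
  abstract static class Expr {
    /** Depth of the tree rooted here; computed once at construction. */
    final int depth;

    Expr(int depth) {
      this.depth = depth;
    }
  }

  static final class Leaf extends Expr {
    final String text;

    Leaf(String text) {
      super(1);
      this.text = text;
    }

    @Override public String toString() {
      return text;
    }
  }

  static final class Or extends Expr {
    final Expr left;
    final Expr right;

    Or(Expr left, Expr right) {
      super(1 + Math.max(left.depth, right.depth));
      this.left = left;
      this.right = right;
    }

    @Override public String toString() {
      return "(" + left + " OR " + right + ")";
    }
  }

  /** ((a OR b) OR c) OR d ...: depth grows linearly with the operand count. */
  static Expr leftDeep(List<Expr> operands) {
    Expr result = operands.get(0);
    for (int i = 1; i < operands.size(); i++) {
      result = new Or(result, operands.get(i));
    }
    return result;
  }

  /** (a OR b) OR (c OR d) ...: depth grows logarithmically. */
  static Expr balanced(List<Expr> operands) {
    return balanced(operands, 0, operands.size());
  }

  private static Expr balanced(List<Expr> operands, int from, int to) {
    if (to - from == 1) {
      return operands.get(from);
    }
    int mid = (from + to) >>> 1;
    return new Or(balanced(operands, from, mid), balanced(operands, mid, to));
  }

  public static void main(String[] args) {
    List<Expr> small = new ArrayList<>();
    for (int i = 1; i <= 4; i++) {
      small.add(new Leaf("a = " + i));
    }
    System.out.println(balanced(small));
    // prints ((a = 1 OR a = 2) OR (a = 3 OR a = 4))

    List<Expr> large = new ArrayList<>();
    for (int i = 1; i <= 1024; i++) {
      large.add(new Leaf("c" + i + " like '...'"));
    }
    System.out.println(leftDeep(large).depth); // 1024
    System.out.println(balanced(large).depth); // 11
  }
}
```

Any recursive visitor over the left-deep tree needs one stack frame per operand, which is where the overflow comes from; over the balanced tree the same visitor needs only a handful of frames.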

--
From: Stamatis Zampetakis
Date: 2019-10-08 15:09:01
To:
Subject: Re: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic

It might be better to add a proper IN operator in RexCalls instead of
something internal that does more or less the same thing.
It is true that this adds more paths in the code and thus requires some
additional dev and further support but I think it is worth it.
Many people so far expressed an interest to work on various cases involving
an IN operator so it might not be long before
we have full support for the IN operator.

SqlToRelConverter can still decide to expand or not based on some criterion
or property.


On Tue, Oct 8, 2019 at 3:37 AM Julian Hyde  wrote:

> A SqlCall to $HARD_IN will (by SqlToRelConverter) become a RexCall to
> $HARD_IN, and then (by RelToSqlConverter) become a SqlCall to
> $HARD_IN. $HARD_IN(x, v1, v2) would become (by SqlWriter) the SQL "x
> IN (v1, v2)".
>
> At any point in this lifecycle, you could intercept and simplify.
>
> On Mon, Oct 7, 2019 at 2:34 PM Haisheng Yuan 
> wrote:
> >
> > Will the filter condition with the “$HARD_IN” internal function be able to
> be pushed down and be recognized by the source SQL system, like Peter
> mentioned?
> >
> > If not, we have to translate the internal function back to IN during
> the Rel2Sql phase. Otherwise, the data read from the source table can be much
> larger.
> >
> > - Haisheng
> >
> > --
> > From: Julian Hyde
> > Date: 2019-10-08 04:53:11
> > To: dev
> > Subject: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic
> >
> > In
> https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16946209
> <
> https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16946209>
> I floated the idea of a “$HARD_IN” internal function that has the same
> semantics as IN but is not expanded to ‘… = OR … = …’.
> >
> > I think it would be a useful tool, if used judiciously.
> >
> > Julian
> >
> >
> > > On Oct 4, 2019, at 7:08 PM, Haisheng Yuan 
> wrote:
> > >
> > > > As a workaround, you can modify your SqlRexConvertlet to create a RexCall
> with a balanced binary tree, e.g. (a=1 or a=2) or (a=3 or a=4), instead of a
> flat RexCall with multiple operands, e.g. a=1 or a=2 or a=3 or a=4.
> > > > Because every OR RexCall has exactly 2 operands, it won't be transformed
> into a SqlCall with a left-deep tree.
> > > >
> > > > Let me know whether it works for you.
> > > >
> > > > - Haisheng
> > > >
> > > > --
> > > > From: Haisheng Yuan
> > > > Date: 2019-10-05 07:37:04
> > > > To: Peter Wicks (pwicks); dev@calcite.apache.org<
> dev@calcite.apache.org>
> > > > Subject: Re: RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion
> Logic
> > > >
> > > > If you want to push the filter down to the source SQL system, then
> transforming to a join won't help you either.
> > > >
> > > > The reason for the stack overflow with large ORs is the left-deep binary
> tree; we need to change it to a balanced binary tree to reduce the depth of
> the call.
> > > >
> > > > I will open a pull request later.
> > > >
> > > > - Haisheng
> > > >
> > > > --
> > > > From: Peter Wicks (pwicks)
> > > > Date: 2019-10-04 21:32:25
> > > > To: dev@calcite.apache.org
> > > > Subject: RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion
> Logic
> > >
> > > Zoltan,
> > >
> > > Thanks for the suggestion. I actually tried doing a UDF first, and it
> was also successful, sorry for not sharing those details earlier.
> > > The problem with the UDF is that the predicates are not pushed down to
> the source SQL system (by design), and this can result in a 100x increase
> in the amount of data returned from the database. This data will be
> correctly filtered by the UDF, but returning 100x the data makes it a lot
> slower. So I was trying to push it down to the source server instead.
> > >
> > > What do you mean by, "I guess Calcite might probably won't be able to
> do much with these ORs anyway..."? From my experiments I've seen two
> results from passing in this many OR's:
> > >
> > > - If no other predicates are included in the query, then Calcite
> succeeds! It leaves the OR's flat, (a=1 OR a=2 OR a=3 OR a=4)
> > > - If additional predicates are included, then Calcite nests the OR
> statements, leading to a stackoverflow 

Re: [DISCUSS] Make Enumerable operators responsive to interrupts

2019-10-08 Thread Haisheng Yuan
Makes sense and seems quite reasonable.

- Haisheng

--
From: Stamatis Zampetakis
Date: 2019-10-08 18:04:17
To:
Subject: [DISCUSS] Make Enumerable operators responsive to interrupts

Hello,

There are many use-cases which require stopping/cancelling the execution of
a query for various reasons. Currently, this can be done by launching the
query in a separate thread and then setting
DataContext.Variable.CANCEL_FLAG [1] accordingly.

However, if the thread executing the query gets interrupted through the usual
Thread.interrupt() mechanism the query execution will not stop since the
operators are not responsive to interruption.

How do you feel about making Enumerable operators responsive to interrupts?

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/3f54108b7dcd4d2b89fc42faab145e2f82883791/core/src/main/java/org/apache/calcite/DataContext.java#L87



Re: Community and News pages "not secure", according to Chrome

2019-10-08 Thread Stamatis Zampetakis
I filed CALCITE-3391 and its respective PR this morning.

@Michael: If your changes are sufficient then feel free to close the
respective JIRA/PR.

Best,
Stamatis

On Tue, Oct 8, 2019, 6:49 PM Michael Mior  wrote:

> Pushed and this seems to have resolved the issue for me.
> --
> Michael Mior
> mm...@apache.org
>
> On Tue, Oct 8, 2019 at 12:39, Michael Mior wrote:
> >
> > Most of the warnings I see are like the below. Switching this to HTTPS
> > should solve most of the issues. I'll make the change.
> >
> > http://github.com/zabetak.png
> >
> > --
> > Michael Mior
> > mm...@apache.org
> >
> > On Mon, Oct 7, 2019 at 15:31, Julian Hyde wrote:
> > >
> > > When I open the https://calcite.apache.org/news/ or
> > > https://calcite.apache.org/community/ pages in Chrome, I get an (i) to
> > > the left of the URL bar with the message “Your connection to this site is
> > > not fully secure” and a warning on the right side of the URL bar with
> > > “Cookies blocked”.
> > >
> > > Other pages on the site seem to be OK - they display a padlock rather
> > > than an (i) at the left of the URL, indicating that the site is secure.
> > >
> > > Any idea what is wrong with the Community and News pages?
> > >
> > > Julian
> > >
>


Re: Community and News pages "not secure", according to Chrome

2019-10-08 Thread Michael Mior
Pushed and this seems to have resolved the issue for me.
--
Michael Mior
mm...@apache.org

On Tue, Oct 8, 2019 at 12:39, Michael Mior wrote:
>
> Most of the warnings I see are like the below. Switching this to HTTPS
> should solve most of the issues. I'll make the change.
>
> http://github.com/zabetak.png
>
> --
> Michael Mior
> mm...@apache.org
>
> On Mon, Oct 7, 2019 at 15:31, Julian Hyde wrote:
> >
> > When I open the https://calcite.apache.org/news/ 
> > or https://calcite.apache.org/community/ 
> > pages in Chrome, I get an (i) to 
> > the left of the URL bar with the message “Your connection to this site is 
> > not fully secure” and a warning on the right side of the URL bar with 
> > “Cookies blocked”.
> >
> > Other pages on the site seem to be OK - they display a padlock rather than 
> > an (i) at the left of the URL, indicating that the site is secure.
> >
> > Any idea what is wrong with the Community and News pages?
> >
> > Julian
> >


Re: Community and News pages "not secure", according to Chrome

2019-10-08 Thread Michael Mior
Most of the warnings I see are like the below. Switching this to HTTPS
should solve most of the issues. I'll make the change.

http://github.com/zabetak.png

--
Michael Mior
mm...@apache.org

On Mon, Oct 7, 2019 at 15:31, Julian Hyde wrote:
>
> When I open the https://calcite.apache.org/news/ 
> or https://calcite.apache.org/community/ 
> pages in Chrome, I get an (i) to the 
> left of the URL bar with the message “Your connection to this site is not 
> fully secure” and a warning on the right side of the URL bar with “Cookies 
> blocked”.
>
> Other pages on the site seem to be OK - they display a padlock rather than an 
> (i) at the left of the URL, indicating that the site is secure.
>
> Any idea what is wrong with the Community and News pages?
>
> Julian
>


[DISCUSS] Make Enumerable operators responsive to interrupts

2019-10-08 Thread Stamatis Zampetakis
Hello,

There are many use-cases which require stopping/cancelling the execution of
a query for various reasons. Currently, this can be done by launching the
query in a separate thread and then setting
DataContext.Variable.CANCEL_FLAG [1] accordingly.

However, if the thread executing the query gets interrupted through the usual
Thread.interrupt() mechanism the query execution will not stop since the
operators are not responsive to interruption.

How do you feel about making Enumerable operators responsive to interrupts?

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/3f54108b7dcd4d2b89fc42faab145e2f82883791/core/src/main/java/org/apache/calcite/DataContext.java#L87
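
[Editor's sketch] One possible shape of an interrupt-responsive operator is a wrapper that checks the thread's interrupt flag before producing each row. The SimpleEnumerator interface below is a hypothetical, reduced stand-in for an enumerator (moveNext/current/close), and QueryInterruptedException is likewise invented for this note; neither is claimed to be Calcite API, and the thread does not say how the change would actually be implemented.

```java
import java.util.Arrays;
import java.util.Iterator;

final class InterruptibleEnumeratorSketch {
  /** Hypothetical minimal enumerator interface. */
  interface SimpleEnumerator<T> extends AutoCloseable {
    boolean moveNext();

    T current();

    @Override void close();
  }

  /** Hypothetical unchecked exception signalling an interrupted query. */
  static final class QueryInterruptedException extends RuntimeException {
    QueryInterruptedException() {
      super("query interrupted");
    }
  }

  /** Adapts an Iterable to the sketch interface. */
  static <T> SimpleEnumerator<T> of(Iterable<T> source) {
    final Iterator<T> it = source.iterator();
    return new SimpleEnumerator<T>() {
      private T current;

      @Override public boolean moveNext() {
        if (!it.hasNext()) {
          return false;
        }
        current = it.next();
        return true;
      }

      @Override public T current() {
        return current;
      }

      @Override public void close() {
      }
    };
  }

  /** Wraps an enumerator so that it stops when the thread is interrupted. */
  static <T> SimpleEnumerator<T> interruptible(final SimpleEnumerator<T> input) {
    return new SimpleEnumerator<T>() {
      @Override public boolean moveNext() {
        // isInterrupted() leaves the flag set, so code further up the call
        // stack can still observe the interrupt.
        if (Thread.currentThread().isInterrupted()) {
          input.close(); // release resources before giving up
          throw new QueryInterruptedException();
        }
        return input.moveNext();
      }

      @Override public T current() {
        return input.current();
      }

      @Override public void close() {
        input.close();
      }
    };
  }

  public static void main(String[] args) {
    SimpleEnumerator<Integer> rows = interruptible(of(Arrays.asList(1, 2, 3)));
    rows.moveNext();
    System.out.println(rows.current()); // 1

    Thread.currentThread().interrupt(); // simulate an external interrupt
    try {
      rows.moveNext();
    } catch (QueryInterruptedException e) {
      System.out.println("stopped: " + e.getMessage());
    } finally {
      Thread.interrupted(); // clear the flag so the demo exits cleanly
    }
  }
}
```

Checking once per row is the simplest choice; whether operators that block inside a single call (sorts, hash builds) would also need checks is left open here.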


Re: Github Actions for CI

2019-10-08 Thread Vladimir Sitnikov
>Definitely would simplify some of
>the RM steps for a release.

I've recently added GitHub Actions config to test Apache JMeter for Windows
and macOS.
So far I'm impressed.

Pros:
* It starts quite fast. Appveyor might take 2-3 hours to even start the
build.
Actions start very fast, and they do catch Windows-specific issues like
CRLF, "inability to remove a file that is not closed" and "inability to
remove a read-only file".

Cons:
* It always fetches all the refs from the Git repository (not just
master+pr, but it fetches even gh-pages branch) which causes issues like
https://github.com/junit-team/junit5/issues/2048 . It is fixable, but the
defaults are odd.
* Caching seems to be not there. Dependencies seem to be downloaded on each
build.
* There's a lot of jitter. JMeter does have several tests like
"thread.sleep(200) + assert actual duration", and they fail way more often
in Actions.
* GitHub Actions is likely unavailable for the forked repositories. In
other words, I cannot easily use Actions in my fork repository. I can use
Travis, and Calcite's travis.yml works for my repository. However, if
Calcite migrates to Actions, then forks would be harder to test (PR would
be required).

Vladimir


Re: ApacheCon Europe 2019 talks which are relevant to Apache Calcite

2019-10-08 Thread Stamatis Zampetakis
https://github.com/apache/calcite/pull/1489

On Mon, Oct 7, 2019 at 9:48 PM Julian Hyde  wrote:

> I feel remiss in filling out
> https://calcite.apache.org/community/#upcoming-talks <
> https://calcite.apache.org/community/#upcoming-talks>. I’d be grateful if
> someone would remove ApacheCon NA and add ApacheCon Europe and log a PR.
>
> > On Oct 7, 2019, at 12:15 PM, Chris Baynes  wrote:
> >
> > Hi!
> >
> > I'll be giving a talk on "Fast federated SQL with Apache Calcite".
> > Would be great to meet up with any other Calciters attending!
> >
> > See you there
> >
> > Chris
> >
> > On Mon, Oct 7, 2019 at 4:01 PM Julian Feinauer <
> j.feina...@pragmaticminds.de>
> > wrote:
> >
> >> Hi all,
> >>
> >> are there any Calcite related talks in Berlin or any Calciters
> attending?
> >> I will be there.
> >>
> >> JulianF
> >>
> >> On 04.10.19, 19:09, "my...@apache.org" wrote:
> >>
> >>Dear Apache Calcite committers,
> >>
> >>In a little over 2 weeks time, ApacheCon Europe is taking place in
> >>Berlin. Join us from October 22 to 24 for an exciting program and
> >> lovely
> >>get-together of the Apache Community.
> >>
> >>We are also planning a hackathon.  If your project is interested in
> >>participating, please enter yourselves here:
> >>https://cwiki.apache.org/confluence/display/COMDEV/Hackathon
> >>
> >>The following talks should be especially relevant for you:
> >>
> >>  * https://aceu19.apachecon.com/session/fast-federated-sql-apache-calcite
> >>  * https://aceu19.apachecon.com/session/patterns-and-anti-patterns-running-apache-bigdata-projects-kubernetes
> >>  * https://aceu19.apachecon.com/session/open-source-big-data-tools-accelerating-physics-research-cern
> >>  * https://aceu19.apachecon.com/session/ui-dev-big-data-world-using-open-source
> >>
> >>Furthermore there will be a whole conference track on community topics:
> >>Learn how to motivate users to contribute patches, how the board of
> >>directors works, how to navigate the Incubator and much more: ApacheCon
> >>Europe 2019 Community track <https://aceu19.apachecon.com/sessions?track=42>
> >>
> >>Tickets are available here <https://aceu19.apachecon.com/registration>;
> >>for Apache Committers we offer discounted tickets.  Prices will be going
> >>up on October 7th, so book soon.
> >>
> >>Please also help spread the word and make ApacheCon Europe 2019 a success!
> >>
> >>We’re looking forward to welcoming you at #ACEU19!
> >>
> >>Best,
> >>
> >>Your ApacheCon team
> >>
> >>
> >>
>
>


Re: [calcite] branch vvysotskyi-CALCITE-2018 created (now 4ffb668)

2019-10-08 Thread Julian Hyde
Did you intend to push this branch?

Julian

> On Oct 7, 2019, at 8:48 PM, danny0...@apache.org wrote:
> 
> This is an automated email from the ASF dual-hosted git repository.
> 
> danny0405 pushed a change to branch vvysotskyi-CALCITE-2018
> in repository https://gitbox.apache.org/repos/asf/calcite.git.
> 
> 
>  at 4ffb668  [CALCITE-2018] Queries failed with AssertionError: rel has 
> lower cost than best cost of subset
> 
> This branch includes the following new commits:
> 
> new 4ffb668  [CALCITE-2018] Queries failed with AssertionError: rel has 
> lower cost than best cost of subset
> 
> The 1 revisions listed above as "new" are entirely new to this
> repository and will be described in separate emails.  The revisions
> listed as "add" were already present in the repository and have only
> been added to this reference.
> 
> 


Hotfix release for CALCITE-3347 ?

2019-10-08 Thread Enrico Olivelli
Hello Calciters,
in the HerdDB community we have recently upgraded Calcite to 1.21.0, but we
are getting feedback from a downstream project about a bad problem with
subqueries with bind variables.

The issue is tracked as CALCITE-3347
https://issues.apache.org/jira/browse/CALCITE-3347

This is the issue on HerdDB bug tracker with the details
https://github.com/diennea/herddb/issues/479

Current master of Calcite 1.22.0-SNAPSHOT works perfectly.

Is there already a plan for 1.22.0, or perhaps a 1.21.1 hotfix?

Best regards
Enrico


[jira] [Created] (CALCITE-3391) Insecure pages warning on Chrome

2019-10-08 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created CALCITE-3391:


 Summary: Insecure pages warning on Chrome
 Key: CALCITE-3391
 URL: https://issues.apache.org/jira/browse/CALCITE-3391
 Project: Calcite
  Issue Type: Bug
  Components: site
Affects Versions: 1.21.0
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
 Fix For: 1.22.0


The problem as identified by [~apilloud] is the use of insecure (plain http) 
links through a secure (https) connection.

For more info see the 
[discussion|https://lists.apache.org/thread.html/e6ed9abfbb105b95fdbeb0d2281d9ee11290a4f5af466f54438d8575@%3Cdev.calcite.apache.org%3E]
 on the dev list.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Adding RelOptMaterializations to a planner

2019-10-08 Thread XING JIN
Hi Shubham,

As I understand it, using the same RelOptPlanner is the way to go, because
VolcanoPlanner#registerMaterializations calls
RelOptMaterializations#useMaterializedViews and picks the best algebra
expression tree.
I'd suggest registering all the materializations with one
VolcanoPlanner and then calling "findBestExp".
Please note that there are two kinds of strategies for materialized view
optimization:
1. Substitution based materialized view optimization
2. Rewriting using plan structural information

For the first one, RelOptMaterializations#useMaterializedViews alone is
enough; you don't even need to rely on VolcanoPlanner, i.e. you can call
RelOptMaterializations#useMaterializedViews explicitly and pick the best
one yourself. But for the second one, you need to rely on
VolcanoPlanner and register the corresponding rules from
AbstractMaterializedViewRule. The second one tends to be smarter but only
supports SPJA (select-project-join-aggregate) patterns.

In short, when you enable the "materializationsEnabled" connection
property, both of the strategies above are enabled.
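
As a rough sketch of what enabling that property could look like from JDBC
client code: the snippet below only builds the java.util.Properties object and
does not open a connection, so it runs without Calcite on the classpath. It
assumes the property is spelled "materializationsEnabled" (as in
CalciteConnectionProperty.MATERIALIZATIONS_ENABLED); the class and method
names are made up for illustration.

```java
import java.util.Properties;

public class MaterializationConfig {
    /** Builds connection properties with materialized-view rewriting switched on. */
    static Properties calciteProperties() {
        Properties info = new Properties();
        // Assumption: property name as declared in
        // CalciteConnectionProperty.MATERIALIZATIONS_ENABLED.
        info.setProperty("materializationsEnabled", "true");
        return info;
    }

    public static void main(String[] args) {
        Properties info = calciteProperties();
        // With Calcite on the classpath these would be passed to
        // DriverManager.getConnection("jdbc:calcite:", info).
        System.out.println(info.getProperty("materializationsEnabled"));
    }
}
```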

In addition, I'd suggest canonicalizing the plan before materialized view
optimization, which helps a lot with materialized view matching.

I'm also doing some work on materialized view optimization. It would be
great to have more discussion on this :)
https://issues.apache.org/jira/browse/CALCITE-3334
https://docs.google.com/document/d/1JpwGNFE3hw3yXb7W3-95-jXKClZC5UFPKbuhgYDuEu4/edit#heading=h.bmvjxz1h5evc

Best,
Jin


Re: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic

2019-10-08 Thread Stamatis Zampetakis
It might be better to add a proper IN operator in RexCalls instead of
something internal that does more or less the same thing.
It is true that this adds more code paths and thus requires some
additional development and further support, but I think it is worth it.
Many people have so far expressed an interest in working on various cases
involving an IN operator, so it might not be long before
we have full support for the IN operator.

SqlToRelConverter can still decide to expand or not based on some criterion
or property.
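
For reference, the balanced-tree workaround that Haisheng describes in the
quoted thread below can be sketched in plain Java. This is only an
illustration of the idea, not Calcite code: Expr, Leaf, Or and the helper
methods are hypothetical stand-ins and do not correspond to RexNode or
RexBuilder. The point is that pairing operands recursively keeps the tree
depth logarithmic in the number of operands, whereas folding them one by one
yields a left-deep tree whose depth grows linearly.

```java
import java.util.ArrayList;
import java.util.List;

public class BalancedOr {
    /** Minimal stand-in for an expression node (hypothetical; not Calcite's RexNode). */
    static abstract class Expr {
        abstract int depth();
    }

    static final class Leaf extends Expr {
        final String text;
        Leaf(String text) { this.text = text; }
        @Override int depth() { return 0; }
        @Override public String toString() { return text; }
    }

    static final class Or extends Expr {
        final Expr left;
        final Expr right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
        @Override int depth() { return 1 + Math.max(left.depth(), right.depth()); }
        @Override public String toString() { return "(" + left + " OR " + right + ")"; }
    }

    /** Folds the operands one at a time, producing a left-deep tree. */
    static Expr leftDeep(List<Expr> operands) {
        Expr result = operands.get(0);
        for (int i = 1; i < operands.size(); i++) {
            result = new Or(result, operands.get(i));
        }
        return result;
    }

    /** Splits the operands in half recursively, producing a balanced tree. */
    static Expr balanced(List<Expr> operands) {
        if (operands.isEmpty()) {
            throw new IllegalArgumentException("at least one operand is required");
        }
        return balanced(operands, 0, operands.size());
    }

    private static Expr balanced(List<Expr> operands, int from, int to) {
        if (to - from == 1) {
            return operands.get(from);
        }
        int mid = (from + to) >>> 1;
        return new Or(balanced(operands, from, mid), balanced(operands, mid, to));
    }

    public static void main(String[] args) {
        List<Expr> ops = new ArrayList<>();
        for (int i = 1; i <= 1024; i++) {
            ops.add(new Leaf("a=" + i));
        }
        System.out.println(leftDeep(ops).depth());  // prints 1023
        System.out.println(balanced(ops).depth());  // prints 10
    }
}
```

With four operands, balanced(...) yields ((a=1 OR a=2) OR (a=3 OR a=4)),
while leftDeep(...) yields (((a=1 OR a=2) OR a=3) OR a=4).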


On Tue, Oct 8, 2019 at 3:37 AM Julian Hyde  wrote:

> A SqlCall to $HARD_IN will (by SqlToRelConverter) become a RexCall to
> $HARD_IN, and then (by RelToSqlConverter) become a SqlCall to
> $HARD_IN. $HARD_IN(x, v1, v2) would become (by SqlWriter) the SQL "x
> IN (v1, v2)".
>
> At any point in this lifecycle, you could intercept and simplify.
>
> On Mon, Oct 7, 2019 at 2:34 PM Haisheng Yuan wrote:
> >
> > Will a filter condition with the “$HARD_IN” internal function be able to
> > be pushed down and recognized by the source SQL system, as Peter
> > mentioned?
> >
> > If not, we have to translate the internal function back to IN during the
> > Rel2Sql phase. Otherwise, the data read from the source table can be much
> > larger.
> >
> > - Haisheng
> >
> > --
> > From: Julian Hyde
> > Date: 2019-10-08 04:53:11
> > To: dev
> > Subject: Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic
> >
> > In
> > https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16946209
> > I floated the idea of a “$HARD_IN” internal function that has the same
> > semantics as IN but is not expanded to ‘… = … OR … = …’.
> >
> > I think it would be a useful tool, if used judiciously.
> >
> > Julian
> >
> >
> > > On Oct 4, 2019, at 7:08 PM, Haisheng Yuan wrote:
> > > >
> > > > As a workaround, you can modify your SqlRexConvertlet to create a RexCall
> > > > with a balanced binary tree, e.g. (a=1 or a=2) or (a=3 or a=4), instead of a
> > > > flat RexCall with multiple operands, e.g. a=1 or a=2 or a=3 or a=4.
> > > > Because every OR RexCall has exactly 2 operands, it won't be transformed
> > > > into a SqlCall with a left-deep tree.
> > > >
> > > > Let me know whether it works for you or not.
> > >
> > > - Haisheng
> > >
> > > --
> > > > From: Haisheng Yuan
> > > > Date: 2019-10-05 07:37:04
> > > > To: Peter Wicks (pwicks); dev@calcite.apache.org
> > > > Subject: Re: RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion Logic
> > > >
> > > > If you want to push the filter down to the source SQL system, then
> > > > transforming to a join won't help you either.
> > > >
> > > > The reason for the stack overflow with large ORs is the left-deep binary
> > > > tree; we need to change it to a balanced binary tree to reduce the depth of
> > > > the call.
> > >
> > > I will open a pull request later.
> > >
> > > - Haisheng
> > >
> > > --
> > > > From: Peter Wicks (pwicks)
> > > > Date: 2019-10-04 21:32:25
> > > > To: dev@calcite.apache.org
> > > > Subject: RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion Logic
> > >
> > > Zoltan,
> > >
> > > > Thanks for the suggestion. I actually tried a UDF first, and it
> > > > was also successful; sorry for not sharing those details earlier.
> > > > The problem with the UDF is that the predicates are not pushed down to
> > > > the source SQL system (by design), and this can result in a 100x increase
> > > > in the amount of data returned from the database. This data will be
> > > > correctly filtered by the UDF, but returning 100x the data makes it a lot
> > > > slower. So I was trying to push it down to the source server instead.
> > > >
> > > > What do you mean by, "I guess Calcite might probably won't be able to
> > > > do much with these ORs anyway..."? From my experiments I've seen two
> > > > results from passing in this many ORs:
> > > >
> > > > - If no other predicates are included in the query, then Calcite
> > > >   succeeds! It leaves the ORs flat: (a=1 OR a=2 OR a=3 OR a=4)
> > > > - If additional predicates are included, then Calcite nests the OR
> > > >   statements, leading to a stack overflow for very large ORs, which is
> > > >   CALCITE-2792: ((((a=1) OR a=2) OR a=3) OR a=4)
> > >
> > > Thanks,
> > > Peter
> > >
> > > > -----Original Message-----
> > > > From: Zoltan Haindrich
> > > > Sent: Friday, October 4, 2019 12:38 AM
> > > > To: dev@calcite.apache.org; Haisheng Yuan; Peter Wicks (pwicks)
> > > > Subject: Re: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion Logic
> > > >
> > > >
> > > > I think you might try another approach: introduce some UDF and use
> > > > your translation logic to call that; as the UDF will be opaque to Calcite,
> > > > it will be left alone.
> > > I guess