Re: Different behavior bewteen '>' and '='

2022-04-11 Thread Julian Hyde
I’m confused. Is this thread about type derivation, coercion, simplification, 
or query execution? Those are orthogonal topics, so discussing more than one at 
a time adds confusion. 

> On Apr 11, 2022, at 4:54 AM, Benchao Li  wrote:
> 
> Hi xiong,
> 
> Thanks for the input. Yes, CALCITE-4993 is related to this issue.
> And if we choose #1 or #2, CALCITE-4993 would be solved too.
> 
> xiong duan  于2022年4月10日周日 16:03写道:
> 
>> Hi BenChao,
>>Thanks to bring this up again. I find another relative issue ISSUE-4993
>> <https://issues.apache.org/jira/browse/CALCITE-4993> about the
>> RexSimplify.  Because
>> the EQUALS and NOT-EQUALS Operator uses the *LEAST_RESTRICTIVE* strategy to
>> validate the parameter. Other comparators use the *COMPARE* strategy*.
>> *Please
>> check Jira for details.
>> 
>> Benchao Li  于2022年4月9日周六 20:51写道:
>> 
>>> Hi all,
>>> 
>>> Sorry to bring this up again.
>>> 
>>> I kind of agree with Stamatis. The behavior for '=' is not only different
>>> from '>',
>>> but also different from the CALCITE-2302's implementation.
>>> 
>>> '=' in CALCITE-613 do not add 'cast' operator, and this will complicate
>>> physical implementation.
>>> Even more, Calcite's own enumerable convention cannot handle this.
>>> 
>>> From my perspective, there's something we can do:
>>> #1, remove CALCITE-613's implementation, only keep CALCITE-2302.
>>> #2, keep CALCITE-613, but make it consistent with CALCITE-2302 for '='
>>>  (Also control CALCITE-613 via
>>> SqlValidator#Config#typeCoercionEnabled).
>>> #3, leave it as it is, but we need to fix the enumerable convention for
>>> this case.
>>> 
>>> And the list is also my preference, WDYT?
>>> 
>>> 
>>> Stamatis Zampetakis  于2022年1月13日周四 22:00写道:
>>> 
>>>> Hi all,
>>>> 
>>>> Actually I find it very confusing the fact that > and = behave
>>> differently
>>>> and I would consider this a bug.
>>>> 
>>>> From the SQL standard perspective I don't think this is a valid query
>> and
>>>> as others mentioned it fails in the category of implicit type
>>> conversions.
>>>> My take is that if implicit type conversions are disabled both should
>>> raise
>>>> validation errors.
>>>> 
>>>> From an implementation perspective the
>>>> SqlOperandTypeChecker.Consistency enumeration was added by CALCITE-613
>>> [1]
>>>> to handle some common cases of implicit conversions.
>>>> However, CALCITE-2302 [2] went one step further to deal with many more
>>>> cases of implicit conversions.
>>>> I don't have the full picture in mind but from my perspective the code
>>>> around the Consistency enumeration should be removed/unified with the
>> new
>>>> type conversion APIS.
>>>> 
>>>> Best,
>>>> Stamatis
>>>> 
>>>> [1] https://issues.apache.org/jira/browse/CALCITE-613
>>>> [2] https://issues.apache.org/jira/browse/CALCITE-2302
>>>> 
>>>> 
>>>>> On Thu, Jan 13, 2022 at 2:58 AM Zou Dan  wrote:
>>>>> 
>>>>>> Thank you both for your replies, I will find if there is a better way
>>> to
>>>>> solve my problem.
>>>>> 
>>>>> Best,
>>>>> Dan Zou
>>>>> 
>>>>>> 2022年1月11日 20:33,Vladimir Ozerov  写道:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> If I recall correctly, the SQL standard is mostly silent on how one
>>>>> should
>>>>>> coerce operands. Therefore different systems implement the coercion
>>> at
>>>>>> their discretion. Moreover, the type inference might be influenced
>>> not
>>>>> only
>>>>>> by operands types but by their nature as well. For example, a
>> target
>>>>> system
>>>>>> may be ok with "intCol = '1'", but fail for "intCol = strCol".
>>>>>> 
>>>>>> If you are not satisfied with the default Apache Calcite behavior,
>>> you
>>>>> want
>>>>>> to provide a custom function definition in your own
>> SqlOperatorTable,
>>>>> that
>>>>>> would override functions from the SqlStdOperatorTable. The
>> interfaces
>>&

Re: Vararg/Variadic UDFs or workarounds?

2022-04-06 Thread Julian Hyde
While reading the standard and implementing function resolution for built-in 
and user-defined functions I got the impression that there are very different 
resolution strategies for the two kinds of functions. 

There’s a “Chesterton fence” between the two, and anyone  proposing to remove 
or blur the distinction, or just dig in the vicinity as we are doing here with 
variadic UDFs, needs to demonstrate they understand why it’s there. 

It relates especially to parameter names and overloading:
 * UDFs are resolved by name alone, not by number or types of arguments
 * UDF parameters may have names and default values and arguments may be 
specified in different orders when the function is invoked
 * built in functions do not have named parameters and therefore arguments must 
be specified in a fixed order

See the work I did on made parameters, read the code, the tests. It’s also 
possible that recent standards work (see polymorphic table functions) has 
changed the landscape. 

Julian

> On Apr 6, 2022, at 23:16, Stamatis Zampetakis  wrote:
> 
> Hi Ian,
> 
> From what I recall the work under CALCITE-2772 is an attempt to allow
> introducing vararg UDFs in a more user friendly way.
> 
> Supporting vararg UDFs via Schema and Function interfaces is one way to go
> although without CALCITE-2772 probably this is not possible.
> 
> Another way (and more powerful) would be to use the SqlOperator interface
> (either directly or extend some existing implementation) and create your
> own customised operator. Then you can plug-in your own operator table with
> the custom UDFs in the validator [1].
> Note that currently Calcite has some vararg functions/operators and the
> first that comes to mind is CONCAT [2]. You may check the changes
> introduced by CALCITE-4394 [3] to learn more about this and get inspiration
> if you end-up going down the path of implementing your own SqlOperator.
> 
> Lastly, you could possibly avoid varargs UDF using a small trick that was
> sufficient for me in some use-cases. Make the UDF accept a parameter of
> type ARRAY and call it by wrapping the function arguments in an ARRAY
> constructor. For instance, the queries would look like the following:
> 
> SELECT MY_CUSTOM_UDF(ARRAY['A','B','C'])
> SELECT MY_CUSTOM_UDF(ARRAY['A','C'])
> 
> Best,
> Stamatis
> 
> [1]
> https://github.com/zabetak/calcite-tutorial/blob/31cce59c747e0b763a934109db6a6e7055f175ae/solution/src/main/java/com/github/zabetak/calcite/tutorial/LuceneQueryProcessor.java#L166
> [2]
> https://github.com/apache/calcite/blob/a81cfb2ad001589929e190939cf4db928ebac386/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L491
> [3] https://issues.apache.org/jira/browse/CALCITE-4394
> 
>> On Tue, Apr 5, 2022 at 12:15 AM Ian Bertolacci
>>  wrote:
>> 
>> Howdy!
>> We’re trying to add a vararg/variadic UDF, but cannot seem to make it work
>> out.
>> 
>> In our system, we define our UDFs to the Schema’s function multimap, and
>> so have classes with methods which are provided to
>> ScalarFunctionImpl.create(theClass, “methodName” ) to create the Function
>> object, and from which the parser/validator draws the SQL function
>> signature.
>> However, there doesn’t seem to be a way do define such a method where
>> ScalarFunctionImpl infers the variadic function signature, but rather as
>> accepting a single List parameter.
>> 
>> I see that CALCITE-2772 was raised to solve this but the work seems to
>> have stalled in review.
>> 
>> Is there any plan to restart this work?
>> And in the mean time, is there a good way of defining variadic UDFs?
>> 
>> One solution we’ve come up with is to define a function with all (except
>> the first) optional parameters so that it *looks* like a variadic function,
>> but this is limited to 254 parameters by the JVM.
>> While 254 parameters is quite a lot, we predict that our users may exceed
>> this limit.
>> Thanks!
>> -Ian J. Bertolacci
>> 
>> 


Re: Vararg/Variadic UDFs or workarounds?

2022-04-06 Thread Julian Hyde
PS I agree with Stamatis that ARRAY arguments are likely to be an effective 
workaround.

On the other hand I would love it if someone would study this problem and 
propose a solution consistent with the standard. 

Julian

> On Apr 7, 2022, at 06:34, Julian Hyde  wrote:
> 
> While reading the standard and implementing function resolution for built-in 
> and user-defined functions I got the impression that there are very different 
> resolution strategies for the two kinds of functions. 
> 
> There’s a “Chesterton fence” between the two, and anyone  proposing to remove 
> or blur the distinction, or just dig in the vicinity as we are doing here 
> with variadic UDFs, needs to demonstrate they understand why it’s there. 
> 
> It relates especially to parameter names and overloading:
> * UDFs are resolved by name alone, not by number or types of arguments
> * UDF parameters may have names and default values and arguments may be 
> specified in different orders when the function is invoked
> * built in functions do not have named parameters and therefore arguments 
> must be specified in a fixed order
> 
> See the work I did on made parameters, read the code, the tests. It’s also 
> possible that recent standards work (see polymorphic table functions) has 
> changed the landscape. 
> 
> Julian
> 
>> On Apr 6, 2022, at 23:16, Stamatis Zampetakis  wrote:
>> 
>> Hi Ian,
>> 
>> From what I recall the work under CALCITE-2772 is an attempt to allow
>> introducing vararg UDFs in a more user friendly way.
>> 
>> Supporting vararg UDFs via Schema and Function interfaces is one way to go
>> although without CALCITE-2772 probably this is not possible.
>> 
>> Another way (and more powerful) would be to use the SqlOperator interface
>> (either directly or extend some existing implementation) and create your
>> own customised operator. Then you can plug-in your own operator table with
>> the custom UDFs in the validator [1].
>> Note that currently Calcite has some vararg functions/operators and the
>> first that comes to mind is CONCAT [2]. You may check the changes
>> introduced by CALCITE-4394 [3] to learn more about this and get inspiration
>> if you end-up going down the path of implementing your own SqlOperator.
>> 
>> Lastly, you could possibly avoid varargs UDF using a small trick that was
>> sufficient for me in some use-cases. Make the UDF accept a parameter of
>> type ARRAY and call it by wrapping the function arguments in an ARRAY
>> constructor. For instance, the queries would look like the following:
>> 
>> SELECT MY_CUSTOM_UDF(ARRAY['A','B','C'])
>> SELECT MY_CUSTOM_UDF(ARRAY['A','C'])
>> 
>> Best,
>> Stamatis
>> 
>> [1]
>> https://github.com/zabetak/calcite-tutorial/blob/31cce59c747e0b763a934109db6a6e7055f175ae/solution/src/main/java/com/github/zabetak/calcite/tutorial/LuceneQueryProcessor.java#L166
>> [2]
>> https://github.com/apache/calcite/blob/a81cfb2ad001589929e190939cf4db928ebac386/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L491
>> [3] https://issues.apache.org/jira/browse/CALCITE-4394
>> 
>>> On Tue, Apr 5, 2022 at 12:15 AM Ian Bertolacci
>>>  wrote:
>>> 
>>> Howdy!
>>> We’re trying to add a vararg/variadic UDF, but cannot seem to make it work
>>> out.
>>> 
>>> In our system, we define our UDFs to the Schema’s function multimap, and
>>> so have classes with methods which are provided to
>>> ScalarFunctionImpl.create(theClass, “methodName” ) to create the Function
>>> object, and from which the parser/validator draws the SQL function
>>> signature.
>>> However, there doesn’t seem to be a way do define such a method where
>>> ScalarFunctionImpl infers the variadic function signature, but rather as
>>> accepting a single List parameter.
>>> 
>>> I see that CALCITE-2772 was raised to solve this but the work seems to
>>> have stalled in review.
>>> 
>>> Is there any plan to restart this work?
>>> And in the mean time, is there a good way of defining variadic UDFs?
>>> 
>>> One solution we’ve come up with is to define a function with all (except
>>> the first) optional parameters so that it *looks* like a variadic function,
>>> but this is limited to 254 parameters by the JVM.
>>> While 254 parameters is quite a lot, we predict that our users may exceed
>>> this limit.
>>> Thanks!
>>> -Ian J. Bertolacci
>>> 
>>> 


Re: Why RelBuilder.project unwraps SARGs?

2022-04-02 Thread Julian Hyde
Good catch. The unwrapping was not intentional on my part. Removing the 
unwrapping seems like a benefit (less work for RelBuilder, therefore better 
planning performance; SARGs in more places, therefore better optimization; also 
the avoidance of cycles as you mention). 

There’s a small risk that downstream projects are relying on this unwrapping 
but they can control using config.simplify(). We don’t promise not to improve 
our rewrites from one release to the next. 

I think you should fix it. 

Julian

> On Apr 2, 2022, at 8:30 AM, Vladimir Ozerov  wrote:
> 
> Hi,
> 
> When simplification is enabled in the RelBuilder, a call to the
> RelBuilder.project unwraps SARGs into their flat counterparts [1]. This is
> not the case for other nodes, like Filter or Join. This behavior may lead
> to an infinite loop when the ProjectReduceExpressionsRule is used with the
> HepPlanner (which is a common pattern);
> 
>   1. The original Project(e) has the expression e.
>   2. ProjectReduceExpressionsRule simplifies it to SARG(e).
>   3. ProjectReduceExpressionsRule calls RelBuilder.project which returns
>   back Project(e) instead of Project(SARG(e)).
>   4. ProjectReduceExpressionsRule calls Project(e).transformTo(Project(e))
>   which schedules invocation of the rule again, leading to a hang.
> 
> Shall we remove this unwrapping? No tests are affected except for two
> trivial failures in MaterializedViewRelOptRulesTest.
> 
> Regards,
> Vladimir.
> 
> [1]
> https://github.com/apache/calcite/blob/a81cfb2ad001589929e190939cf4db928ebac386/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L1948


Re: [DISCUSS] Github PR link to JIRA issue

2022-04-01 Thread Julian Hyde
The present hyperlinking was enabled by Stamatis: 
https://issues.apache.org/jira/browse/CALCITE-4104 
<https://issues.apache.org/jira/browse/CALCITE-4104> 

I’m not exactly sure how the .asf.yaml file accomplishes this. Maybe it 
instructs some ASF robot to modify the GitHub project settings on our behalf.

More info on the .asf.yaml file: 
https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features 
<https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features> 

Julian


> On Apr 1, 2022, at 9:18 PM, Haisheng Yuan  wrote:
> 
> Yes, that is what I meant.
> 
> I was expecting to see the link to JIRA in PR's commit page:
> https://github.com/apache/calcite/pull/2752/commits
> 
> I am lazy to navigate to JIRA manually. :)
> 
> Haisheng
> 
> On 2022/04/01 01:04:19 Julian Hyde wrote:
>> Ah, I see. In the page for a commit (e.g. [1]) the text “[CALCITE-5064]” 
>> appears as a blue hyperlink whereas the page for the corresponding pull 
>> request (e.g. [2]) the “[CALCITE-5064]” text is not a hyperlink.
>> 
>> Is that what you meant, Haisheng?
>> 
>> Julian
>> 
>> 
>> [1] 
>> https://github.com/apache/calcite/commit/d85b2a602a547290bd5be0bba68092b702400731
>>  
>> <https://github.com/apache/calcite/commit/d85b2a602a547290bd5be0bba68092b702400731>
>> 
>> [2] https://github.com/apache/calcite/pull/2752 
>> <https://github.com/apache/calcite/pull/2752> 
>> 
>>> On Mar 31, 2022, at 11:05 AM, Michael Mior  wrote:
>>> 
>>> Stamatis,
>>> 
>>> Unless I'm misunderstanding, Haisheng was referring to links in the
>>> opposite direction. That is, linking to JIRA issues from GitHub. The
>>> setting you reference is for creating a link from JIRA to GitHub.
>>> 
>>> --
>>> Michael Mior
>>> mm...@apache.org
>>> 
>>> 
>>> Le jeu. 31 mars 2022 à 13:53, Stamatis Zampetakis  a
>>> écrit :
>>> 
>>>> It's already done via .asf.yaml [1].
>>>> 
>>>> Best,
>>>> Stamatis
>>>> 
>>>> [1]
>>>> 
>>>> https://github.com/apache/calcite/blob/88cc385f98c551c1aca7ffab101934f1c34fdffd/.asf.yaml#L35
>>>> 
>>>> On Thu, Mar 31, 2022 at 7:19 PM Michael Mior  wrote:
>>>> 
>>>>> Not sure what might have changed, but here's the GitHub documentation on
>>>>> the feature. If this isn't working as expected, I would contact INFRA to
>>>>> make sure things are correctly configured. (Apparently in the future,
>>>> this
>>>>> may be done via .asf.yaml)
>>>>> 
>>>>> 
>>>>> 
>>>> https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/managing-repository-settings/configuring-autolinks-to-reference-external-resources
>>>>> 
>>>>> 
>>>>> --
>>>>> Michael Mior
>>>>> mm...@apache.org
>>>>> 
>>>>> 
>>>>> Le mer. 30 mars 2022 à 13:46, Haisheng Yuan  a écrit :
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> Previously, the JIRA issue e.g. [CALCITE-6789] in Calcite github PR and
>>>>>> commit message will be automatically linked to the JIRA site. Now there
>>>>> is
>>>>>> no link anymore.
>>>>>> 
>>>>>> Does anyone know what happened? What can we do to add the link back?
>>>>>> 
>>>>>> Thanks,
>>>>>> Haisheng Yuan
>>>>>> 
>>>>> 
>>>> 
>> 
>> 



Re: Contributing a custom SQL dialect of Firebolt to Calcite

2022-04-01 Thread Julian Hyde
Sorry I was slow to respond to your earlier message.

(No need to add me to the thread - I am on the list, just backlogged. By the 
way, I am on vacation for the next ten days, so don’t expect responses from me. 
Other project members may or may not be able to help.) 

Please log a JIRA case, as I asked. Then we can have discussions in that case.

Re 1. I’m surprised that double-quote doesn’t work in RelToSqlConverterTest. 
Other dialects, such as Postgres, also use double-quote to quote identifiers. 
If you post a PR we could perhaps look at your code.

Re 2. I think you will need to add “FIREBOLT” to the list of allowable values 
for the TEST_DB parameter [1] and also add a value to enum DatabaseInstance [2].

Julian

[1] 
https://github.com/apache/calcite/blob/a8a6569e6ba75efe9d5725c49338a7f181d3ab5c/core/src/main/java/org/apache/calcite/config/CalciteSystemProperty.java#L166
 
<https://github.com/apache/calcite/blob/a8a6569e6ba75efe9d5725c49338a7f181d3ab5c/core/src/main/java/org/apache/calcite/config/CalciteSystemProperty.java#L166>
 

[2] 
https://github.com/apache/calcite/blob/bf56743554ea27d250d41db2eb83806f9e626b55/testkit/src/main/java/org/apache/calcite/test/CalciteAssert.java#L1904
 
<https://github.com/apache/calcite/blob/bf56743554ea27d250d41db2eb83806f9e626b55/testkit/src/main/java/org/apache/calcite/test/CalciteAssert.java#L1904>
 

> On Apr 1, 2022, at 5:07 AM, Raghav Sharma  
> wrote:
> 
> Adding Julian to the thread.
> 
> 
> Thanks & Regards,
> Raghav Sharma | SDE 1 | Data Engineering
> M: +91 7087637086 I Mail: ragh...@sigmoidanalytics.com
> <https://www.sigmoid.com/>
> 
> 
> On Thu, Mar 31, 2022 at 2:39 PM Raghav Sharma 
> wrote:
> 
>> Hey Julian!
>> I followed your suggestions related to the dialect/plugin we want to
>> contribute to Calcite for Firebolt. Some of the issues that we faced so far:
>> 
>> 1. Tests are failing in the RelToSqlConverterTest.java file for Firebolt
>> reason being the fact the actual parsed query does not contain the
>> 'Identifier Quote String'. For Firebolt, we've chosen '\"' as the
>> 'Identifier Quote String'. Have followed the approach that other dialects
>> have used and therefore have added support for this in the dialect file as
>> well as in SqlDialect.java and SqlDialectFactoryImpl.java. Have even
>> implemented the method 'withFirebolt()' in the test file. Please guide what
>> else needs to be done for this to be fixed and our test cases to be passed.
>> 
>> 2. After adding 'FIREBOLT' to "TEST_DB' in CalciteSystemProperty.java and
>> running the command: ‘./gradlew -Dcalcite.test.db=FIREBOLT test’,
>> Calcite throws an error that it could not find the enum for 'FIREBOLT' in
>> CalciteAssert.java file. Even after adding the enum either with Firebolt's
>> connection string or using one of the given ones, say POSTGRESQL, it
>> couldn't locate the driver in both cases. Need your suggestion to tackle
>> this one.
>> 
>> 
>> Thanks & Regards,
>> Raghav Sharma | SDE 1 | Data Engineering
>> M: +91 7087637086 I Mail: ragh...@sigmoidanalytics.com
>> <https://www.sigmoid.com/>
>> 
>> 
>> On Tue, Mar 29, 2022 at 1:50 AM Raghav Sharma <
>> ragh...@sigmoidanalytics.com> wrote:
>> 
>>> Thank you for the clarification Julian. I will send a mail to
>>> dev-subscr...@calcite.apache.org for subscribing.
>>> 
>>> 
>>> 
>>> On Tue, 29 Mar 2022 at 1:45 AM Julian Hyde  wrote:
>>> 
>>>> You did receive a response[1]. But you didn't receive it because you
>>>> are not subscribed to this list. Please subscribe[2].
>>>> 
>>>> Julian
>>>> 
>>>> [1] https://lists.apache.org/thread/39obrhwpd95bvhc9sb2n4z6zrrwbc21x
>>>> 
>>>> [2] https://calcite.apache.org/community/#mailing-lists
>>>> 
>>>> On Mon, Mar 28, 2022 at 1:11 PM Raghav Sharma
>>>>  wrote:
>>>>> 
>>>>> Gentle reminder!
>>>>> We are expecting a response so that we can move ahead with this
>>>> potential
>>>>> contribution.
>>>>> 
>>>>> 
>>>>> Thanks & Regards,
>>>>> Raghav Sharma | SDE 1 | Data Engineering
>>>>> M: +91 7087637086 I Mail: ragh...@sigmoidanalytics.com
>>>>> <https://www.sigmoid.com/>
>>>>> 
>>>>> 
>>>>> On Thu, Mar 24, 2022 at 3:47 PM Raghav Sharma <
>>>> ragh...@sigmoidanalytics.com>
>>>>> wrote:
>>>>> 
>>>>>> Greetings from Sigmoid and Firebolt!
>>

Re: Calcite's parser didn't support bitwise's operator

2022-04-01 Thread Julian Hyde
In my opinion we should support the ‘|’ operator but not in the core SQL parser 
by default. Either support it in babel or enable it with flags in the core 
parser.

Rationale: MySQL in its early days seemed to have a wacky mission to make SQL 
look like C. 0 and 1 for true and false, the |, ^, &&, || operators, etc. The 
‘||’ operator is particularly damaging, because it prevents them from using it 
for string concatenation as practically everyone else does. I strongly believe 
that we should not let this wackiness leak into Calcite’s core SQL.

徐仁和, please log a JIRA case to support ‘|’ and post the URL to this thread.

We should definitely add functions for bitwise operations. Because they have 
function syntax they don’t require parser changes, and are therefore easier to 
turn on and off.

Someone should review the PRs for the cases listed. It’s embarrassing that we 
let these languish.

Julian



> On Apr 1, 2022, at 5:50 AM, 徐仁和  wrote:
> 
> Hi all
> I am using CALCITE as a mysql-proxy in my project.
> Some user want use mysql's bit functions[1], but I found CALCITE didn't
> support to parse these token.
> For example:
> 
>> select 1 | 2 as c1;
> 
> 
> Exception:
> Caused by: java.lang.RuntimeException:
> org.apache.calcite.sql.parser.SqlParseException: Encountered "|" at line 1,
> column 10.
> Was expecting one of:
>
>"ORDER" ...
>"LIMIT" ...
>"OFFSET" ...
>"FETCH" ...
>"UNION" ...
>"INTERSECT" ...
> 
> --
> 
> I remember that some jiras[2][3][4][5][6] have been about this problem, and
> how are they doing?
> 
>   1. https://dev.mysql.com/doc/refman/8.0/en/bit-functions.html
>   2. https://issues.apache.org/jira/browse/CALCITE-3782
>   3. https://issues.apache.org/jira/browse/CALCITE-3732
>   4. https://issues.apache.org/jira/browse/CALCITE-3779
>   5. https://issues.apache.org/jira/browse/CALCITE-3592
>   6. https://issues.apache.org/jira/browse/CALCITE-3697
> 
> 
> Thanks & Regards



Re: [DISCUSS] Github PR link to JIRA issue

2022-03-31 Thread Julian Hyde
Ah, I see. In the page for a commit (e.g. [1]) the text “[CALCITE-5064]” 
appears as a blue hyperlink whereas the page for the corresponding pull request 
(e.g. [2]) the “[CALCITE-5064]” text is not a hyperlink.

Is that what you meant, Haisheng?

Julian


[1] 
https://github.com/apache/calcite/commit/d85b2a602a547290bd5be0bba68092b702400731
 


[2] https://github.com/apache/calcite/pull/2752 
 

> On Mar 31, 2022, at 11:05 AM, Michael Mior  wrote:
> 
> Stamatis,
> 
> Unless I'm misunderstanding, Haisheng was referring to links in the
> opposite direction. That is, linking to JIRA issues from GitHub. The
> setting you reference is for creating a link from JIRA to GitHub.
> 
> --
> Michael Mior
> mm...@apache.org
> 
> 
> Le jeu. 31 mars 2022 à 13:53, Stamatis Zampetakis  a
> écrit :
> 
>> It's already done via .asf.yaml [1].
>> 
>> Best,
>> Stamatis
>> 
>> [1]
>> 
>> https://github.com/apache/calcite/blob/88cc385f98c551c1aca7ffab101934f1c34fdffd/.asf.yaml#L35
>> 
>> On Thu, Mar 31, 2022 at 7:19 PM Michael Mior  wrote:
>> 
>>> Not sure what might have changed, but here's the GitHub documentation on
>>> the feature. If this isn't working as expected, I would contact INFRA to
>>> make sure things are correctly configured. (Apparently in the future,
>> this
>>> may be done via .asf.yaml)
>>> 
>>> 
>>> 
>> https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/managing-repository-settings/configuring-autolinks-to-reference-external-resources
>>> 
>>> 
>>> --
>>> Michael Mior
>>> mm...@apache.org
>>> 
>>> 
>>> Le mer. 30 mars 2022 à 13:46, Haisheng Yuan  a écrit :
>>> 
 Hi all,
 
 Previously, the JIRA issue e.g. [CALCITE-6789] in Calcite github PR and
 commit message will be automatically linked to the JIRA site. Now there
>>> is
 no link anymore.
 
 Does anyone know what happened? What can we do to add the link back?
 
 Thanks,
 Haisheng Yuan
 
>>> 
>> 



Re: [DISCUSS] Best practice for synchronizing master and site branches

2022-03-30 Thread Julian Hyde
I had a quick look and it looks like clean well-thought-out code. I couldn’t 
figure what it was doing at a high level (e.g. what the generated URLs would 
look like), so I think some high-level comments would help.

+1

Thanks for your excellent work, Francis.

Julian


> On Mar 30, 2022, at 11:54 AM, Stamatis Zampetakis  wrote:
> 
> Hi Francis,
> 
> I went over the workflows and rules and everything looks good to me. Great
> work!
> 
> I'm +1 on merging this to master and I am OK with your suggestions about
> Avatica.
> 
> Thanks a lot for moving this forward. It will certainly save us a lot
> of time in the future.
> 
> When this goes in we need to update also our documentation. It will be a
> good opportunity to test that everything is working properly.
> 
> Best,
> Stamatis
> 
> On Wed, Mar 30, 2022, 7:50 AM Francis Chuang 
> wrote:
> 
>> Forgot to mention in my last message, but I am now implementing the
>> automation for calcite-avatica and calcite-avatica-go
>> 
>> For those 2 repos, we never used a site branch as we usually push the
>> site after a release. If there are any small updates to the site that
>> occur after the release, we just built from master and pushed it as
>> there is usually no unreleased updates to the docs due to avatica not
>> having much updates. This is the same situation for avatica-go.
>> 
>> Therefore, for calcite-avatica and calcite-avatica-go, I plan to:
>> - Always build from master if there's an update to site.
>> - For a release, build from master and build the javadocs and publish.
>> 
>> I think this should we sufficient for our use-case for now and should
>> improve the release process and site publishing process significantly.
>> If we find edge cases in the future, we can deal with those at a later
>> time.
>> 
>> Please let me know what you guys think.
>> 
>> Francis
>> 
>> On 30/03/2022 4:21 am, Julian Hyde wrote:
>>> I have never needed or wanted a versioned Javadoc URL for Calcite. Our
>> APIs tend to grow over time.
>>> 
>>> The only requirement I see is that we don’t pollute the javadoc/doc of
>> the latest released version with things that are not yet released. Which
>> would lead to two versions: latest release and head.
>>> 
>>> I can see that the implementation might be simpler if we have multiple
>> versions, but let’s be clear that that is not the requirement.
>>> 
>>> Julian
>>> 
>>> 
>>>> On Mar 29, 2022, at 6:49 AM, Fan Liya  wrote:
>>>> 
>>>> I think it is a good idea to provide versioned JavaDocs.
>>>> 
>>>> However, even if we only provide the JavaDoc of the latest release,
>>>> there is no need to maintain two branches (IMHO),
>>>> because the processes of updating the website and JavaDoc are
>>>> relatively separate processes (according to [1]).
>>>> With a single branch, it is feasible to update the website regularly,
>>>> and update JavaDocs only at release times.
>>>> 
>>>> Best,
>>>> Liya Fan
>>>> 
>>>> [1]
>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
>>>> 
>>>> Alessandro Solimando  于2022年3月29日周二
>> 17:59写道:
>>>>> 
>>>>> Hello everyone,
>>>>> I totally agree on automating the website publication and having a
>> single
>>>>> branch, the less we do manually, the lower the chances to mess
>> something up.
>>>>> 
>>>>> I am also in favour of versioned docs in the website, it's confusing to
>>>>> land on updated pages from an older context like a message from the ML.
>>>>> 
>>>>> Best regards,
>>>>> Alessandro
>>>>> 
>>>>> On Tue, 29 Mar 2022 at 06:44, Francis Chuang >> 
>>>>> wrote:
>>>>> 
>>>>>> Hey Julian,
>>>>>> 
>>>>>> All very good points. I can definitely see the utility of the
>> javadocs.
>>>>>> The analogue in Go would be godoc, with the difference being that the
>>>>>> godoc server automatically crawls the code across all versions to
>>>>>> generate the documentation.
>>>>>> 
>>>>>> As an example, see the godoc for protobuf [1]. There is a version
>>>>>> selector on the top left to look at the documentation for different
>>>>>> versions of the mod

Re: [DISCUSS] Github PR link to JIRA issue

2022-03-30 Thread Julian Hyde
It seems to happen sporadically. Maybe the commit message of the PR needs to 
start with exactly “[CALCITE-]”.

> On Mar 30, 2022, at 10:45 AM, Haisheng Yuan  wrote:
> 
> Hi all,
> 
> Previously, the JIRA issue e.g. [CALCITE-6789] in Calcite github PR and
> commit message will be automatically linked to the JIRA site. Now there is
> no link anymore.
> 
> Does anyone know what happened? What can we do to add the link back?
> 
> Thanks,
> Haisheng Yuan



Re: Calcite jars with shaded guava

2022-03-30 Thread Julian Hyde
Wow, I had no idea that anyone was on a version of Guava that old.

I checked our history of Guava versions:
 * In https://issues.apache.org/jira/browse/CALCITE-1590 
 (Calcite 1.12) we changed 
the default from 19 to 20, but still supported Guava 14.0.1 and higher.
 * In https://issues.apache.org/jira/browse/CALCITE-1715 
 (also Calcite 1.12) we 
changed the default version from 20 to 19
 * In https://issues.apache.org/jira/browse/CALCITE-2191 
 (Calcite 1.16) we moved 
the minimum from 14.0.1 to 19 and cited a Druid email thread [1] about 
upgrading Druid’s version of Guava

Foolish of me to assume that 4 years later Druid would have done something.

Julian

[1] https://groups.google.com/g/druid-development/c/Dw2Qu1CWbuQ

> On Mar 30, 2022, at 10:22 AM, Abhishek Agarwal 
>  wrote:
> 
> Thank you, Alessandro and Stamatis, for the info. Currently, the guava
> version being used in druid is 16. So simple exclusion is unlikely to work.
> I will still give it a try. If it doesn't, we will shade the jars on our
> end.
> 
> On Tue, Mar 29, 2022 at 3:38 PM Alessandro Solimando <
> alessandro.solima...@gmail.com> wrote:
> 
>> Hello Abhishek,
>> maybe I am re-stating what Stamatis suggested without realising, but since
>> Calcite works well with a broad range of guava versions, can't you simply
>> exclude guava when you include it in druid?
>> 
>> At that point if the guava version of druid is in the list of those
>> supported by Calcite (>= 19 IIRC) all should work fine.
>> 
>> I feel like Stamatis concerning shading.
>> 
>> Best regards,
>> Alessandro
>> 
>> On Tue, 29 Mar 2022 at 11:18, Stamatis Zampetakis 
>> wrote:
>> 
>>> Hi Abhishek,
>>> 
>>> Calcite supports multiple versions of Guava so in principle it should
>> work
>>> without problems with older versions.
>>> Instead of shading you could possibly select explicitly the version
>>> that you want to use via maven/gradle.
>>> 
>>> I share your pain but shading is quite complex to get right so I am not
>> in
>>> favor of putting this burden on Calcite.
>>> 
>>> I've seen it used in many projects and it rarely works as expected. Most
>> of
>>> the time it creates additional problems that need to be resolved.
>>> 
>>> Best,
>>> Stamatis
>>> 
>> 



Re: [DISCUSS] Best practice for synchronizing master and site branches

2022-03-29 Thread Julian Hyde
I have never needed or wanted a versioned Javadoc URL for Calcite. Our APIs 
tend to grow over time.

The only requirement I see is that we don’t pollute the javadoc/doc of the 
latest released version with things that are not yet released. Which would lead 
to two versions: latest release and head.

I can see that the implementation might be simpler if we have multiple 
versions, but let’s be clear that that is not the requirement.

Julian


> On Mar 29, 2022, at 6:49 AM, Fan Liya  wrote:
> 
> I think it is a good idea to provide versioned JavaDocs.
> 
> However, even if we only provide the JavaDoc of the latest release,
> there is no need to maintain two branches (IMHO),
> because the processes of updating the website and JavaDoc are
> relatively separate processes (according to [1]).
> With a single branch, it is feasible to update the website regularly,
> and update JavaDocs only at release times.
> 
> Best,
> Liya Fan
> 
> [1] 
> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
> 
> Alessandro Solimando  于2022年3月29日周二 17:59写道:
>> 
>> Hello everyone,
>> I totally agree on automating the website publication and having a single
>> branch, the less we do manually, the lower the chances to mess something up.
>> 
>> I am also in favour of versioned docs in the website, it's confusing to
>> land on updated pages from an older context like a message from the ML.
>> 
>> Best regards,
>> Alessandro
>> 
>> On Tue, 29 Mar 2022 at 06:44, Francis Chuang 
>> wrote:
>> 
>>> Hey Julian,
>>> 
>>> All very good points. I can definitely see the utility of the javadocs.
>>> The analogue in Go would be godoc, with the difference being that the
>>> godoc server automatically crawls the code across all versions to
>>> generate the documentation.
>>> 
>>> As an example, see the godoc for protobuf [1]. There is a version
>>> selector on the top left to look at the documentation for different
>>> versions of the module / library in question.
>>> 
>>> You mentioned that you do not want to have a version string in the URL.
>>> Is there any particular reason for this? For example, if I were to end
>>> up on the mailing list archives through a google search and there's a
>>> message linking to the javadoc, it might be more helpful if the javadoc
>>> was linked to a particular version of the release so that the context
>>> around the discussion at the time makes more sense.
>>> 
>>> We can have all javadocs for all releases of Calcite published and have
>>> a selector to jump between versions, similar to godoc, for example, like
>>> this javadoc for google cloud with a version selector on the bottom
>>> right [2]. This would allow users to switch between different versions
>>> and look at the version of the javadoc that's currently being used in
>>> their project.
>>> 
>>> Regarding the documentation on the website itself, would it make sense
>>> if we have a versioned copy for each release? Currently, we only publish
>>> the documentation for the latest release, so, if we were to look at
>>> older messages from the mailing list and follow a link to the
>>> documentation, the documentation could be incorrect or not relevant to
>>> the message itself.
>>> 
>>> Maybe we can have a folder for each release? For example:
>>> -
>>> calcite.apache.org/docs/1.30.0/adapter.html#jdbc-connect-string-parameters
>>> -
>>> calcite.apache.org/docs/1.29.0/adapter.html#jdbc-connect-string-parameters
>>> 
>>> This would give each release their own documentation with a unique path.
>>> For the current unreleased version, we can still put it in version of
>>> the next release:
>>> calcite.apache.org/docs/1.31.0/adapter.html#jbc-connect-string-parameters
>>> and
>>> maybe have a message that says this is an unreleased version like
>>> elasticsearch [3]. Links to this release's javadoc would work before and
>>> after the release and would never break.
>>> 
>>> The upside to this approach is that all documentation (even the
>>> unreleased version) is published immediately, but they are versioned, so
>>> there is no confusion. It also means that users of Calcite master would
>>> be able to look at the docs online. This also simplifies the deployment
>>> of site as we no longer need the site branch: the website can just be
>>> built from master.
>>> 
>>> Francis
>>> 
>>> [1] https://pkg.go.dev/google.golang.org/protobuf
>>> [2] https://googleapis.dev/java/google-cloud-asset/latest/index.html
>>> [3] https://www.elastic.co/guide/en/elastic-stack/master/index.html
>>> 



Re: Exception parsing "SELECT @@character_set_server"

2022-03-28 Thread Julian Hyde
Thanks, Justin. For now, it’s enough to know that they have types.

If we hit the requirement that Calcite’s validator needs to know that (say) 
there is a variable called ‘FOO’ and its type is ’TIMESTAMP’ then we can use a 
plug-in map or even the SqlOperatorTable (modeling variables as functions with 
no arguments, kind of like CURRENT_TIMESTAMP today).   

> On Mar 28, 2022, at 1:52 PM, Justin Swanhart  wrote:
> 
> Hi,
> 
> They are strongly typed. Many are strings that can only be set to certain
> legal values (like optmizer_switch). These variables can also be used in
> MySQL SET statements. Note that there is no way to easily type the
> variables except by examining their values, in the performance_schema (at
> least in 8.0). MySQL has some special values like ON
> (set @@session.unique_checks=ON). 1 and 0 are supported in place of ON or
> OFF.
> 
> There are 652 global variables in MySQL 8.0.27 enumerated in the attached
> fiddle. Each MySQL version adds (and sometimes removes) global variables
> so this is probably hard to maintain on the Calcite side except to maybe
> have some generic support for a MySQL global variable.
> 
> [1]
> https://dbfiddle.uk/?rdbms=mysql_8.0=639e08374d2218ab85e16d8210538d88 
> <https://dbfiddle.uk/?rdbms=mysql_8.0=639e08374d2218ab85e16d8210538d88>
> 
> On Mon, Mar 28, 2022 at 3:08 PM Julian Hyde  <mailto:jhyde.apa...@gmail.com>> wrote:
> 
>> A couple more questions. Do these variables have well-defined types? Do
>> they have to be valid expressions? If the answer to either of these
>> questions is ’no’ then maybe they are what Oracle calls ’substitution
>> variables’. Substitution variables have a behavior more like C macros than
>> real variables and we would not be able to handle them easily.
>> 
>> Julian
>> 
>> [1]
>> https://blogs.oracle.com/opal/post/sql*plus-substitution-variables-define-variables-and-parameters-in-sql-queries#2
>> <
>> https://blogs.oracle.com/opal/post/sql*plus-substitution-variables-define-variables-and-parameters-in-sql-queries#2
>>  
>> <https://blogs.oracle.com/opal/post/sql*plus-substitution-variables-define-variables-and-parameters-in-sql-queries#2>>
>> 
>> 
>>> On Mar 28, 2022, at 12:02 PM, Julian Hyde 
>> wrote:
>>> 
>>> We’d be open to adding support, but it must not be the default behavior,
>> so there will be need to be some kind of flag.
>>> 
>>> Can you log a JIRA case with the subject ’Support variables with “@" and
>> “@@" prefixes (like MySQL)’. Then we can write a specification and you can
>> submit a pull request.
>>> 
>>> At the RexNode level I think there are already constructs for
>> referencing variables to this change would be confined to the parser.
>>> 
>>> Julian
>>> 
>>> 
>>>> On Mar 28, 2022, at 4:04 AM, Adolfo Ochagavía > <mailto:ado...@ochagavia.nl <mailto:ado...@ochagavia.nl>>> wrote:
>>>> 
>>>> Would you be open to a patch to add proper support for this kind of
>> MySQL variables, or do you consider it to be outside the scope of Calcite?
>>>> 
>>>> On 2022/03/22 21:52:09 Justin Swanhart wrote:
>>>>> MySQL support two categories of variables, user variables which are
>>>>> prefixed with the @character and session/global SERVER variables which
>> are
>>>>> prefixed with @@.
>>>>> 
>>>>> You can also access them via:
>>>>> 
>>>>> Select @@session.session_var;
>>>>> Select @@global.global_var;
>>>>> select @@session_or_global_var;
>>>>> 
>>>>> for example:
>>>>> mysql> select @@warp_adjust_table_stats_for_joins;
>>>>> +-+
>>>>> | @@warp_adjust_table_stats_for_joins |
>>>>> +-+
>>>>> | 1 |
>>>>> +-+
>>>>> 1 row in set (0.00 sec)
>>>>> 
>>>>> mysql> set warp_adjust_table_stats_for_joins= false;
>>>>> Query OK, 0 rows affected (0.00 sec)
>>>>> 
>>>>> mysql> select @@warp_adjust_table_stats_for_joins;
>>>>> +-+
>>>>> | @@warp_adjust_table_stats_for_joins |
>>>>> +-----+
>>>>> | 0 |
>>>>> +-+
>>>>> 1 row in set (0.00 sec)
>>>>> 
>>>>&g

Re: [DISCUSS] Best practice for synchronizing master and site branches

2022-03-28 Thread Julian Hyde
>>>> This would negate the need to build and publish the site manually and
>>>>>>> simplify the process as we always only commit to master. As an added
>>>>>>> bonus, we if we keep the site branch, but automate the process, maybe
>>>> we
>>>>>>> can lock the site branch so that only CI can push to it. The downside
>>>> of
>>>>>>> course, is that we're relying on heuristics for the partial build, so
>>>>>>> there's some "magic" to it.
>>>>>>> 
>>>>>>> Francis
>>>>>>> 
>>>>>>> 
>>>>>>> [1] https://issues.apache.org/jira/browse/CALCITE-3129
>>>>>>> 
>>>>>>> On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
>>>>>>>> Hello,
>>>>>>>> 
>>>>>>>> Thanks for starting this discussion Liya. It is important to find
>>>> which
>>>>>>>> parts of the process are unclear and improve them if possible.
>>>>>>>> 
>>>>>>>> The current procedure for updating the website remains unchanged and
>>>> it
>>>>>>> is
>>>>>>>> documented here:
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
>>>>>>>> 
>>>>>>>> If the procedure is not followed, which has happened a few times in
>>>> the
>>>>>>>> past, meaning that someone commits directly in site without
>>>> committing
>>>>> in
>>>>>>>> master then we will have commits in site that may get lost forever.
>>>>>>>> When we discover such commits we should port them to master. The
>>>>>>>> cherry-pick now goes in the opposite direction (from site to master).
>>>>>>>> This is usually discovered/done by the release manager and that's why
>>>>> we
>>>>>>>> have the respective instructions in the howto [1].
>>>>>>>> 
>>>>>>>> After a release we don't care much what happens because master and
>>>> site
>>>>>>>> should be equal. As Francis pointed out this is usually done with a
>>>>> force
>>>>>>>> push.
>>>>>>>> 
>>>>>>>> Regarding Julian's question the commit hashes before the force pushes
>>>>>>> done
>>>>>>>> by Liya are the following (according to commits@calcite):
>>>>>>>> * master -> dcbc493bf699d961427952c5efc047b76d859096
>>>>>>>> * site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Stamatis
>>>>>>>> 
>>>>>>>> [1]
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696
>>>>>>>> 
>>>>>>>> On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde 
>>>> wrote:
>>>>>>>> 
>>>>>>>>> Does anyone know (or could find out) the SHA of the master and site
>>>>>>>>> branches at the time that Fan attempted to move the site changes
>>>> over?
>>>>>>>>> If so, we could recreate the same environment, and figure out a set
>>>> of
>>>>>>>>> git commands that would have worked then and will work for the next
>>>>>>>>> release manager. This process is safe because we can do these
>>>>>>>>> experiments in a local git sandbox, without pushing to any remote.
>>>>>>>>> 
>>>>>>>>> On Fri, Mar 25, 2022 at 6:09 AM Fan Liya 
>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Francis,
>>>>>>>>>> 
>>>>>>>>>> Thanks for your feedback.
>>>>>>>>>> 
>>>>>>>>>> It seems we should choose option 2.
>>>>>>>>>> In addition, it seems less risky to run "git push --force" commands
>>>>> in
>>>>>>>>>> the site branch.
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Liya Fan
>>>>>>>>>> 
>>>>>>>>>> Francis Chuang  于2022年3月25日周五 12:14写道:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Liya,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for bringing this up. We have always done the following
>>>> when
>>>>>>>>>>> committing:
>>>>>>>>>>> 1. Always commit to master.
>>>>>>>>>>> 2. If we need to publish the change to the site now (for example,
>>>>> new
>>>>>>>>>>> committer or announcement), cherry-pick the change into the site
>>>>>>> branch
>>>>>>>>>>> and publish it.
>>>>>>>>>>> 3. After a release, make the site branch the same as master (git
>>>>> reset
>>>>>>>>>>> --hard master) and force push (git push --force origin site).
>>>>>>>>>>> 
>>>>>>>>>>> Francis
>>>>>>>>>>> 
>>>>>>>>>>> On 25/03/2022 3:03 pm, Fan Liya wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>> 
>>>>>>>>>>>> As part of the release process, we need to synchronize the master
>>>>> and
>>>>>>>>>>>> site branches (Please see
>>>>>>>>>>>> 
>>>>>>>>> 
>>>> https://calcite.apache.org/docs/howto.html#making-a-release-candidate
>>>>> ).
>>>>>>>>>>>> Usually, the site is behind the master branch by some commits.
>>>>>>>>>>>> If the existing commits in the site branch are in the same order
>>>> as
>>>>>>>>> in
>>>>>>>>>>>> the master branch, the task is easy: just switch to the site
>>>>> branch,
>>>>>>>>>>>> and run
>>>>>>>>>>>> 
>>>>>>>>>>>> git rebase master
>>>>>>>>>>>> 
>>>>>>>>>>>> However, if some commits are in different orders, it can be
>>>> tricky.
>>>>>>>>>>>> For example, the master branch may have the following commits (in
>>>>>>>>>>>> order):
>>>>>>>>>>>> 
>>>>>>>>>>>> A, B, X1, X2, ... , Xn.
>>>>>>>>>>>> 
>>>>>>>>>>>> and the site branch may have the following commits (in order):
>>>>>>>>>>>> 
>>>>>>>>>>>> B, A, X1, X2.
>>>>>>>>>>>> 
>>>>>>>>>>>> Basically we have two choices:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. We can live with the out of order commits, because after
>>>>>>>>>>>> cherry-picking commits X3, X4, ... , Xn to the site branch, the
>>>>> file
>>>>>>>>>>>> contents will be consistent.
>>>>>>>>>>>> 
>>>>>>>>>>>> The problem is that, since the two branches have diverged, we
>>>>> cannot
>>>>>>>>>>>> use the rebase command. Instead, we have to manually cherry-pick
>>>>>>>>>>>> commits individually, which requires large effort. In addition,
>>>> for
>>>>>>>>>>>> any subsequent release processes, we have to manually cherry-pick
>>>>>>>>> each
>>>>>>>>>>>> commit.
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. We need to make the commits order consistent, which will make
>>>> it
>>>>>>>>>>>> easy for subsequent releases.
>>>>>>>>>>>> However, the problem is that, to make the commits order
>>>> consistent,
>>>>>>>>>>>> some git force push command is unavoidable, which is risky to
>>>> some
>>>>>>>>>>>> extent.
>>>>>>>>>>>> 
>>>>>>>>>>>> So what is the recommended way to do this? Thanks in advance for
>>>>>>>>> your feedback!
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Liya Fan
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 



Re: Contributing a custom SQL dialect of Firebolt to Calcite

2022-03-28 Thread Julian Hyde
You did receive a response[1]. But you didn't receive it because you
are not subscribed to this list. Please subscribe[2].

Julian

[1] https://lists.apache.org/thread/39obrhwpd95bvhc9sb2n4z6zrrwbc21x

[2] https://calcite.apache.org/community/#mailing-lists

On Mon, Mar 28, 2022 at 1:11 PM Raghav Sharma
 wrote:
>
> Gentle reminder!
> We are expecting a response so that we can move ahead with this potential
> contribution.
>
>
> Thanks & Regards,
> Raghav Sharma | SDE 1 | Data Engineering
> M: +91 7087637086 I Mail: ragh...@sigmoidanalytics.com
> 
>
>
> On Thu, Mar 24, 2022 at 3:47 PM Raghav Sharma 
> wrote:
>
> > Greetings from Sigmoid and Firebolt!
> >
> >
> > This mail is in reference towards contributing a custom sql dialect to
> > Calcite.
> >
> >
> > PFB some queries that we have regarding testing the dialect and further
> > for contributing:
> >
> >
> > 1. We have developed a SQL dialect for Firebolt. Need to test it against
> > the database using calcite. How can we do so? Is this something that can be
> > pulled off using the JDBC driver(if so, how?) or is there another way?
> >
> >
> > 2. There are some test files that contain tests for specific dialects.
> > Before contributing, should we alter those for Firebolt(if required) or is
> > it something that will be taken care of by the Calcite team?
> >
> >
> > 3. Is opening a JIRA case mandatory? Can we raise a PR directly to add
> > Firebolt to the master branch of Calcite? Please guide us if there’s an
> > alternate option.
> >
> >
> > We are willing to connect over a call as well if that works for you.
> >
> >
> >
> > Thanks & Regards,
> > Raghav Sharma | SDE 1 | Data Engineering
> > M: +91 7087637086 I Mail: ragh...@sigmoidanalytics.com
> > 
> >
> --
> Thanks & Regards,
> Raghav Sharma | SDE 1 | Data Engineering
> M: +91 7087637086 I Mail: ragh...@sigmoidanalytics.com
> 
>
> --
>
>


Re: Exception parsing "SELECT @@character_set_server"

2022-03-28 Thread Julian Hyde
We’d be open to adding support, but it must not be the default behavior, so 
there will be need to be some kind of flag.

Can you log a JIRA case with the subject ’Support variables with “@" and “@@" 
prefixes (like MySQL)’. Then we can write a specification and you can submit a 
pull request.

At the RexNode level I think there are already constructs for referencing 
variables to this change would be confined to the parser.  

Julian


> On Mar 28, 2022, at 4:04 AM, Adolfo Ochagavía  wrote:
> 
> Would you be open to a patch to add proper support for this kind of MySQL 
> variables, or do you consider it to be outside the scope of Calcite?
> 
> On 2022/03/22 21:52:09 Justin Swanhart wrote:
>> MySQL support two categories of variables, user variables which are
>> prefixed with the @character and session/global SERVER variables which are
>> prefixed with @@.
>> 
>> You can also access them via:
>> 
>> Select @@session.session_var;
>> Select @@global.global_var;
>> select @@session_or_global_var;
>> 
>> for example:
>> mysql> select @@warp_adjust_table_stats_for_joins;
>> +-+
>> | @@warp_adjust_table_stats_for_joins |
>> +-+
>> | 1 |
>> +-+
>> 1 row in set (0.00 sec)
>> 
>> mysql> set warp_adjust_table_stats_for_joins= false;
>> Query OK, 0 rows affected (0.00 sec)
>> 
>> mysql> select @@warp_adjust_table_stats_for_joins;
>> +-+
>> | @@warp_adjust_table_stats_for_joins |
>> +-+
>> | 0 |
>> +-+
>> 1 row in set (0.00 sec)
>> 
>> mysql> select @@session.warp_adjust_table_stats_for_joins;
>> +-+
>> | @@session.warp_adjust_table_stats_for_joins |
>> +-+
>> | 0 |
>> +-+
>> 1 row in set (0.00 sec)
>> 
>> mysql> select @@global.warp_adjust_table_stats_for_joins;
>> +----+
>> | @@global.warp_adjust_table_stats_for_joins |
>> ++
>> | 1 |
>> ++
>> 1 row in set (0.00 sec)
>> 
>> 
>> On Tue, Mar 22, 2022 at 5:02 PM Julian Hyde > <http://gmail.com/>> wrote:
>> 
>>> The ‘@@‘ prefix is not standard SQL, and Calcite does not support it.
>>> 
>>> Can you do some research to find out how MySQL handles it. Is it
>>> considered to be part of the variable name? Or is it a prefix (like $ in
>>> bash) that means ‘what comes next is a variable’? In other words, does the
>>> parser say there is a reference to a variable called
>>> '@@character_set_server’ or a variable called ‘character_set_server’? And
>>> is ‘@‘ a legal part of a variable name?
>>> 
>>> Also, is it handled by the core SQL parser or by a preprocessor?
>>> 
>>> Julian
>>> 
>>> 
>>>> On Mar 22, 2022, at 2:17 AM, Adolfo Ochagavía >>> <http://ochagavia.nl/>>
>>> wrote:
>>>> 
>>>> Hi there,
>>>> 
>>>> I am writing a MySQL-compatible server that talks the MySQL protocol.
>>> Some clients are sending special queries to autoconfigure themselves, like
>>> "SELECT @@character_set_server". I would like to use calcite to parse such
>>> queries, but parsing fails with an exception, seemingly related to the
>>> usage of "@@" in variable names. Is this unsupported or am I doing
>>> something wrong?
>>>> 
>>>> The code:
>>>>> var config = SqlParser.Config.DEFAULT.withLex(Lex.MYSQL);
>>>>> var parser = SqlParser.create("SELECT @@character_set_server", config);
>>>>> var parsed = parser.parseQuery();
>>>> 
>>>> The exception: org.apache.calcite.sql.parser.SqlParseException: Lexical
>>> error at line 1, column 9. Encountered: "@" (64), after : ""
>>>> 
>>>> Any help is appreciated!
>>>> Adolfo



Re: Exception parsing "SELECT @@character_set_server"

2022-03-28 Thread Julian Hyde
A couple more questions. Do these variables have well-defined types? Do they 
have to be valid expressions? If the answer to either of these questions is 
’no’ then maybe they are what Oracle calls ’substitution variables’. 
Substitution variables have a behavior more like C macros than real variables 
and we would not be able to handle them easily.

Julian

[1] 
https://blogs.oracle.com/opal/post/sql*plus-substitution-variables-define-variables-and-parameters-in-sql-queries#2
 
<https://blogs.oracle.com/opal/post/sql*plus-substitution-variables-define-variables-and-parameters-in-sql-queries#2>
 

> On Mar 28, 2022, at 12:02 PM, Julian Hyde  wrote:
> 
> We’d be open to adding support, but it must not be the default behavior, so 
> there will be need to be some kind of flag.
> 
> Can you log a JIRA case with the subject ’Support variables with “@" and “@@" 
> prefixes (like MySQL)’. Then we can write a specification and you can submit 
> a pull request.
> 
> At the RexNode level I think there are already constructs for referencing 
> variables to this change would be confined to the parser.  
> 
> Julian
> 
> 
>> On Mar 28, 2022, at 4:04 AM, Adolfo Ochagavía > <mailto:ado...@ochagavia.nl>> wrote:
>> 
>> Would you be open to a patch to add proper support for this kind of MySQL 
>> variables, or do you consider it to be outside the scope of Calcite?
>> 
>> On 2022/03/22 21:52:09 Justin Swanhart wrote:
>>> MySQL support two categories of variables, user variables which are
>>> prefixed with the @character and session/global SERVER variables which are
>>> prefixed with @@.
>>> 
>>> You can also access them via:
>>> 
>>> Select @@session.session_var;
>>> Select @@global.global_var;
>>> select @@session_or_global_var;
>>> 
>>> for example:
>>> mysql> select @@warp_adjust_table_stats_for_joins;
>>> +-+
>>> | @@warp_adjust_table_stats_for_joins |
>>> +-+
>>> | 1 |
>>> +-+
>>> 1 row in set (0.00 sec)
>>> 
>>> mysql> set warp_adjust_table_stats_for_joins= false;
>>> Query OK, 0 rows affected (0.00 sec)
>>> 
>>> mysql> select @@warp_adjust_table_stats_for_joins;
>>> +-+
>>> | @@warp_adjust_table_stats_for_joins |
>>> +-+
>>> | 0 |
>>> +-+
>>> 1 row in set (0.00 sec)
>>> 
>>> mysql> select @@session.warp_adjust_table_stats_for_joins;
>>> +-+
>>> | @@session.warp_adjust_table_stats_for_joins |
>>> +-+
>>> | 0 |
>>> +-+
>>> 1 row in set (0.00 sec)
>>> 
>>> mysql> select @@global.warp_adjust_table_stats_for_joins;
>>> ++
>>> | @@global.warp_adjust_table_stats_for_joins |
>>> ++
>>> | 1 |
>>> ++
>>> 1 row in set (0.00 sec)
>>> 
>>> 
>>> On Tue, Mar 22, 2022 at 5:02 PM Julian Hyde >> <http://gmail.com/>> wrote:
>>> 
>>>> The ‘@@‘ prefix is not standard SQL, and Calcite does not support it.
>>>> 
>>>> Can you do some research to find out how MySQL handles it. Is it
>>>> considered to be part of the variable name? Or is it a prefix (like $ in
>>>> bash) that means ‘what comes next is a variable’? In other words, does the
>>>> parser say there is a reference to a variable called
>>>> '@@character_set_server’ or a variable called ‘character_set_server’? And
>>>> is ‘@‘ a legal part of a variable name?
>>>> 
>>>> Also, is it handled by the core SQL parser or by a preprocessor?
>>>> 
>>>> Julian
>>>> 
>>>> 
>>>>> On Mar 22, 2022, at 2:17 AM, Adolfo Ochagavía >>>> <http://ochagavia.nl/>>
>>>> wrote:
>>>>> 
>>>>> Hi there,
>>>>> 
>>>>> I am writing a MySQL-compatible server that talks the MySQL protocol.
>>>> Some clients are sending special queries to autoconfigure themselves, like
>>>> "SELECT @@character_set_server". I would like to use calcite to parse such
>>>> queries, but parsing fails with an exception, seemingly related to the
>>>> usage of "@@" in variable names. Is this unsupported or am I doing
>>>> something wrong?
>>>>> 
>>>>> The code:
>>>>>> var config = SqlParser.Config.DEFAULT.withLex(Lex.MYSQL);
>>>>>> var parser = SqlParser.create("SELECT @@character_set_server", config);
>>>>>> var parsed = parser.parseQuery();
>>>>> 
>>>>> The exception: org.apache.calcite.sql.parser.SqlParseException: Lexical
>>>> error at line 1, column 9. Encountered: "@" (64), after : ""
>>>>> 
>>>>> Any help is appreciated!
>>>>> Adolfo
> 



Re: JDBC join order question

2022-03-28 Thread Julian Hyde
Xiaoying, I know why you didn’t receive the email. Your reply went through 
moderation, which indicates that you are not subscribed to dev@calcite. Please 
subscribe.

> On Mar 28, 2022, at 11:10 AM, Xiaoying Wang  
> wrote:
> 
> Hi Benchao,
> 
> Thank you so much for the reply! I don't know why I didn't receive the
> email but I found your answer at the archive website.
> 
> I tried to add the `JoinConditionPushRule` as you suggested but the
> filter cannot be pushed from the second join to the first, and the
> code throws the same error.
> 
> It is because the initial state of my query is `(R join1 S) join2 T`,
> and the conditions pushed to join2 by `FilterIntoJoinRule` are `R.key
> = T.key and S.key = T.key`. Neither `R.key = T.key` nor `S.key =
> T.key` can be pushed into join1 because they both require `T.key`. So
> both `leftBitmap.contains(inputBits)` and
> `rightBitmap.contains(inputBits)` at [1] are false and the condition
> cannot be pushed down into the first join.
> 
> Do you think I need to add other rules to solve the problem? (I tried
> to add several rules to alter the join order including
> `JoinAssociateRule`, `JoinCommuteRule`,
> `JoinPushThroughJoinRule.RIGHT/LEFT` but none of them were fired.)
> 
> Best,
> 
> Xiaoying
> 
> [1] 
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/RelOptUtil.java#L2838
> 
> 
> On 2022/03/27 00:41:57 Benchao Li wrote:
>> Hi Xiaoying,
>> 
>> You can try to add `JoinConditionPushRule`[1] to solve the problem.
>> 
>> According the rules you provide, you only have `FilterIntoJoinRule`,
>> hence the filters will only be push into the second Join.
>> The exception you see indicates that the first Join has no condition,
>> and it cannot be handled by ENUMERABLE convention as you can see here[2]
>> 
>> [1]
>> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/FilterJoinRule.java#L281
>> [2]
>> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcRules.java#L304
>> 
>> 
>> Xiaoying Wang  于2022年3月26日周六 04:46写道:
>> 
>>> Hi all,
>>> 
>>> 
>>> I have a query joins three tables (star schema, T is the fact
>>> table): "select * from R, S, T where R.key = T.key and S.key = T.key" and
>>> want to use calcite to parse and generate the physical plan.
>>> 
>>> 
>>> When I pass the query to a volcano planner, it throws the
>>> CannotPlanException:
>>> 
>>> 
>>> Exception in thread "main"
>>> org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not
>>> enough rules to produce a node with desired properties:
>>> convention=ENUMERABLE.
>>> 
>>> Missing conversions are LogicalJoin[convention: NONE -> JDBC.DB1] (2 cases)
>>> 
>>> There are 2 empty subsets:
>>> Empty subset 0: rel#34:RelSubset#2.JDBC.DB1, the relevant part of the
>>> original plan is as follows
>>> 14:LogicalJoin(condition=[true], joinType=[inner])
>>>  1:JdbcTableScan(subset=[rel#12:RelSubset#0.JDBC.DB1], table=[[DB1, R]])
>>>  3:JdbcTableScan(subset=[rel#13:RelSubset#1.JDBC.DB1], table=[[DB1, S]])
>>> 
>>> Empty subset 1: rel#36:RelSubset#4.JDBC.DB1, the relevant part of the
>>> original plan is as follows
>>> 17:LogicalJoin(condition=[true], joinType=[inner])
>>>  14:LogicalJoin(subset=[rel#15:RelSubset#2.NONE], condition=[true],
>>> joinType=[inner])
>>>1:JdbcTableScan(subset=[rel#12:RelSubset#0.JDBC.DB1], table=[[DB1, R]])
>>>3:JdbcTableScan(subset=[rel#13:RelSubset#1.JDBC.DB1], table=[[DB1, S]])
>>>  7:JdbcTableScan(subset=[rel#16:RelSubset#3.JDBC.DB1], table=[[DB1, T]])
>>> 
>>> 
>>> 
>>> I tried to rewrite the query, and found that as long as T is not the last
>>> relation (e.g. "select * from T, R, S where R.key = T.key and S.key =
>>> T.key"), there is no error. I think it means that the join order in the
>>> initial logical plan makes the difference, so I tried to add the
>>> JoinAssociateRule. However, I found this rule was not applied by the
>>> optimizer because of the convocation not match. Using the above example,
>>> for 17:LogicalJoin the convocation of 14:LogicalJoin and 7:JdbcTableScan
>>> are not the same ( at here:
>>> 
>>> https://github.com/apache/calcite/blob/e42b85a45bd16dd58db1546736e653deda5463fe/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoRuleCall.java#L358
>>> ).
>>> 
>>> 
>>> These are the rules in the planner:
>>> 
>>> [FilterIntoJoinRule, JoinAssociateRule, JoinCommuteRule,
>>> JdbcToEnumerableConverterRule(in:JDBC.DB1,out:ENUMERABLE),
>>> JdbcJoinRule(in:NONE,out:JDBC.DB1), JdbcProjectRule(in:NONE,out:JDBC.DB1),
>>> JdbcFilterRule(in:NONE,out:JDBC.DB1),
>>> JdbcAggregateRule(in:NONE,out:JDBC.DB1),
>>> JdbcSortRule(in:NONE,out:JDBC.DB1), JdbcUnionRule(in:NONE,out:JDBC.DB1),
>>> JdbcIntersectRule(in:NONE,out:JDBC.DB1),
>>> JdbcMinusRule(in:NONE,out:JDBC.DB1),
>>> JdbcTableModificationRule(in:NONE,out:JDBC.DB1),
>>> JdbcValuesRule(in:NONE,out:JDBC.DB1), FilterSetOpTransposeRule,
>>> ProjectRemoveRule]
>>> 
>>> 
>>> 

Re: [DISCUSS] Best practice for synchronizing master and site branches

2022-03-25 Thread Julian Hyde
Does anyone know (or could find out) the SHA of the master and site
branches at the time that Fan attempted to move the site changes over?
If so, we could recreate the same environment, and figure out a set of
git commands that would have worked then and will work for the next
release manager. This process is safe because we can do these
experiments in a local git sandbox, without pushing to any remote.

On Fri, Mar 25, 2022 at 6:09 AM Fan Liya  wrote:
>
> Hi Francis,
>
> Thanks for your feedback.
>
> It seems we should choose option 2.
> In addition, it seems less risky to run "git push --force" commands in
> the site branch.
>
> Best,
> Liya Fan
>
> Francis Chuang  于2022年3月25日周五 12:14写道:
> >
> > Hi Liya,
> >
> > Thanks for bringing this up. We have always done the following when
> > committing:
> > 1. Always commit to master.
> > 2. If we need to publish the change to the site now (for example, new
> > committer or announcement), cherry-pick the change into the site branch
> > and publish it.
> > 3. After a release, make the site branch the same as master (git reset
> > --hard master) and force push (git push --force origin site).
> >
> > Francis
> >
> > On 25/03/2022 3:03 pm, Fan Liya wrote:
> > > Hi all,
> > >
> > > As part of the release process, we need to synchronize the master and
> > > site branches (Please see
> > > https://calcite.apache.org/docs/howto.html#making-a-release-candidate).
> > > Usually, the site is behind the master branch by some commits.
> > > If the existing commits in the site branch are in the same order as in
> > > the master branch, the task is easy: just switch to the site branch,
> > > and run
> > >
> > > git rebase master
> > >
> > > However, if some commits are in different orders, it can be tricky.
> > > For example, the master branch may have the following commits (in
> > > order):
> > >
> > > A, B, X1, X2, ... , Xn.
> > >
> > > and the site branch may have the following commits (in order):
> > >
> > > B, A, X1, X2.
> > >
> > > Basically we have two choices:
> > >
> > > 1. We can live with the out of order commits, because after
> > > cherry-picking commits X3, X4, ... , Xn to the site branch, the file
> > > contents will be consistent.
> > >
> > > The problem is that, since the two branches have diverged, we cannot
> > > use the rebase command. Instead, we have to manually cherry-pick
> > > commits individually, which requires large effort. In addition, for
> > > any subsequent release processes, we have to manually cherry-pick each
> > > commit.
> > >
> > > 2. We need to make the commits order consistent, which will make it
> > > easy for subsequent releases.
> > > However, the problem is that, to make the commits order consistent,
> > > some git force push command is unavoidable, which is risky to some
> > > extent.
> > >
> > > So what is the recommended way to do this? Thanks in advance for your 
> > > feedback!
> > >
> > > Best,
> > > Liya Fan


Re: [VOTE] Release apache-calcite-avatica-go-5.1.0 (release candidate 0)

2022-03-24 Thread Julian Hyde
+1

Checked signatures and checksums, LICENSE, NOTICE, README.md. Checked that 
tar.gz contains same files as git.

Note #1. I ran

  go get github.com/apache/calcite-avatica-go 


and got the error

  go get github.com/apache/calcite-avatica-go: no matching versions for query 
"upgrade"

I don’t know whether that is expected behavior.

Note #2. When I checked signatures I received the following output:

  gpg: Signature made Wed 23 Mar 2022 07:42:45 PM PDT
  gpg:using RSA key 635665E0BE3F72552910CB74BBE44E923A970AB7
  gpg: Good signature from "Francis Chuang " [unknown]
  gpg: WARNING: This key is not certified with a trusted signature!
  gpg:  There is no indication that the signature belongs to the owner.
  Primary key fingerprint: 6356 65E0 BE3F 7255 2910  CB74 BBE4 4E92 3A97 0AB7

Francis, Did you perhaps use a different signing key than the one in KEYS?

Julian
 

> On Mar 24, 2022, at 3:54 AM, Stamatis Zampetakis  wrote:
> 
> Ubuntu 20.04.4 LTS, docker-compose version 1.25.5, build 8a1c60f6
> 
> * Checked signatures and checksums OK
> * Went over release note OK
> * Built from git tag and run tests (docker-compose run test) OK
> * Built from source artifacts and run tests OK
> * Checked diff between repo and release sources OK
> * Checked LICENSE, NOTICE, README.md OK
> 
> +1 (binding)
> 
> Best,
> Stamatis
> 
> 
> On Thu, Mar 24, 2022 at 3:45 AM Francis Chuang 
> wrote:
> 
>> Hi all,
>> 
>> I have created a release for Apache Calcite Avatica Go 5.1.0, release
>> candidate 0.
>> 
>> Thanks to everyone who has contributed to this release. The release
>> notes are available here:
>> 
>> https://github.com/apache/calcite-avatica-go/blob/c232d7bafd26d4798a2e7fc8335eff1259c0178d/site/_docs/go_history.md
>> 
>> The commit to be voted on:
>> 
>> https://gitbox.apache.org/repos/asf?p=calcite-avatica-go.git;a=commit;h=c232d7bafd26d4798a2e7fc8335eff1259c0178d
>> 
>> The hash is c232d7bafd26d4798a2e7fc8335eff1259c0178d
>> 
>> The artifacts to be voted on are located here:
>> 
>> https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-avatica-go-5.1.0-rc0/
>> 
>> The hashes of the artifacts are as follows:
>> src.tar.gz 5BCA4C08 1C5AD9AD 633355C8 582D7347 602E7088 08B1A87C
>> C868BA8B 98C6B4FB ED88A6D9 B111BFB9 3F3A0E45 FDEF7E03 E797E02E FD7CFBFC
>> F3CE0EA8 18C00BAA
>> 
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/francischuang.asc
>> 
>> Instructions for running the test suite is located here:
>> 
>> https://github.com/apache/calcite-avatica-go/blob/c232d7bafd26d4798a2e7fc8335eff1259c0178d/site/develop/avatica-go.md#testing
>> 
>> Please vote on releasing this package as Apache Calcite Avatica Go 5.1.0.
>> 
>> To run the tests without a Go environment, install docker and
>> docker-compose. Then, in the root of the release's directory, run:
>> docker-compose run test
>> 
>> When the test suite completes, run "docker-compose down" to remove and
>> shutdown all the containers.
>> 
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 PMC votes are cast.
>> 
>> [ ] +1 Release this package as Apache Calcite Avatica Go 5.1.0
>> [ ]  0 I don't feel strongly about it, but I'm okay with the release
>> [ ] -1 Do not release this package because...
>> 
>> 
>> Here is my vote:
>> 
>> +1 (binding)
>> 
>> Francis
>> 



Re: Contributing a custom SQL dialect of Firebolt to Calcite

2022-03-24 Thread Julian Hyde
Raghav,

Thanks for considering making a contribution. We should be able to resolve most 
questions on this list (if you haven’t already, please subscribe, so that your 
emails don’t need to go through moderation).

Yes, we require a JIRA case for contributions except for cosmetic changes (e.g. 
fixing typos). Create a PR and reference it in the JIRA case.

By a ‘dialect’, I assume you mean a plugin so that Calcite’s JDBC adapter can 
generate SQL that Firebolt is able to parse. If so, the best path is to follow 
the template for other recently added dialects, e.g. Exasol [1], Clickhouse 
[2]. Take a look at the JIRA case and also the files changed in the attached PR.

You should add test cases to RelToSqlConverterTest for any area of syntax where 
Firebolt’s SQL differs significantly from existing dialects. But for most 
dialects there aren’t many of those areas.

We don’t require that the tests run against a live instance of Firebolt. It 
would be a good idea for you to try to run Calcite’s test suite against 
Firebolt. I would add a value “FIREBOLT” to TEST_DB in 
CalciteSystemProperty.java, run something like ‘./gradlew 
-Dcalcite.test.db=FIREBOLT test’, and then fix whatever is broken.

Julian

[1] https://issues.apache.org/jira/browse/CALCITE-4614 
 

[2]  https://issues.apache.org/jira/browse/CALCITE-2157 
 

> On Mar 24, 2022, at 3:17 AM, Raghav Sharma  
> wrote:
> 
> Greetings from Sigmoid and Firebolt!
> 
> 
> This mail is in reference towards contributing a custom sql dialect to
> Calcite.
> 
> 
> PFB some queries that we have regarding testing the dialect and further for
> contributing:
> 
> 
> 1. We have developed a SQL dialect for Firebolt. Need to test it against
> the database using calcite. How can we do so? Is this something that can be
> pulled off using the JDBC driver(if so, how?) or is there another way?
> 
> 
> 2. There are some test files that contain tests for specific dialects.
> Before contributing, should we alter those for Firebolt(if required) or is
> it something that will be taken care of by the Calcite team?
> 
> 
> 3. Is opening a JIRA case mandatory? Can we raise a PR directly to add
> Firebolt to the master branch of Calcite? Please guide us if there’s an
> alternate option.
> 
> 
> We are willing to connect over a call as well if that works for you.
> 
> 
> 
> Thanks & Regards,
> Raghav Sharma | SDE 1 | Data Engineering
> M: +91 7087637086 I Mail: ragh...@sigmoidanalytics.com
> 
> 
> -- 
> 
> 



Re: Calcite SQL parser error

2022-03-23 Thread Julian Hyde
Already logged as https://issues.apache.org/jira/browse/CALCITE-4401 
. Contributions welcome.


> On Mar 22, 2022, at 4:47 PM, Nikhil Goyal  wrote:
> 
> Hi folks,
> I am running into this error:
> 
> Exception in thread "main" java.lang.RuntimeException: No list started
> at
> org.apache.calcite.sql.pretty.SqlPrettyWriter.sep(SqlPrettyWriter.java:1079)
> at
> org.apache.calcite.sql.pretty.SqlPrettyWriter.sep(SqlPrettyWriter.java:1074)
> at org.apache.calcite.sql.SqlJoin$SqlJoinOperator.unparse(SqlJoin.java:214)
> at org.apache.calcite.sql.SqlDialect.unparseCall(SqlDialect.java:470)
> at org.apache.calcite.sql.SqlCall.unparse(SqlCall.java:104)
> at org.apache.calcite.sql.SqlNode.toSqlString(SqlNode.java:151)
> at org.apache.calcite.sql.SqlNode.toString(SqlNode.java:126)
> at java.lang.String.valueOf(String.java:2994)
> at java.io.PrintStream.println(PrintStream.java:821)
> at scala.Console$.println(Console.scala:271)
> at scala.Predef$.println(Predef.scala:404)
> at com.cdl.common.sparksql.SQLtoDF$.main(SQLtoDF.scala:53)
> at com.cdl.common.sparksql.SQLtoDF.main(SQLtoDF.scala)
> 
> This is the code:
> 
>val query =
>  "SELECT e.first_name AS FirstName, s.salary AS Salary from employee
> AS e join salary AS s on e.emp_id=s.emp_id where e.organization = 'Tesla'
> and s.organization = 'Tesla'"
> 
>val config = SqlParser.configBuilder.setCaseSensitive(false).build
> 
>val parser = SqlParser.create(query, config)
>val node = parser.parseQuery()
>val select = node.asInstanceOf[SqlSelect]
>println(select.getFrom)
> 
> Any idea what is going on?
> 
> Thanks
> Nikhil



Re: Exception parsing "SELECT language"

2022-03-23 Thread Julian Hyde
SELECT CURRENT_TIMESTAMP

Is a synactically and semantically valid query (because CURRENT_TIMESTAMP 
happens to be a built-in function that doesn’t take parentheses or arguments). 
On the other hand

  SELECT x

Is syntactically valid but semantically invalid (because there is no ‘x’ in 
global scope).  Lastly,

  SELECT language

Is syntactically invalid (and gives a different error to ’SELECT x’). The 
following patch demonstrates:

diff --git 
a/testkit/src/main/java/org/apache/calcite/sql/parser/SqlParserTest.java 
b/testkit/src/main/java/org/apache/calcite/sql/parser/SqlParserTest.java
index 0dd79138ea..e45809105a 100644
--- a/testkit/src/main/java/org/apache/calcite/sql/parser/SqlParserTest.java
+++ b/testkit/src/main/java/org/apache/calcite/sql/parser/SqlParserTest.java
@@ -8769,6 +8769,14 @@ private static Consumer> 
checkWarnings(
 sql(sql).ok(expected);
   }
 
+  @Test void testLanguage() {
+final String sql = "select x";
+sql(sql).ok("SELECT `X`");
+
+final String sql2 = "select language";
+sql(sql2).ok("SELECT `X`");
+  }
+
   @Test void testMatchRecognizePatternSkip5() {
 final String sql = "select *\n"
 + "  from t match_recognize\n"




> On Mar 23, 2022, at 11:20 AM, Michael Mior  wrote:
> 
> I was hinting that there might be a syntax issue caused by a missing part
> of the query. I would not have assumed "SELECT foobar" is a valid query
> both in syntax and semantics.
> --
> Michael Mior
> mm...@apache.org
> 
> 
> Le mer. 23 mars 2022 à 14:12, Julian Hyde  a écrit :
> 
>> If there’s nowhere for LANGUAGE to come from that would make it
>> semantically invalid but wouldn’t affect the syntactic validity. (In other
>> words, you’d get an error from SqlValidator but not from SqlParser.)
>> 
>> In this case, the problem is that LANGUAGE is a reserved keyword in both
>> standard SQL and Calcite.
>> 
>> Julian
>> 
>> 
>>> On Mar 23, 2022, at 11:05 AM, Michael Mior  wrote:
>>> 
>>> What would you expect this query to return? You haven't specified a FROM
>>> clause, so there's no indication where "language" should come from.
>>> 
>>> --
>>> Michael Mior
>>> mm...@apache.org
>>> 
>>> 
>>> Le mer. 23 mars 2022 à 14:02, Adolfo Ochagavía  a
>>> écrit :
>>> 
>>>> Hi there,
>>>> 
>>>> I am trying to find my way using Calcite to parse SQL queries, and was
>>>> surprised to find out that parsing the query "SELECT language" fails
>> with
>>>> an exception.
>>>> 
>>>> This is the code:
>>>>> var config = SqlParser.Config.DEFAULT.withLex(Lex.MYSQL);
>>>>> var parser = SqlParser.create("SELECT language", config);
>>>>> var parsed = parser.parseQuery();
>>>> 
>>>> This is the exception:
>>>>> org.apache.calcite.sql.parser.SqlParseException: Encountered ".
>>>> language" at line 1, column 18.
>>>>> Was expecting one of:
>>>>>   
>>>>>   "AS" ...
>>>>>   [the rest is omitted for brevity, but about 60 more lines follow]
>>>> 
>>>> Am I missing something or is this a bug? Note that the query is a
>>>> simplified excerpt of an autoconfiguration query issued by MySQL's JDBC
>>>> driver and seems to be handled well by MySQL servers. Below I am pasting
>>>> the full query, in case someone would like to see the original:
>>>>> /* mysql-connector-java-8.0.19 (Revision:
>>>> a0ca826f5cdf51a98356fdfb1bf251eb042f80bf) */SELECT
>>>> @@session.auto_increment_increment AS auto_increment_increment,
>>>> @@character_set_client AS character_set_client,
>>>> @@character_set_connection AS character_set_connection,
>>>> @@character_set_results AS character_set_results,
>>>> @@character_set_server AS character_set_server, @@collation_server AS
>>>> collation_server, @@collation_connection AS collation_connection,
>>>> @@init_connect AS init_connect, @@interactive_timeout AS
>>>> interactive_timeout, @@language AS language, @@license AS license,
>>>> @@lower_case_table_names AS lower_case_table_names,
>>>> @@max_allowed_packet AS max_allowed_packet, @@net_write_timeout AS
>>>> net_write_timeout, @@performance_schema AS performance_schema,
>>>> @@query_cache_size AS query_cache_size, @@query_cache_type AS
>>>> query_cache_type, @@sql_mode AS sql_mode, @@system_time_zone AS
>>>> system_time_zone, @@time_zone AS time_zone, @@tx_isolation AS
>>>> transaction_isolation, @@wait_timeout AS wait_timeout
>>>> 
>>>> Thanks for helping out ;)
>>>> Adolfo
>> 
>> 



Re: Towards Avatica-Go 5.1.0

2022-03-23 Thread Julian Hyde
Thank you, Francis!

> On Mar 22, 2022, at 10:10 PM, Francis Chuang  wrote:
> 
> Hey everyone,
> 
> It's been quite a while since the last Avatica-Go release, more than 2 years!
> 
> I have been doing some housekeeping and upgrading dependencies, etc. I think 
> it's probably a good time to release 5.1.0.
> 
> I believe I have completed most of the housekeeping. I plan to make rc0 
> available for voting in the next few days or early next week, time permitting.
> 
> Francis



Re: Exception parsing "SELECT language"

2022-03-23 Thread Julian Hyde
If there’s nowhere for LANGUAGE to come from that would make it semantically 
invalid but wouldn’t affect the syntactic validity. (In other words, you’d get 
an error from SqlValidator but not from SqlParser.)

In this case, the problem is that LANGUAGE is a reserved keyword in both 
standard SQL and Calcite.

Julian


> On Mar 23, 2022, at 11:05 AM, Michael Mior  wrote:
> 
> What would you expect this query to return? You haven't specified a FROM
> clause, so there's no indication where "language" should come from.
> 
> --
> Michael Mior
> mm...@apache.org
> 
> 
> Le mer. 23 mars 2022 à 14:02, Adolfo Ochagavía  a
> écrit :
> 
>> Hi there,
>> 
>> I am trying to find my way using Calcite to parse SQL queries, and was
>> surprised to find out that parsing the query "SELECT language" fails with
>> an exception.
>> 
>> This is the code:
>>> var config = SqlParser.Config.DEFAULT.withLex(Lex.MYSQL);
>>> var parser = SqlParser.create("SELECT language", config);
>>> var parsed = parser.parseQuery();
>> 
>> This is the exception:
>>> org.apache.calcite.sql.parser.SqlParseException: Encountered ".
>> language" at line 1, column 18.
>>> Was expecting one of:
>>>
>>>"AS" ...
>>>[the rest is omitted for brevity, but about 60 more lines follow]
>> 
>> Am I missing something or is this a bug? Note that the query is a
>> simplified excerpt of an autoconfiguration query issued by MySQL's JDBC
>> driver and seems to be handled well by MySQL servers. Below I am pasting
>> the full query, in case someone would like to see the original:
>>> /* mysql-connector-java-8.0.19 (Revision:
>> a0ca826f5cdf51a98356fdfb1bf251eb042f80bf) */SELECT
>> @@session.auto_increment_increment AS auto_increment_increment,
>> @@character_set_client AS character_set_client,
>> @@character_set_connection AS character_set_connection,
>> @@character_set_results AS character_set_results,
>> @@character_set_server AS character_set_server, @@collation_server AS
>> collation_server, @@collation_connection AS collation_connection,
>> @@init_connect AS init_connect, @@interactive_timeout AS
>> interactive_timeout, @@language AS language, @@license AS license,
>> @@lower_case_table_names AS lower_case_table_names,
>> @@max_allowed_packet AS max_allowed_packet, @@net_write_timeout AS
>> net_write_timeout, @@performance_schema AS performance_schema,
>> @@query_cache_size AS query_cache_size, @@query_cache_type AS
>> query_cache_type, @@sql_mode AS sql_mode, @@system_time_zone AS
>> system_time_zone, @@time_zone AS time_zone, @@tx_isolation AS
>> transaction_isolation, @@wait_timeout AS wait_timeout
>> 
>> Thanks for helping out ;)
>> Adolfo



Re: Exception parsing "SELECT @@character_set_server"

2022-03-22 Thread Julian Hyde
The ‘@@‘ prefix is not standard SQL, and Calcite does not support it.

Can you do some research to find out how MySQL handles it. Is it considered to 
be part of the variable name? Or is it a prefix (like $ in bash) that means 
‘what comes next is a variable’? In other words, does the parser say there is a 
reference to a variable called '@@character_set_server’ or a variable called 
‘character_set_server’? And is ‘@‘ a legal part of a variable name?

Also, is it handled by the core SQL parser or by a preprocessor?

Julian


> On Mar 22, 2022, at 2:17 AM, Adolfo Ochagavía  wrote:
> 
> Hi there,
> 
> I am writing a MySQL-compatible server that talks the MySQL protocol. Some 
> clients are sending special queries to autoconfigure themselves, like "SELECT 
> @@character_set_server". I would like to use calcite to parse such queries, 
> but parsing fails with an exception, seemingly related to the usage of "@@" 
> in variable names. Is this unsupported or am I doing something wrong?
> 
> The code:
>> var config = SqlParser.Config.DEFAULT.withLex(Lex.MYSQL);
>> var parser = SqlParser.create("SELECT @@character_set_server", config);
>> var parsed = parser.parseQuery();
> 
> The exception: org.apache.calcite.sql.parser.SqlParseException: Lexical error 
> at line 1, column 9.  Encountered: "@" (64), after : ""
> 
> Any help is appreciated!
> Adolfo



Re: [VOTE] Release Apache Calcite 1.30.0 (release candidate 3)

2022-03-18 Thread Julian Hyde
+1

Checked checksums and signatures, compared with git repo at same sha, checked 
LICENSE, NOTICE, README, howto.md, release notes, compiled and ran tests on 
Ubuntu 20.04.4 LTS Linux 5.4.0-104-generic x86_64 using JDK 11 and Gradle 7.2 
(noting that the recommended Gradle version is 7.3), ran RAT.

Notes:
 * Liya’s key needs to be signed by one or more of us before the release

Julian


> On Mar 18, 2022, at 8:48 AM, Michael Mior  wrote:
> 
> +1 (binding)
> 
> Checked hash and compiled and ran tests. Thanks for being RM!
> 
> --
> Michael Mior
> mm...@apache.org
> 
> 
> Le mar. 15 mars 2022 à 23:36, Fan Liya  a écrit :
> 
>> Hi all,
>> 
>> I have created a build for Apache Calcite 1.30.0, release
>> candidate 3.
>> 
>> Thanks to everyone who has contributed to this release.
>> 
>> You can read the release notes here:
>> 
>> https://github.com/apache/calcite/blob/calcite-1.30.0-rc3/site/_docs/history.md
>> 
>> The commit to be voted upon:
>> 
>> https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=cd5229b9a371e20de207231d55bcbddf1ff39ec2
>> 
>> Its hash is cd5229b9a371e20de207231d55bcbddf1ff39ec2
>> 
>> Tag:
>> https://github.com/apache/calcite/tree/calcite-1.30.0-rc3
>> 
>> The artifacts to be voted on are located here:
>> https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.30.0-rc3
>> (revision 53122)
>> 
>> The hashes of the artifacts are as follows:
>> 
>> 6c9f65bcc75a05c226dbb0a74495e76761d738f6923086e2943fface1b8864697dff6d0de7aa0cb9b06f4f6aebe1322b5ecb829428f84d08c51dfb5773034040
>> *apache-calcite-1.30.0-src.tar.gz
>> 
>> A staged Maven repository is available for review at:
>> 
>> https://repository.apache.org/content/repositories/orgapachecalcite-1153/org/apache/calcite/
>> 
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/COMMITTER_ID.asc
>> https://www.apache.org/dist/calcite/KEYS
>> 
>> To create the jars and test Apache Calcite: "gradle build"
>> (requires an appropriate Gradle/JDK installation)
>> 
>> Please vote on releasing this package as Apache Calcite 1.30.0.
>> 
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 PMC votes are cast.
>> 
>> [ ] +1 Release this package as Apache Calcite 1.30.0
>> [ ]  0 I don't feel strongly about it, but I'm okay with the release
>> [ ] -1 Do not release this package because...
>> 
>> Here is my vote:
>> 
>> +1 (non-binding)
>> 
>> Best,
>> Liya Fan
>> 



Re: Context-aware/stateful SQL operator functions? (IE adding a stack for number of times an operator has been called in query)

2022-03-18 Thread Julian Hyde
I think you’re proposing making the JSON_ functions smarter at runtime. My 
general philosophy is to have the smarts at prepare time and make the runtime 
operators dumb. I think that philosophy can be applied here. Some extra logic 
would kick in when preparing a query that has JSON_ functions, and that logic 
would fold together multiple nested JSON_ functions. Perhaps we need to add a 
new (internal) JSON_ function that can do all of the steps.

Julian


> On Mar 18, 2022, at 8:50 AM, Gavin Ray  wrote:
> 
> Sorry to beat a dead horse here, but I'm one of those weirdos that gets a
> lot of use out of Calcite's JSON operators.
> Calcite's JSON implementation is broken for queries that have more than one
> depth of JSON object/array calls.
> 
> The reason is because the operator calls "jsonize()", which parses the
> (presumably) JVM object as a JSON string
> This works for something like Map, but if you have
> Map>, what happens is this:
> 
> JSON_OBJECT(
>  foo: 1,
>  bar: JSON_OBJECT(
> qux: 2
>  )
> )
> 
> The parse happens inside-out, so first we get the innermost object parsed,
> which gives:
> 
> { "qux": 2 }
> 
> But -- as a string! This is important!
> Now, when we parse the next object, we get this:
> 
> { "foo": 1, "bar": \"{ \"qux\": 2 }\" }
> 
> This is because the object with "qux" isn't an object, but a string value
> Which to be valid JSON, needs to have its quotes and braces escaped
> 
> Definitely not what you want, and the value isn't usable =(
> 
> Since there is no state/context/stack (that I can tell) when the parse
> function is called,
> how might it be possible to write something to the effect of:
> 
> "Analyze the query, and if the number of JSON operations is greater than
> one,
> only call 'jsonize()' on the outer-most parse/object."
> 
> https://github.com/apache/calcite/blob/8d21c3f2f0b75d788e70bbeea9746695f2fde552/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1886-L1891
> https://github.com/apache/calcite/blob/4bc916619fd286b2c0cc4d5c653c96a68801d74e/core/src/main/java/org/apache/calcite/runtime/JsonFunctions.java#L93-L95



Re: Setting the "defaultSchema" of a live CalciteConnection?

2022-03-15 Thread Julian Hyde
An ancestor project of Calcite, Eigenbase (aka Farrago) used to have a SET 
SCHEMA command. It would be nice if we added that back. (Maybe only in the 
“server” module, since in “core” the connections are stateless.)

I’m a bit surprised that "CalciteConnection.setSchema(String)" doesn’t work. 
Put a breakpoint in the constructor of CalciteCatalogReader and see what is the 
value of schemaPaths. That is effectively the default schema for purposes of 
validating SQL statements.

Julian







> On Mar 15, 2022, at 11:49 AM, Gavin Ray  wrote:
> 
> I have a scenario in which I have a rootSchema that holds multiple
> datasource schemas
> There's a tool I'd like to integrate it with, but it can't handle  (catalog
> -> schema -> schema) structure
> 
> I figured an easy way around this would be to iterate the names of the
> datasource schemas, and then
> and then make a clone of the current CalciteConnection with the
> defaultSchema set to 
> 
> I can't seem to figure out how to do this, though.
> 
> Calling .setSchema() doesn't seem to work (probably no impact on the JDBC
> metadata)
> and if I create a new connection with the "schema" property set, then it's
> a fresh/empty Calcite source.
> 
> Is there any way to do this? Or some craft hack?
> 
> Thank you =)



Re: [RESULT] [VOTE] Release apache-calcite-1.30.0 (release candidate 2)

2022-03-15 Thread Julian Hyde
Liya,

The PR has a couple of LGTMs, and it has passed my internal tests. I think you 
should cherry-pick it into master, revise release notes, and start a new RC. 
(Will require a small force-push.)

Julian


On 2022/03/15 05:50:30 Fan Liya wrote:
> Hi Julian,
> 
> Thanks for your feedback and your PR.
> 
> Best,
> Liya Fan
> 
> Julian Hyde  于2022年3月15日周二 12:42写道:
> 
> > I have a PR for [CALCITE-5040]. Please review
> > https://github.com/apache/calcite/pull/2744 <
> > https://github.com/apache/calcite/pull/2744>.
> >
> > Liya, Sorry I didn’t find time to vote on this RC. I reviewed the release
> > notes and they look good.
> >
> > > On Mar 14, 2022, at 8:24 PM, Fan Liya  wrote:
> > >
> > > Thanks to everyone who has tested the release candidate and given
> > > their comments and votes.
> > >
> > > The tally is as follows.
> > >
> > > 2 binding -1s:
> > > Ruben,
> > > Francis
> > >
> > > 2 non-binding +1s:
> > > Liya,
> > > Enrico
> > >
> > > 1 binding 0:
> > > Stamatis
> > >
> > > Therefore, I think this release candidate should be rejected, until
> > > CALCITE-5040 is resolved.
> > >
> > > Later, I will revert the release note commit
> > > (170035fd97df1afdd0c0a499f63e0ca1606f7837,
> > > maybe through git push force to master), and prepare another one after
> > the
> > > patch for
> > > CALCITE-5040 is merged (if any).
> > >
> > > Thanks everyone.
> > >
> > > Liya Fan
> >
> >
> 


Re: [RESULT] [VOTE] Release apache-calcite-1.30.0 (release candidate 2)

2022-03-14 Thread Julian Hyde
I have a PR for [CALCITE-5040]. Please review 
https://github.com/apache/calcite/pull/2744 
.

Liya, Sorry I didn’t find time to vote on this RC. I reviewed the release notes 
and they look good.

> On Mar 14, 2022, at 8:24 PM, Fan Liya  wrote:
> 
> Thanks to everyone who has tested the release candidate and given
> their comments and votes.
> 
> The tally is as follows.
> 
> 2 binding -1s:
> Ruben,
> Francis
> 
> 2 non-binding +1s:
> Liya,
> Enrico
> 
> 1 binding 0:
> Stamatis
> 
> Therefore, I think this release candidate should be rejected, until
> CALCITE-5040 is resolved.
> 
> Later, I will revert the release note commit
> (170035fd97df1afdd0c0a499f63e0ca1606f7837,
> maybe through git push force to master), and prepare another one after the
> patch for
> CALCITE-5040 is merged (if any).
> 
> Thanks everyone.
> 
> Liya Fan



Re: applying the wrong rule for calculated columns

2022-03-14 Thread Julian Hyde
I thought for a minute this case was talking about computed columns. So let me 
clarify terminology.

These aren’t "calculated columns" in the sense of ColumnStrategy [1], which has 
values STORED (column’s value is computed using an expression and stored) and 
VIRTUAL (column's value is not stored, but is computed when it is used). 

See https://issues.apache.org/jira/browse/CALCITE-707 
 for more details.

Julian

[1] 
https://calcite.apache.org/javadocAggregate/org/apache/calcite/schema/ColumnStrategy.html

> On Mar 13, 2022, at 7:21 PM, xiaobo  wrote:
> 
> @Override
> 
>public void onMatch(RelOptRuleCall call) {
> 
> LogicalProject project = call.rel(0);
> 
> JsonTableScan scan = call.rel(1);
> 
> Integer[] fields = 
> getProjectFields(project.getProjects());
> 
> if(fields == null || fields.length == 0)
> 
>   return ;
> 
> for(int i = 0; i < fields.length; i++) {
> 
>   if(fields[i] == null)
> 
>   return;
> 
> }
> 
> 
> 
> 
> call.transformTo(
> 
>  new JsonTableScan(scan.getCluster(), 
> scan.getTable(), fields)
> 
> );
> 
>}
> 
> 
> 
> 
> the onMatch method should check fields data, if they are not correct, 
> then should not call call.transformTo.
> 
> 
> 
> 
> test1_2 now passes.
> 
> 
> 
> 
> 
> --Original--
> From: 
>"xiaobo "  
>   
>  Date:Mon, Mar 14, 2022 10:03 AM
> To:"dev" 
> Subject:re:applying the wrong rule for calculated columns
> 
> 
> 
> after we change the String[] fields to Integer[] fields , the problem still 
> occurs.
> 
> 
> 
> our runnable test is at 
> https://github.com/guxiaobo/calcite-json-adapter/blob/main/src/test/java/org/apache/calcite/adapter/json/test/JsonSchemaTest.java
> 
> ---
> @Test
> public void test1_2()throws SQLException{
> Map Schema schema = new JsonSchemalt;gt;("js", map);
> CalciteConnection conn = this.openConn1(schema, "js");
> System.out.println("c1 + c11 = " + exeGetLong(conn, "select c1 + c11 from 
> js.t1"));
> 
> conn.close();
> 
> }
> ---
> 
> java.sql.SQLException: Error while executing SQL "select c1 + c11 from 
> js.t1": Error while applying rule JsonProjectTableScanRule, args 
> [rel#11:LogicalProject.NONE.[](input=RelSubset#10,exprs=[+($9, $3)]), 
> rel#1:JsonTableScan.ENUMERABLE.[](table=[js, t1])]
> at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
> at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
> at 
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163)
> at 
> org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
> at org.apache.calcite.adapter.json.test.BaseTest.exeGetLong(BaseTest.java:37)
> at 
> org.apache.calcite.adapter.json.test.JsonSchemaTest.test1_2(JsonSchemaTest.java:76)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> at 
> org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:93)
> at 
> org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:40)
> at 
> 

Re: Broken history after force pushing to master

2022-03-09 Thread Julian Hyde
Yes, the ‘git rebase’ command works in mysterious ways. If it sees commits that 
have the same content it will ignore them, even if they have different hashes. 
Since you want to preserve the hashes in the master branch, the direction that 
you rebase (e.g. from master to site, versus from site to master) is crucial.

As RM, you occasionally have to force-push. But force-pushes can be scary. It 
would be nice if there were some guidelines for when force-push  is necessary, 
and when it is not going to do much damage (e.g. is only going to change 1 
commit). 

Julian


> On Mar 9, 2022, at 4:19 PM, Fan Liya  wrote:
> 
> Hi Ruben and Stamatis,
> 
> Thanks a lot for your kind help. I should have sent an email asking for
> help when I encountered the problem.
> I must be careful in the future to never repeat such mistake again.
> 
> In step 2, I found the following commits in site branch but not in master
> branch (through git log master..site):
> 
> commit d1324718fae633584a6c5f8e8f6238a0f851cbac
> Author: Ruben Quesada Lopez 
> Date:   Sat Jan 29 19:04:14 2022 +
> 
>Site: update PMC Chair
> 
> commit fabef057c536d56e10530b399543077abad03a24
> Author: Jing Zhang 
> Date:   Thu Jan 27 17:01:14 2022 +0800
> 
>Site: Add external resources section in the community page
> 
>Close apache/calcite#2703
> 
> commit 48f4bf8596ebfa0f7460ce9358d30028f268cb8e
> Author: Eugen Stan 
> Date:   Fri Jan 28 12:02:09 2022 +0200
> 
>Site: Add "calcite-clj - Use Calcite with Clojure" in talks section
> 
>Close apache/calcite#2704
> 
> commit d088cde8d74b9b9b0b157c2f5efe83fcedc6ffef
> Author: Alessandro Solimando 
> Date:   Mon Jan 10 11:48:29 2022 +0100
> 
>Site: Add Alessandro Solimando as committer
> 
> commit 664c4d3f9b0ec03071df6fef44b32169a275ea49
> Author: mans2singh 
> Date:   Wed Jan 5 19:41:56 2022 -0500
> 
>Site: Fix typo in howto.md
> 
> commit dd34953aa0b5af13e7e7fccd75b2f2e8fe9c8edd
> Author: nobigo 
> Date:   Fri Dec 31 21:28:20 2021 +0800
> 
>Site: Change the javadoc title to Apache Calcite API
> 
> commit d29aa09321ab14bd824e5465ddcfae8d16604134
> Author: Guiyanakuang 
> Date:   Wed Jan 5 10:12:03 2022 +0800
> 
>Site: For tables that display results, center the content horizontally
> 
>Close apache/calcite#2632
> 
> commit da4fc3be015977fb02bf021583a06a5f58278b66
> Author: Guiyanakuang 
> Date:   Fri Dec 10 11:27:59 2021 +0800
> 
>Site: Add syntax highlighting to SQL statements
> 
>Close apache/calcite#2632
> 
> commit cee9f670818116387810c5f89035975aa2694658
> Author: Guiyanakuang 
> Date:   Sun Dec 5 10:58:32 2021 +0800
> 
>Site: Improve HTML tables display & update CSV tutorial
> 
>1. Allow code pre-wrap in tables.
>2. Display horizontal scrollbar in tables when content is too large
>and cannot be wrapped.
>3. Update CSV tutorial example based on current code.
> 
>Close apache/calcite#2632
> 
> Later, I found that these changes were also in the master branch (with
> different commit hashes). However, since they entered the master branch in
> a different order, they got different git commit hashes, and git recognized
> them as different commits. So my operations above in effect changed the
> order of some commits (and also their commit hashes).
> 
> Stamatis, if you will help restore the master branch, I think I can help
> verify the operation, because I have a local backup.
> 
> When things are done, I think I will start another thread to discuss the
> problem, to avoid such things happening again in the future.
> 
> Thanks again for your kind help.
> 
> Best,
> Liya Fan
> 
> 
> 
> Julian Hyde  于2022年3月10日周四 08:00写道:
> 
>> That sounds good. I concur that "[CALCITE-5019] Avoid multiple scans when
>> table is ProjectableFilterableTable and projections and filters act on
>> different columns” is the last good commit and that its hash should be
>> dcbc493bf699d961427952c5efc047b76d859096.
>> 
>> Go ahead when you’re ready.
>> 
>> Optionally, when you’re done with that, perform the steps to get the
>> ‘site' branch in sync with the ‘master' branch. That will make it easier
>> for us to get this release done without any further mishaps.
>> 
>> Julian
>> 
>> 
>>> On Mar 9, 2022, at 1:00 PM, Stamatis Zampetakis 
>> wrote:
>>> 
>>> It's not the first time that we have had small problems with history so
>> no
>>> worries.
>>> Thankfully with the help of commit@calcite list we can always find a
>> way to
>>> fix things as long as we identify the problem soon enough.
>>>

Re: [VOTE] Release Apache Calcite 1.30.0 (release candidate 1)

2022-03-09 Thread Julian Hyde
Liya,

I made much the same comments as Ruben regarding the release notes.

Though we don’t usually use PRs for releases, I think it would be useful if we 
had a PR in which to review the release notes. It seems better than 
bike-shedding them during the actual release vote. If you agree, can you post 
that PR?

Julian


> On Mar 9, 2022, at 2:59 AM, Ruben Q L  wrote:
> 
> Thanks Liya Fan for being the release manager.
> 
> First of all, for the record regarding the RC itself:
> - Checksum and signature: ok
> - Gradle test: ok
> - Calcite-based application test suite: ok
> 
> IMO release notes would need some adjustments:
> - Traditionally, at the beginning of the release note, before the
> "Compatibility" paragraph, there should be another paragraph describing the
> highlights of the release (see previous releases as an example).
> - If I am not mistaken, there is a CVE to be mentioned in the release note,
> which does not seem to be there.
> - I would suggest to keep the traditional sections (see previous
> releases): New features; Bug-fixes, API changes and minor
> enhancements; Build and test suite; Dependency version upgrade; Web site
> and documentation
> - As Xiong said, bug fixes should not begin with "Fix..."; instead, the
> commit message (or Jira title) should be used (this should be the norm for
> all the tickets in the release note, not just the bug fixes).
> - This particular item "[CALCITE-3673] Support ListTransientTable without
> tables in the schema" is not a "New feature", but rather a "Minor
> enhancement" or "Bug fix". Its title in the release note is also
> misleading. I would suggest to keep the original Jira title / commit
> message: "ListTransientTable should not leave tables in the schema".
> - Apart from that, CALCITE-4976 mentions some modification to be made in
> howto.md.
> 
> Having said that, as Stamatis mentioned, I had also noticed something
> strange on the repository while I was checking some of tickets that I had
> resolved (the fact that the commit which resolved the issue does not belong
> to any repository any more). As Stamatis suggests, I think we should put on
> hold the RC process until we figure out what happened and how to handle the
> situation.
> 
> Best,
> Ruben
> 
> 
> On Wed, Mar 9, 2022 at 10:13 AM Stamatis Zampetakis 
> wrote:
> 
>> Thanks for preparing the RC Liya!
>> 
>> The repository seems to be in some weird state (see [1]) so I don't feel
>> comfortable getting a release out till we understand what exactly has
>> happened.
>> Let's continue the discussion under the respective thread.
>> 
>> -1 (binding)
>> 
>> Stamatis
>> 
>> [1] https://lists.apache.org/thread/pw7q8tqn78rwq0sfxh6o2rxvwrjgjlkq
>> 
>> On Mon, Mar 7, 2022 at 12:27 PM Enrico Olivelli 
>> wrote:
>> 
>>> +1 (non binding)
>>> 
>>> Run all tests of HerdDB, all tests are passing without any code
>>> change. (https://github.com/diennea/herddb/pull/779)
>>> 
>>> I would like to note that previous tests of HerdDB did not pass with
>>> 1.29.0 (and we were stuck to 1.28.0). With 1.29.0 tests on Github
>>> actions hang during some classloading of some Calcite classes
>>> 
>>> thank you Liya
>>> 
>>> Enrico
>>> 
>>> Il giorno lun 7 mar 2022 alle ore 09:20 Fan Liya
>>>  ha scritto:
 
 Hi Xiong,
 
 Thanks for your feedback. I will change the logs in a follow up MR.
 
 Best,
 Liya Fan
 
 
 xiong duan  于2022年3月7日周一 15:30写道:
 
> Hi, Liya Fan.Thanks for your work.  +1.
> Just a little problem:
> I noticed the Bug-fixes logs all started with FixI think maybe we
> should stay the same as JIRA summary or PR commit?
> 
> Fan Liya  于2022年3月7日周一 14:55写道:
> 
>> Hi all,
>> 
>> I have created a build for Apache Calcite 1.30.0, release
>> candidate 1.
>> 
>> Thanks to everyone who has contributed to this release.
>> 
>> You can read the release notes here:
>> 
>> 
> 
>>> 
>> https://github.com/apache/calcite/blob/calcite-1.30.0-rc1/site/_docs/history.md
>> 
>> The commit to be voted upon:
>> 
>> 
> 
>>> 
>> https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=f14cf4c32b9079984a988bbad40230aa6a59b127
>> 
>> Its hash is f14cf4c32b9079984a988bbad40230aa6a59b127
>> 
>> Tag:
>> https://github.com/apache/calcite/tree/calcite-1.30.0-rc1
>> 
>> The artifacts to be voted on are located here:
>> 
>>> https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.30.0-rc1
>> (revision 52897)
>> 
>> The hashes of the artifacts are as follows:
>> 
>> 
> 
>>> 
>> 25527b5dfd3c28d4ac3c9d9a40b94dbcbce7feb17cc198ebe7c36a30c2df69a27ee5e4defab4edff1e1bc65c076bd0725bd8c378913d1d23f3d80f732e3e097f
>> *apache-calcite-1.30.0-src.tar.gz
>> 
>> A staged Maven repository is available for review at:
>> 
>> 
> 
>>> 
>> https://repository.apache.org/content/repositories/orgapachecalcite-1150/org/apache/calcite/

Re: Broken history after force pushing to master

2022-03-09 Thread Julian Hyde
That sounds good. I concur that "[CALCITE-5019] Avoid multiple scans when table 
is ProjectableFilterableTable and projections and filters act on different 
columns” is the last good commit and that its hash should be 
dcbc493bf699d961427952c5efc047b76d859096.

Go ahead when you’re ready.

Optionally, when you’re done with that, perform the steps to get the ‘site' 
branch in sync with the ‘master' branch. That will make it easier for us to get 
this release done without any further mishaps.

Julian


> On Mar 9, 2022, at 1:00 PM, Stamatis Zampetakis  wrote:
> 
> It's not the first time that we have had small problems with history so no
> worries.
> Thankfully with the help of commit@calcite list we can always find a way to
> fix things as long as we identify the problem soon enough.
> 
> According to the change log [1] the last commit before force pushing was
> (dcbc493), which corresponds to CALCITE-5019.
> 
> * -- * -- B -- O -- O -- O (dcbc493)
> \
> N -- N -- N refs/heads/master (c3dbf52)
> 
> According to [2] the full commit id
> is dcbc493bf699d961427952c5efc047b76d859096.
> 
> In order to restore the master branch in the state that it was before the
> force-push (before release) I plan to do the following steps:
> 
> git fetch origin dcbc493bf699d961427952c5efc047b76d859096
> git checkout dcbc493bf699d961427952c5efc047b76d859096
> git branch -D master
> git switch -c master
> git push origin master -f
> 
> I will apply the above sequence in 12h from now to give some time to others
> to react if necessary.
> 
> Obviously this will nuke out any current release candidate so we will need
> to cancel existing votes and create an RC2.
> 
> There has been a force push also to the site branch but doesn't matter much
> since we can force push master to site after the release is finalized.
> 
> Best,
> Stamatis
> 
> [1] https://lists.apache.org/thread/gkvn5hlmm3jlcklgw9k9nodyhxvqmsw4
> [2] https://lists.apache.org/thread/rvngk5tygfoyoc0klhwpo717mrngkdrw
> 
> 
> On Wed, Mar 9, 2022 at 6:44 PM Ruben Q L  wrote:
> 
>> Hello Liya,
>> 
>> No worries, we all make mistakes.
>> I think the sequence of steps that you describe looks like a plausible
>> explanation for how we get into this situation. Do you know (from step 2)
>> which commits were in site branch that were not in master?
>> If in the future you (or anybody else) get blocked or experience any
>> problem on a certain step during the release process, do not hesitate to
>> send an email to the dev list with subject "[HELP] ..." describing the
>> issue. In my experience, someone from the community will assist relatively
>> fast.
>> 
>> Any git expert with a clear idea on how to restore the master branch?
>> 
>> Best,
>> Ruben
>> 
>> 
>> On Wed, Mar 9, 2022 at 1:32 PM Fan Liya  wrote:
>> 
>>> Hi all,
>>> 
>>> I think the broken history was caused by this:
>>> 
>>> 1. In document "Making a release candidate [1]", it says "Make sure
>> master
>>> branch and site branch are in sync".
>>> 2. I checked the two branches, and find they have diverged. Some commits
>> in
>>> the site branch are not in the master branch.
>>> 3. I tried the method given in the document "git reset --hard site", but
>> it
>>> didn't work.
>>> 3. I tried to cherry-pick the commits to master, but it required
>> resolving
>>> conflicts, because the committing order was not correct.
>>> 4. So I used "git rebase -i" to insert the commits into the "right" place
>>> of the master branch.
>>> 5. Finally, I pushed the result to the original master branch.
>>> 
>>> I think that is the reason for the broken history. Really sorry for the
>>> trouble.
>>> If needed, I can restore the original master branch. I have backed up the
>>> branch.
>>> 
>>> Best,
>>> Liya Fan
>>> 
>>> [1]
>> https://calcite.apache.org/docs/howto.html#making-a-release-candidate
>>> 
>>> xiong duan  于2022年3月9日周三 19:35写道:
>>> 
 Hi. Stamatis. I agree we need to address this issue first.
 I find some relative descriptions at end of the email
 https://lists.apache.org/thread/gkvn5hlmm3jlcklgw9k9nodyhxvqmsw4. So
>> it
>>> is
 a force push. Sorry I am not very good at Github job flow. But I think
>> it
 describes what happened according to the appearances. So I hope this
>> can
 help.
 
 This update added new revisions after undoing existing revisions. That
>> is
 to say, some revisions that were in the old version of the branch are
>> not
 in the new version. This situation occurs when a user --force pushes a
 change and generates a repository containing something like this: * --
>> *
>>> --
 B -- O -- O -- O (dcbc493) \ N -- N -- N refs/heads/master (c3dbf52)
>> You
 should already have received notification emails for all of the O
 revisions, and so the following emails describe only the N revisions
>> from
 the common base, B. Any revisions marked "omit" are not gone; other
 references still refer to them. Any revisions marked "discard" are gone
 forever. The 41 

[jira] [Created] (CALCITE-5037) Upgrade HSQLDB to 2.5.1

2022-03-09 Thread Julian Hyde (Jira)
Julian Hyde created CALCITE-5037:


 Summary: Upgrade HSQLDB to 2.5.1
 Key: CALCITE-5037
 URL: https://issues.apache.org/jira/browse/CALCITE-5037
 Project: Calcite
  Issue Type: Bug
Reporter: Julian Hyde


Upgrade HSQLDB to 2.5.1 (the last version that also supports Java 8) and allow 
people to run using HSQLDB 2.6.1 (the most recent version, which requires Java 
11 or higher).

We cannot currently do this because we depend on scott-data-hsqldb:0.1, 
foodmart-data-hsqldb:0.4 and chinook-data-hsqldb:0.1, and these have embedded 
data files in HSQLDB 1.8 format, which HSQLDB 2.5.1 no longer supports.

Per release [release notes|http://hsqldb.org/doc/2.0/changelist_2_0.txt]:

{quote}
21 March 2021 - version 2.6.0
...
-- version 2.6.0 jar requires JRE 11 or later - tested up to Java 16 RC
-- version 2.6.0 alternative jar requires JRE 8 or later
...

24 June 2020 - version 2.5.1
version 2.5.1 jar requires JRE 8 or later - tested up to Java 14
... 
removed legacy support for opening version 1.8 databases
{quote}

I have released scott-data-hsqldb:0.2 and foodmart-data-hsqldb:0.5, and when we 
have chinook-data-hsqldb:0.2 we can upgrade to them and should be able to go to 
HSQLDB 2.5.1 or 2.6 (on Java 11, or on Java 8 with the alternative jar).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: Bugs dropped on the floor

2022-03-07 Thread Julian Hyde
Especially https://issues.apache.org/jira/browse/CALCITE-4976 
<https://issues.apache.org/jira/browse/CALCITE-4976> whose title is literally 
“Release Calcite 1.30.0”, no longer has fix-version 1.30. Good grief.


> On Mar 7, 2022, at 2:22 PM, Julian Hyde  wrote:
> 
> There were a few bugs that had fix-version = 1.30 because they had PRs that 
> looked like they were going to make it. Liya has removed the fix version.
> 
> https://issues.apache.org/jira/browse/CALCITE-4908 
> <https://issues.apache.org/jira/browse/CALCITE-4908>
> https://issues.apache.org/jira/browse/CALCITE-2552 
> <https://issues.apache.org/jira/browse/CALCITE-2552> 
> https://issues.apache.org/jira/browse/CALCITE-4987 
> <https://issues.apache.org/jira/browse/CALCITE-4987> 
> https://issues.apache.org/jira/browse/CALCITE-4913 
> <https://issues.apache.org/jira/browse/CALCITE-4913> 
> 
> Is that where we’re going to leave it?
> 
> It breaks my heart that there we are ignoring good PRs.
> 
> Julian
> 



Bugs dropped on the floor

2022-03-07 Thread Julian Hyde
There were a few bugs that had fix-version = 1.30 because they had PRs that 
looked like they were going to make it. Liya has removed the fix version.

https://issues.apache.org/jira/browse/CALCITE-4908 

https://issues.apache.org/jira/browse/CALCITE-2552 
 
https://issues.apache.org/jira/browse/CALCITE-4987 
 
https://issues.apache.org/jira/browse/CALCITE-4913 
 

Is that where we’re going to leave it?

It breaks my heart that there we are ignoring good PRs.

Julian



Re: [VOTE] Release Apache Calcite 1.30.0 (release candidate 0)

2022-03-07 Thread Julian Hyde
Liya, in the next iteration can you go back to the usual sections e.g 
“Bug-fixes, API changes and minor enhancements”. Also remove the “Fix” prefix 
from messages.

Quite a few cases, e.g “ Fix the problem that JDBC adapter incorrectly adds 
ORDER BY columns to the SELECT list of generated SQL query” are in the gray 
area between bug and missing feature, and using the original message helps 
people make their own decision. 

I also think there’s value in putting changes to build and test in a separate 
section, as we used to do. But please move my “Fluent test fixtures” into the 
“new features” section - because it is a major new feature. 
  
Julian

> On Mar 6, 2022, at 10:31 PM, Francis Chuang  wrote:
> 
> Thanks so much Liya! Please cancel this vote by sending a [CANCEL] message 
> as well, to formalize the cancellation of this vote.
> 
> Francis
> 
>> On 7/03/2022 5:27 pm, Fan Liya wrote:
>> After some investigation, I could not resolve the 404 issue of the commit
>> link (In my computer, it is not 404, but an error with XML format).
>> So I have created a new build, which has resolved all the above issues
>> (according to the test on my local computer).
>> Let's start another vote based on the new build. Thanks.
>> Best,
>> Liya Fan
>> Fan Liya  于2022年3月7日周一 09:53写道:
>>> Thanks for your feedback, Francis.
>>> I will check.
>>> 
>>> Best,
>>> Liya Fan
>>> 
>>> 
>>> Francis Chuang  于2022年3月7日周一 09:36写道:
>>> 
 Hey Liya,
 
 Thanks for being RM for this release.
 
 Can you check the commit to be voted upon?
 
 https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=28aacc0bbccfbcd821fece72ae3d92f6c1db8a5e
 returns a 404
 
 Also, tag calcite-1.30.0-rc0 points to a39b9bf
 (https://github.com/apache/calcite/releases/tag/calcite-1.30.0-rc0),
 which does not seem to be in the Calcite repository, see:
 
 https://github.com/apache/calcite/commit/a39b9bf8b1cbab8f4462c41f6b4506753f859f96
 
 Thanks!
 Francis
 
 On 5/03/2022 8:56 pm, Fan Liya wrote:
> Hi all,
> 
> I have created a build for Apache Calcite 1.30.0, release
> candidate 0.
> 
> Thanks to everyone who has contributed to this release.
> 
> You can read the release notes here:
> 
 https://github.com/apache/calcite/blob/calcite-1.30.0-rc0/site/_docs/history.md
> 
> The commit to be voted upon:
> 
 https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=28aacc0bbccfbcd821fece72ae3d92f6c1db8a5e
> 
> Its hash is 28aacc0bbccfbcd821fece72ae3d92f6c1db8a5e
> 
> Tag:
> https://github.com/apache/calcite/tree/calcite-1.30.0-rc0
> 
> The artifacts to be voted on are located here:
> 
 https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.30.0-rc0
> (revision 52873)
> 
> The hashes of the artifacts are as follows:
> 
 339b279426abfc09a73b0daf94479127aa469b1cd3fa1b1cf0842f57d64daf0e080d4e85ccdd2d32032ed424a39388790743adf2aec869480612e37003eabfc5
> *apache-calcite-1.30.0-src.tar.gz
> 
> A staged Maven repository is available for review at:
> 
 https://repository.apache.org/content/repositories/orgapachecalcite-1148/org/apache/calcite/
> 
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/COMMITTER_ID.asc
> https://www.apache.org/dist/calcite/KEYS
> 
> To create the jars and test Apache Calcite: "gradle build"
> (requires an appropriate Gradle/JDK installation)
> 
> Please vote on releasing this package as Apache Calcite 1.30.0.
> 
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Calcite 1.30.0
> [ ]  0 I don't feel strongly about it, but I'm okay with the release
> [ ] -1 Do not release this package because...
> 
> Here is my vote:
> 
> +1 (binding)
> 
> Best,
> Liya Fan
> 
 
>>> 


Re: Could calcite support SortProjectPullUpConstantsRule?

2022-03-07 Thread Julian Hyde
Sounds reasonable. Please log a jira. 

Consider implementing this rule by inspecting predicates or unique keys, rather 
than by matching a Project or Filter. If you can ascertain that a sort column 
has only one value (counting NULL as a value for these purposes) then you can 
remove it from the sort. 

I find that using metadata in this way is very effective. You get the desired 
effect without moving relational operators around. 

Julian

> On Mar 7, 2022, at 1:01 AM, 徐仁和  wrote:
> 
> Hi community:
> 
> Could calcite support the rule to pull up constant project from sort, just
> like `AggregateProjectPullUpConstantsRule`?
> 
> -- sql
> select pay_amount, pay_id, user_id
> from pay_tbl
> where pay_id = 1234
> group by pay_amount, pay_id, user_id
> order by pay_amount, pay_id, user_id
> 
> -- origin rel tree
> LogicalSort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC],
> dir2=[ASC])
>  LogicalProject(pay_amount=[$0], pay_id=[1234], user_id=[$1])
>LogicalAggregate(group=[{0, 1}])
>  LogicalProject(pay_amount=[$1], user_id=[$3])
>LogicalFilter(condition=[=($0, 1234)])
>  LogicalTableScan(table=[[default, pay_tbl]])
> 
> -- optimized rel tree
> LogicalProject(pay_amount=[$0], pay_id=[1234], user_id=[$1])
>  LogicalSort(sort0=[$0], sort2=[$1], dir0=[ASC], dir2=[ASC])
>LogicalProject(pay_amount=[$0], user_id=[$1])
>  LogicalAggregate(group=[{0, 1}])
>LogicalProject(pay_amount=[$1], user_id=[$3])
>  LogicalFilter(condition=[=($0, 1234)])
>LogicalTableScan(table=[[default, pay_tbl]])


Re: [Discuss] The RexSimplify about "a <> 1 and a = 2"

2022-03-03 Thread Julian Hyde
Note that https://issues.apache.org/jira/browse/CALCITE-1794 
 deals with very similar 
expressions (albeit with CAST, and optimized using SARGs rather than 
RexSimplify) and was just fixed today.

> On Mar 2, 2022, at 4:52 PM, Benchao Li  wrote:
> 
> Hi xiong,
> 
> For the simplification result below:
> # "a <> 1 or a = 1 [a is nullable]"
> 1. UNKNOWN AS UNKNOWN: OR(IS NOT NULL(a), null)
> 2. UNKNOWN AS TRUE: TRUE
> 3. UNKNOWN AS FALSE: IS NOT NULL(a)
> 
> # "a<>1 and a = 2 [a is nullable]"
> 1. UNKNOWN AS UNKNOWN: =(a, '2')
> 2. UNKNOWN AS TRUE: =(a, '2')
> 3. UNKNOWN AS FALSE: =(a, '2')
> 
> IMHO, this is expected and correct.
> For the first case, the result can be simplified to:
> if a is null:
>  null
> else:
>  true
> then, for each unknown value, we can evaluate to the results above
> correspondingly.
> 
> For the second case, the result is also correct, we can simplify `a<>1 and
> a = 2` to `a = 2` regardless of nullability of a,
> and `=(a, '2')` is the simplest form.
> 
> xiong duan  于2022年2月28日周一 10:26写道:
> 
>> Hi followers.
>>When I learned how to simplify the Rexnode in Calcite. I meet a problem
>> and need some help.
>> For example:
>> When we simplify  "a <> 1 or a = 1 [a is nullable]", (the result is
>> different according to how to treat the UNKNOW value) :
>> 
>> checkSimplify3_(or(ne(vInt(0), literal(1)),eq(vInt(0),
>> literal(1))),"OR(IS NOT NULL(?0.int0), null)","IS NOT
>> NULL(?0.int0)","true");
>> 
>> But when I try to simplify "a<>1 and a = 2 [a is nullable]":
>> 
>>checkSimplify3_(
>>and(ne(vInt(0), literal(1)),
>>eq(vInt(0), literal(2))),
>>"=(?0.int0, 2)",
>>"=(?0.int0, 2)",
>>"=(?0.int0, 2)");
>> 
>> And I think this result should be:
>> "a<>1 and a = 2 [a is nullable]":
>> 
>> UNKNOW AS TRUE: a IS NULL or a = 2
>> UNKNOW AS FALSE: a is NOT NULL and a =2
>> UNKNOW AS UNKNOW: CASE(IS NULL(a), null, a = 2)
>> 
>> So What did I misunderstand? It is a bug or intentional? Any feedback would
>> be appreciated.
>> 
> 
> 
> -- 
> 
> Best,
> Benchao Li



Re: calcite multi-threading problem

2022-02-26 Thread Julian Hyde
To repeat, it would help if there were a Jira case. I don’t want to debug 
problems with janino 3.1.6 unless I know why we’re upgrading to 3.1.6. Also, we 
have a duty to reduce volume on the dev list.

> On Feb 25, 2022, at 7:19 PM, xiaobo  wrote:
> 
> Hi Gavin and Julian,
> Thanks for your help, we know that upgrading janino and common compiler to 
> 3.1.6 will fix the "operand stack underflow" error, but we have also have 
> problem with 3.1.6 of janino as mentioned before:
> --
> 
> we checked the janino release note page
> http://janino-compiler.github.io/janino/changelog.html
> 
> Fixed issue #141: Unable to find org.codehaus.commons.compiler.properties in 
> java 11: ICompilerFactories were loaded through the current thread's "context 
> class loader", which is a bad choice in some environments. Thus, the methods 
> "getDefaultCompilerFactory()", "getAllCompilerFactories()" and 
> "getCompilerFactory(className)" were duplicated and augmented with a 
> "classLoader" parameter. 
> 
> what does this mean?  the getDefaultCompilerFactory method with a classloader 
> parameter was removed intentionally?
> but it seems calcite still needs it .
> 
> 
> 
> What we mean here is that we think calcite has a bug with janino 3.1.6 , are 
> we right?
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> -- Original --
> From:  "Julian Hyde";;
> Send time: Saturday, Feb 26, 2022 1:24 AM
> To: "dev"; 
> 
> Subject:  Re: calcite multi-threading problem
> 
> 
> 
> Should a jira case be logged for this issue? Or is there an existing case? If 
> the latter, should we add a stack trace to help people tie it to this problem?
> 
> A jira URL in this thread will help other people who run into this problem in 
> future. Right now it is just a rambling email thread with an off-topic 
> subject.
> 
> Julian
> 
>> On Feb 25, 2022, at 7:18 AM, Gavin Ray  wrote:
>> 
>> Also I just checked and this is updated on master so it should be solved in
>> next release FWIW:
>> 
>> https://github.com/apache/calcite/blob/cbbe5701b7f61d7f8df12d314ba5aabf898c1cae/gradle.properties#L116
>> 
>>> On Fri, Feb 25, 2022 at 10:16 AM Gavin Ray  wrote:
>>> 
>>> The fix for this is just to add janino and commons-compiler as exclusions
>>> in the calcite/avatica dependencies
>>> And then add the updated version so that it overwrites them
>>> 
>>> I tested this on your code and it worked
>>> 
>>> Also, please listen to Stamatis/Julian advice
>>> 
>>> 
>>> 
>>> org.apache.calcite
>>> calcite-core
>>> 1.29.0
>>> 
>>> 
>>> org.codehaus.janino
>>> janino
>>> 
>>> 
>>> org.codehaus.janino
>>> commons-compiler
>>> 
>>> 
>>> 
>>> 
>>> 
>>> org.apache.calcite.avatica
>>> avatica-core
>>> 1.20.0
>>> 
>>> 
>>> org.codehaus.janino
>>> janino
>>> 
>>> 
>>> org.codehaus.janino
>>> commons-compiler
>>> 
>>> 
>>> 
>>> 
>>> 
>>> org.apache.calcite
>>> calcite-file
>>> 1.29.0
>>> 
>>> 
>>> org.codehaus.janino
>>> janino
>>> 
>>> 
>>> org.codehaus.janino
>>> commons-compiler
>>> 
>>> 
>>> 
>>> 
>>> 
>>> org.apache.calcite
>>> calcite-example-csv
>>> 1.21.0
>>> 
>>> 
>>> org.codehaus.janino
>>> janino
>>> 
>>> 
>>> org.codehaus.janino
>>> commons-compiler
>>> 
>>> 
>>> 
>>> 
>>> 
>>> org.codehaus.janino
>>> janino
>>> 3.1.6
>>> 
>>> 
>>> 
>>> org.codehaus.janino
>>> commons-compiler
>>> 3.1.6
>>> 
>>> 
>>> 
>>> 
>>> On Fri, Feb 25, 2022 at 9:33 AM xiaobo 
>>> wrote:
>>> 
>>>> I have put the testing code at github
>>>> 
>>>> 
>>>> https://github.com/guxiaobo/gxb-testing/blob/main/calcite-testing/src/test/java/com/xsmartware/testing/calcite/Test1.java
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- Original --
>>>> From:  "xiaobo ";;
>>>> Send time: Friday, Feb 25, 2022 9:05 

Re: calcite multi-threading problem

2022-02-25 Thread Julian Hyde
Should a jira case be logged for this issue? Or is there an existing case? If 
the latter, should we add a stack trace to help people tie it to this problem?

A jira URL in this thread will help other people who run into this problem in 
future. Right now it is just a rambling email thread with an off-topic subject.

Julian

> On Feb 25, 2022, at 7:18 AM, Gavin Ray  wrote:
> 
> Also I just checked and this is updated on master so it should be solved in
> next release FWIW:
> 
> https://github.com/apache/calcite/blob/cbbe5701b7f61d7f8df12d314ba5aabf898c1cae/gradle.properties#L116
> 
>> On Fri, Feb 25, 2022 at 10:16 AM Gavin Ray  wrote:
>> 
>> The fix for this is just to add janino and commons-compiler as exclusions
>> in the calcite/avatica dependencies
>> And then add the updated version so that it overwrites them
>> 
>> I tested this on your code and it worked
>> 
>> Also, please listen to Stamatis/Julian advice
>> 
>> 
>> 
>> org.apache.calcite
>> calcite-core
>> 1.29.0
>> 
>> 
>> org.codehaus.janino
>> janino
>> 
>> 
>> org.codehaus.janino
>> commons-compiler
>> 
>> 
>> 
>> 
>> 
>> org.apache.calcite.avatica
>> avatica-core
>> 1.20.0
>> 
>> 
>> org.codehaus.janino
>> janino
>> 
>> 
>> org.codehaus.janino
>> commons-compiler
>> 
>> 
>> 
>> 
>> 
>> org.apache.calcite
>> calcite-file
>> 1.29.0
>> 
>> 
>> org.codehaus.janino
>> janino
>> 
>> 
>> org.codehaus.janino
>> commons-compiler
>> 
>> 
>> 
>> 
>> 
>> org.apache.calcite
>> calcite-example-csv
>> 1.21.0
>> 
>> 
>> org.codehaus.janino
>> janino
>> 
>> 
>> org.codehaus.janino
>> commons-compiler
>> 
>> 
>> 
>> 
>> 
>> org.codehaus.janino
>> janino
>> 3.1.6
>> 
>> 
>> 
>> org.codehaus.janino
>> commons-compiler
>> 3.1.6
>> 
>> 
>> 
>> 
>> On Fri, Feb 25, 2022 at 9:33 AM xiaobo 
>> wrote:
>> 
>>> I have put the testing code at github
>>> 
>>> 
>>> https://github.com/guxiaobo/gxb-testing/blob/main/calcite-testing/src/test/java/com/xsmartware/testing/calcite/Test1.java
>>> 
>>> 
>>> 
>>> 
>>> -- Original --
>>> From:  "xiaobo ";;
>>> Send time: Friday, Feb 25, 2022 9:05 PM
>>> To: "dev";
>>> 
>>> Subject:  Re:calcite multi-threading problem
>>> 
>>> 
>>> 
>>> Hi,
>>> Now we come to the  problem similar as the one at
>>> https://github.com/apache/calcite/pull/2433#issuecomment-860024076,
>>> 
>>> a simple sql "select max(id) from s.t1" caused the following exception,
>>> users report upgrading janino to 3.1.6 can fix this problem, but as we
>>> discussed before janino 3.1.6 is not fully compatible with calcite 1.29.0
>>> because of the dependency issue.
>>> 
>>> 
>>> 
>>>at org.apache.calcite.avatica.Helper.wrap(Helper.java:37)
>>> ~[avatica-core-1.20.0.jar:1.20.0]
>>>at
>>> org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:130)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>at
>>> org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1130)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>at
>>> org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.prepare_(CalcitePrepareImpl.java:1032)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>at
>>> org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.prepareRel(CalcitePrepareImpl.java:988)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>at
>>> org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:668)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>at
>>> org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:513)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>at
>>> org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:483)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>at
>>> org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:249)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>at
>>> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:229)
>>> ~[calcite-core-1.29.0.jar:1.29.0]
>>>... 13 common frames omitted
>>> Caused by: org.codehaus.commons.compiler.InternalCompilerException:
>>> Compiling "Baz" in File
>>> '/Users/linda/Downloads/janino/janino6606844895950068558.java', Line 1,
>>> Column 1: File
>>> '/Users/linda/Downloads/janino/janino6606844895950068558.java', Line 41,
>>> Column 45: Compiling "bind(org.apache.calcite.DataContext root)": File
>>> '/Users/linda/Downloads/janino/janino6606844895950068558.java', Line 60,
>>> Column 64
>>>at
>>> org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:369)
>>> ~[janino-3.1.4.jar:na]
>>>at
>>> org.codehaus.janino.UnitCompiler.access$000(UnitCompiler.java:231)
>>> ~[janino-3.1.4.jar:na]
>>>at
>>> org.codehaus.janino.UnitCompiler$1.visitCompilationUnit(UnitCompiler.java:333)
>>> ~[janino-3.1.4.jar:na]
>>>at
>>> 

Re: [Discuss] Converting SELECT Fields to STAR may cause duplicate column exception when executing ctas

2022-02-24 Thread Julian Hyde
Which code is generating this "SELECT *" query? It seems wrong to
generate "SELECT *". It should generate a list of columns with unique
names instead.

On Thu, Feb 24, 2022 at 3:41 AM Yanjing Wang  wrote:
>
> Hi community,
>
> I'm trying to convert a plan to a sql of SPARK dialect, but sometimes the
> fields will be converted to star, see the following example
>
> *create table users(id int);*
> *create table depts(id int);*
>
> *select a.id , b.id  as id0 from users1 a, depts1
> b*
>
> converting to plan results
> LogicalProject(id=[$0], id0=[$1])
>   LogicalJoin(condition=[true], joinType=[inner])
> HiveTableScan(table=[[default, users1]])
> HiveTableScan(table=[[default, depts1]])
>
> converting the plan to SPARK sql results
>
>
> *SELECT *FROM `default`.`users1`CROSS JOIN `default`.`depts1`*
>
> because PROJECT and JOIN row type are identical, the *PROJECT* has been
> converted to *SELECT ** .
>
> the problem is the SPARK sql can't be used to create table, such as
>
>
> *CREATE TABLE tmp AS SELECT *FROM `default`.`users1`CROSS JOIN
> `default`.`depts1`*
>
> this ctas will throw  '*Found duplicate column(s) in the table definition'*
>
> I wonder if we can add a config to indicate the *PROJECT *couldn't be
> converted to star.
>
> how do you think?


Re: [DISCUSS] Release Managers

2022-02-24 Thread Julian Hyde
Put me down for 1.33

On Thu, Feb 24, 2022 at 7:39 AM Jess Balint  wrote:
>
> Hi Stamatis,
>
> I can do 1.32.0 (or whatever makes sense).
>
> Jess
>
> On Thu, Feb 24, 2022 at 5:11 AM Stamatis Zampetakis 
> wrote:
>
> > Hi all,
> >
> > In the last 2 years or so we had the following people helping out with
> > releasing Calcite.
> >
> > 1.22.0 Danny Chan
> > 1.23.0 Haisheng Yuan
> > 1.24.0 Chunwei Lei
> > 1.25.0 Andrei Sereda
> > 1.26.0 Ruben Quesada Lopez
> > 1.27.0 Stamatis Zampetakis
> > 1.28.0 Julian Hyde
> > 1.29.0 Rui Wang
> >
> > Big thanks to everyone!
> >
> > Now we should see who can help out with the next releases. Any volunteers
> > for 1.32.0 to 1.34.0?
> >
> > 1.30.0 Andrei Seread
> > 1.31.0 Liya Fan
> > 1.32.0
> > 1.33.0
> > 1.34.0
> >
> > Best,
> > Stamatis
> >


Re: dynamic reflective schema

2022-02-24 Thread Julian Hyde
Yes. As I said, DOUBLE is basically a synonym for FLOAT in SQL. 

Julian

> On Feb 24, 2022, at 06:13, Stamatis Zampetakis  wrote:
> 
> The JDBC specification [1] defines the mapping between JDBC/SQL types to
> Java types (check TABLE B-1 JDBC Types Mapped to Java Types).
> 
> Best,
> Stamatis
> 
> [1] https://jcp.org/aboutJava/communityprocess/mrel/jsr221/index3.html
> 
>> On Thu, Feb 24, 2022 at 1:48 PM xiaobo  wrote:
>> 
>> What we want to do is mapping Java LocalTime to SQL Time, from the csv
>> adapter example we know that we should pass a long value for SQL Time, but
>> we can't convert a Java LocalTime to Java Date to call it's getTime method
>> like the CSV adapter, then what value should we calculate from  Java
>> LocalTime  for SQL Time.
>> 
>> 
>> 
>> 
>> -- Original --
>> From:  "xiaobo ";;
>> Send time: Thursday, Feb 24, 2022 12:26 PM
>> To: "dev";
>> 
>> Subject:  Re: dynamic reflective schema
>> 
>> 
>> 
>> our adapter now supports
>> string,boolean,bigdecimal,long,localdate,localdatetime, and we want to
>> support localtime too, what value should we convert from localtime data ,
>> the total seconds since the start of a day?
>> 
>> 
>> ---Original---
>> From: "xiaobo "
>> Date: Thu, Feb 24, 2022 12:20 PM
>> To: "dev";
>> Subject: Re: dynamic reflective schema
>> 
>> yes, we call the
>> getRowType(RelDataTypeFactory typeFactory) method in scan method , is this
>> the correct way?
>> 
>> 
>> 
>> ---Original---
>> From: "Gavin Ray"
>> Date: Thu, Feb 24, 2022 00:21 AM
>> To: "dev";
>> Subject: Re: dynamic reflective schema
>> 
>> Xiabo,
>> 
>> Your code seems to work for me, I ran the test case and it passes, see the
>> image below:
>> https://i.imgur.com/D0ajyQH.png
>> 
>> 
>> On Wed, Feb 23, 2022 at 2:33 AM xiaobo 
>> wrote:
>> 
>>> we have put the test coding on github
>>> 
>>> 
>>> 
>> https://github.com/guxiaobo/calcite-json-adapter/tree/main/src/test/java/org/apache/calcite/adapter/json/test
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- Original --
>>> From:  "xiaobo ";;
>>> Send time: Wednesday, Feb 23, 2022 3:11 PM
>>> To: "dev";
>>> 
>>> Subject:  Re:  dynamic reflective schema
>>> 
>>> 
>>> 
>>> here is the full output with calcite.debug=true, we know the problem is
>>> because the  getRowType(RelDataTypeFactory typeFactory) method is not
>>> called by the framework, but don't know why.
>>> 
>>> 
>>> 2022-02-23 15:02:42.228  INFO 78585 --- [   main]
>>> c.x.javatest.JavaTestApplication : Starting JavaTestApplication
>>> using Java 11.0.13 on LindadeMacBook-Air.local with PID 78585
>>> 
>> (/Users/linda/Documents/Github/sw-smart-risk/sw-pbcm/sw-pbcm-test/java-test-proj/target/classes
>>> started by linda in
>>> 
>> /Users/linda/Documents/Github/sw-smart-risk/sw-pbcm/sw-pbcm-test/java-test-proj)
>>> 2022-02-23 15:02:42.234  INFO 78585 --- [   main]
>>> c.x.javatest.JavaTestApplication : No active profile set, falling
>>> back to default profiles: default
>>> 2022-02-23 15:02:43.230  WARN 78585 --- [   main]
>>> o.m.s.mapper.ClassPathMapperScanner  : No MyBatis mapper was found in
>>> '[com.xsmartware.javatest]' package. Please check your configuration.
>>> 2022-02-23 15:02:44.064  INFO 78585 --- [   main]
>>> o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with
>> port(s):
>>> 8080 (http)
>>> 2022-02-23 15:02:44.082  INFO 78585 --- [   main]
>>> o.apache.catalina.core.StandardService   : Starting service [Tomcat]
>>> 2022-02-23 15:02:44.082  INFO 78585 --- [   main]
>>> org.apache.catalina.core.StandardEngine  : Starting Servlet engine:
>> [Apache
>>> Tomcat/9.0.56]
>>> 2022-02-23 15:02:44.291  INFO 78585 --- [   main]
>>> o.a.c.c.C.[Tomcat].[localhost].[/]   : Initializing Spring embedded
>>> WebApplicationContext
>>> 2022-02-23 15:02:44.292  INFO 78585 --- [   main]
>>> w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext:
>>> initialization completed in 1970 ms
>>> 2022-02-23 15:02:45.374  INFO 78585 --- [   main]
>>> o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s):
>> 8080
>>> (http) with context path ''
>>> 2022-02-23 15:02:45.393  INFO 78585 --- [   main]
>>> c.x.javatest.JavaTestApplication : Started JavaTestApplication in
>>> 3.77 seconds (JVM running for 5.146)
>>> /*
>>> * Licensed to the Apache Software Foundation (ASF) under one or more
>>> * contributor license agreements.  See the NOTICE file distributed with
>>> * this work for additional information regarding copyright ownership.
>>> * The ASF licenses this file to you under the Apache License, Version
>> 2.0
>>> * (the "License"); you may not use this file except in compliance with
>>> * the License.  You may obtain a copy of the License at
>>> *
>>> * http://www.apache.org/licenses/LICENSE-2.0
>>> *
>>> * Unless required by applicable law or agreed to in writing, software
>>> * distributed under the License 

Re: dynamic reflective schema

2022-02-23 Thread Julian Hyde
SQL’s FLOAT type (aka DOUBLE) corresponds to Java’s double. 
SQL’s REAL type corresponds to Java’s float. 

JDBC method names (e.g. getFloat, getDouble) refer to the Java types. 

Julian

> On Feb 23, 2022, at 10:33 PM, xiaobo  wrote:
> 
> When trying to map Java Float Objects to SQL Float, we faild with the 
> following , it seems calcite is trying to convert Float objects to Double 
> even when we expect it to be a float ( we fetch it through getFloat method of 
> resultset ).
> 
> 
> java.lang.ClassCastException: class java.lang.Float cannot be cast to class 
> java.lang.Double (java.lang.Float and java.lang.Double are in module 
> java.base of loader 'bootstrap')
>at Baz$1$1.current(Unknown Source)
>at 
> org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.next(Linq4j.java:687)
>at 
> org.apache.calcite.avatica.util.IteratorCursor.next(IteratorCursor.java:46)
>at 
> org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:219)
>at 
> org.apache.calcite.adapter.json.test.BaseTest.exeGetFloat(BaseTest.java:135)
>at 
> org.apache.calcite.adapter.json.test.JsonSchemaTest.test10(JsonSchemaTest.java:61)
>at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
>at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
>at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
>at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>at 
> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
>at 
> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
>at 
> org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
>at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
>at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
> 
> 
> 
> 
> 
> 
> -- Original --
> From:  "xiaobo ";;
> Send time: Thursday, Feb 24, 2022 12:26 PM
> To: "dev"; 
> 
> Subject:  Re: dynamic reflective schema
> 
> 
> 
> our adapter now supports 
> string,boolean,bigdecimal,long,localdate,localdatetime, and we want to 
> support localtime too, what value should we convert from localtime data , the 
> total seconds since the start of a day?
> 
> 
> 
> ---Original---
> From: "xiaobo "
> Date: Thu, Feb 24, 2022 12:20 PM
> To: "dev";
> Subject: Re: dynamic reflective schema
> 
> 
> yes, we call the 
> getRowType(RelDataTypeFactory typeFactory) method in scan method , is this 
> the correct way?
> 
> 
> 
> 
> ---Original---
> From: "Gavin Ray"
> Date: Thu, Feb 24, 2022 00:21 AM
> To: "dev";
> Subject: Re: dynamic reflective schema
> 
> 
> Xiabo,
> 
> Your code seems to work for me, I ran the test case and it passes, see the
> image below:
> https://i.imgur.com/D0ajyQH.png
> 
> 
>> On Wed, Feb 23, 2022 at 2:33 

Re: Why are nested aggregations illegal? Best alternatives?

2022-02-23 Thread Julian Hyde
It’s a parser error. That usually means that the user has made a mistake. 

Try your SQL on another DB with JSON support before you declare this a Calcite 
bug. 

Julian

> On Feb 23, 2022, at 09:22, Gavin Ray  wrote:
> 
> No dice still unfortunately =/
> 
> If it's any easier, I put a self-contained single class reproduction on
> Github:
> https://github.com/GavinRay97/calcite-nested-json-subquery-bug/blob/main/src/test/java/com/example/AppTest.java
> 
> Is it worth filing a JIRA ticket over you think?
> 
> 
>> On Wed, Feb 23, 2022 at 3:45 AM Julian Hyde  wrote:
>> 
>> Try ‘value ((‘ in place of ‘value (‘.
>> 
>> Julian
>> 
>>>> On Feb 21, 2022, at 9:33 AM, Gavin Ray  wrote:
>>> 
>>> I hadn't thought about the fact that ORM's probably have to solve this
>>> problem as well
>>> That is a great suggestion, I will try to investigate some of the popular
>>> ORM codebases and see if there are any tricks they are using.
>>> 
>>> I seem to maybe be getting a tiny bit closer by using subqueries like
>>> Julian suggested instead of operator calls
>>> But if I may ask what is probably a very stupid question:
>>> 
>>> What might the error message
>>> "parse failed: Query expression encountered in illegal context
>>> (state=,code=0)"
>>> 
>>> Mean in the below query?
>>> 
>>> The reason why I am confused is because the query runs if I remove the
>>> innermost subquery ("todos")
>>> But the innermost subquery is a direct copy-paste of the subquery above
>> it,
>>> so I know it MUST be valid
>>> 
>>> As usual, thank you so much for your help/guidance Stamatis.
>>> 
>>> select
>>> "g0"."id" "id",
>>> "g0"."address" "address",
>>> (
>>>   select json_arrayagg(json_object(
>>> key 'id' value "g1"."id",
>>> key 'todos' value (
>>>   select json_arrayagg(json_object(
>>> key 'id' value "g2"."id",
>>> key 'description' value "g2"."description",
>>>   ))
>>>   from (
>>> select * from "todos"
>>> where "g1"."id" = "user_id"
>>> order by "id"
>>>   ) "g2"
>>> )
>>>   ))
>>>   from (
>>> select * from "users"
>>> where "g0"."id" = "house_id"
>>> order by "id"
>>>   ) "g1"
>>> ) "users"
>>> from "houses" "g0"
>>> order by "g0"."id"
>>> 
>>>> On Mon, Feb 21, 2022 at 8:07 AM Stamatis Zampetakis 
>>>> wrote:
>>>> 
>>>> Hi Gavin,
>>>> 
>>>> A few more comments in case they help to get you a bit further on your
>>>> work.
>>>> 
>>>> The need to return the result as a single object is a common problem in
>>>> object relational mapping (ORM) frameworks/APIS (JPA, Datanucleus,
>>>> Hibernate, etc.). Apart from the suggestions so far maybe you could look
>>>> into these frameworks as well for more inspiration.
>>>> 
>>>> Moreover your approach of decomposing the query into individual parts is
>>>> commonly known as the N+1 problem [1].
>>>> 
>>>> Lastly, keep in mind that you can introduce custom UDF, UDAF functions
>> if
>>>> you need more flexibility on reconstructing the final result.
>>>> 
>>>> Best,
>>>> Stamatis
>>>> 
>>>> [1]
>>>> 
>>>> 
>> https://stackoverflow.com/questions/97197/what-is-the-n1-selects-problem-in-orm-object-relational-mapping
>>>> 
>>>>> On Sun, Feb 13, 2022 at 3:59 AM Gavin Ray 
>> wrote:
>>>>> 
>>>>> Ah wait nevermind, got excited and spoke too soon. Looking at it more
>>>>> closely, that data isn't correct.
>>>>> At least it's in somewhat the right shape, ha!
>>>>> 
>>>>>> On Sat, Feb 12, 2022 at 9:57 PM Gavin Ray 
>> wrote:
>>>>>

Re: Why are nested aggregations illegal? Best alternatives?

2022-02-23 Thread Julian Hyde
"houses"
>>>>)
>>>>);
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> ++-++--+
>>>> | id | user_id |  description   |
>>>>|
>>>> 
>>>> 
>>> 
>> ++-++--+
>>>> | 1  | 1   | Take out the trash | [{1, John, 1, [{1, 1, Take
>> out
>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>> | 2  | 1   | Watch my favorite show | [{1, John, 1, [{1, 1, Take
>> out
>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>> | 3  | 1   | Charge my phone| [{1, John, 1, [{1, 1, Take
>> out
>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>> | 4  | 2   | Cook dinner| [{1, John, 1, [{1, 1, Take
>> out
>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>> | 5  | 2   | Read a book| [{1, John, 1, [{1, 1, Take
>> out
>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>> | 6  | 2   | Organize office| [{1, John, 1, [{1, 1, Take
>> out
>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>> | 7  | 3   | Walk the dog   | [{1, John, 1, [{1, 1, Take
>> out
>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>> | 8  | 3   | Feed the cat   | [{1, John, 1, [{1, 1, Take
>> out
>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>> 
>>>> 
>>> 
>> ++-++------+
>>>> 
>>>> On Sat, Feb 12, 2022 at 4:13 PM Gavin Ray 
>> wrote:
>>>> 
>>>>> Nevermind, this is a standard term not something Calcite-specific it
>>>>> seems!
>>>>> 
>>>>> https://en.wikipedia.org/wiki/Correlated_subquery
>>>>> 
>>>>> On Sat, Feb 12, 2022 at 3:46 PM Gavin Ray 
>>> wrote:
>>>>> 
>>>>>> Forgive my ignorance/lack of experience
>>>>>> 
>>>>>> I am somewhat familiar with the ARRAY() function, but not sure I know
>>>>>> the term "correlated"
>>>>>> Searching the Calcite codebase for uses of "correlated" + "query", I
>>>>>> found:
>>>>>> 
>>>>>> 
>>>>>> 
>>> 
>> https://github.com/apache/calcite/blob/1d4f1b394bfdba03c5538017e12ab2431b137ca9/core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java#L1603-L1612
>>>>>> 
>>>>>>  @Test void testCorrelatedSubQueryInJoin() {
>>>>>>final String sql = "select *\n"
>>>>>>+ "from emp as e\n"
>>>>>>+ "join dept as d using (deptno)\n"
>>>>>>+ "where d.name = (\n"
>>>>>>+ "  select max(name)\n"
>>>>>>+ "  from dept as d2\n"
>>>>>>+ "  where d2.deptno = d.deptno)";
>>>>>>sql(sql).withExpand(false).ok();
>>>>>>  }
>>>>>> 
>>>>>> But I also see this, which says it is "uncorrelated" but seems very
>>>>>> similar?
>>>>>> 
>>>>>>  @Test void testInUncorrelatedSubQuery() {
>>>>>>final String sql = "select empno from emp where deptno in"
>>>>>>+ " (select deptno from dept)";
>>>>>>sql(sql).ok();
>>>>>>  }
>>>>>> 
>>>>>> I wouldn't blame you for not answering such a basic question -- but
>>> what
>>>>>> exactly does "correlation" mean here?
>>>>>> 
>>>>>> Thanks, as usual Julian
>>>>>> 
>>>>>> 
>>>>>> On Sat, Feb 12, 2022 at 3:08 PM Julian Hyde 
>>>>>> wrote:
>>>>>> 
>>>>>>> Correlated ARRAY sub-query?
>>>>>>> 
>>>>>>>> On Feb 12, 2022, at 10:40 AM, Gavi

Re: DDL support issues in Calcite

2022-02-22 Thread Julian Hyde
Looks like you need to be using ServerDdlExecutor rather than 
DdlExecutor.USELESS.

Therefore try using org.apache.calcite.server.ServerDdlExecutor#PARSER_FACTORY 
rather than org.apache.calcite.sql.parser.ddl.SqlDdlParserImpl#FACTORY in your 
connect string.


> On Feb 21, 2022, at 12:17 AM, wang...@mchz.com.cn wrote:
> 
> Dear Calcite community:
> 
> I am new Calcite user here. I noted in the document 
> https://calcite.apache.org/docs/adapter.html that Calcite does support DDL 
> operations,the only thing I need to do is 1) to include calcite-server.jar in 
> my classpath and 2) add 
> parserFactory=org.apache.calcite.sql.parser.ddl.SqlDdlParserImpl#FACTORY to 
> JDBC connect string. 
> Now the problem is: I did so, but still I got the following error:
> 
> 
> java.sql.SQLException: Error while executing SQL "CREATE TABLE t (i INTEGER, 
> j VARCHAR(10))": DDL not supported: CREATE TABLE `T` (`I` INTEGER, `J` 
> VARCHAR(10)) at 
> org.apache.calcite.avatica.Helper.createException(Helper.java:56) at 
> org.apache.calcite.avatica.Helper.createException(Helper.java:41) at 
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163)
>  at 
> org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
>  at com..CalciteResolverTest.testJdbc(CalciteResolverTest.java:259) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:568) at 
> junit.framework.TestCase.runTest(TestCase.java:177) at 
> junit.framework.TestCase.runBare(TestCase.java:142) at 
> junit.framework.TestResult$1.protect(TestResult.java:122) at 
> junit.framework.TestResult.runProtected(TestResult.java:142) at 
> junit.framework.TestResult.run(TestResult.java:125) at 
> junit.framework.TestCase.run(TestCase.java:130) at 
> junit.framework.TestSuite.runTest(TestSuite.java:241) at 
> junit.framework.TestSuite.run(TestSuite.java:236) at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:90) 
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>  at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:221)
>  at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) Caused by: 
> java.lang.UnsupportedOperationException: DDL not supported: CREATE TABLE `T` 
> (`I` INTEGER, `J` VARCHAR(10)) at 
> org.apache.calcite.server.DdlExecutor.lambda$static$0(DdlExecutor.java:28) at 
> org.apache.calcite.prepare.CalcitePrepareImpl.executeDdl(CalcitePrepareImpl.java:369)
>  at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:634)
>  at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:513)
>  at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:483)
>  at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:249)
>  at 
> org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:623)
>  at 
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:674)
>  at 
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>  ... 20 more
> 
> did I miss any other properties that needs to be set? 
> 
> 
> 
> 
> Many thanks
> Sai Wang



Re: dynamic reflective schema

2022-02-22 Thread Julian Hyde
Every time I see "at Baz.bind” I am reminded that I should have chosen a better 
name for that class than ‘Baz’. :)

I figured ‘it’s generated, no one is ever going to know its name’. How wrong I 
was. The stack traces are all over the internet.

Julian


> On Feb 22, 2022, at 3:32 PM, Scott Reynolds  wrote:
> 
> Whenever I see "at Baz.bind(Unknown Source)" I am reminded how important it
> is, when debugging, to set the System Property calcite.debug = true
> https://github.com/apache/calcite/blob/a8a6569e6ba75efe9d5725c49338a7f181d3ab5c/core/src/main/java/org/apache/calcite/config/CalciteSystemProperty.java#L59
> 
> That will get you a good stack trace and will help you figure out the
> generated code is attempting to do.
> 
> On Tue, Feb 22, 2022 at 5:35 AM xiaobo  wrote:
> 
>> we have static data conbination use cases , reflective shcema is the
>> best choice!
>> 
>> 
>> 
>> ---Original---
>> From: "Gavin Ray"> Date: Mon, Feb 21, 2022 01:52 AM
>> To: "dev"> Subject: Re: dynamic reflective schema
>> 
>> 
>> Ah, you don't want to use ReflectiveSchema, it's a simple schema type meant
>> for easily making test schemas
>> 
>> You want to extend from "AbstractSchema" and override the function
>> "Map> For the "Table" type, you probably want to use "JsonScannableTable"
>> 
>> The CsvSchema example does exactly this, if you want to see an example
>> implementation:
>> 
>> https://github.com/apache/calcite/blob/4bc916619fd286b2c0cc4d5c653c96a68801d74e/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchema.java#L69-L106
>> 
>> Hope this helps =)
>> 
>> 
>> 
>> On Sat, Feb 19, 2022 at 11:03 PM xiaobo > wrote:
>> 
>>  Hi,
>>  When using reflectiveSchema we must define a static Java class
>> for the
>>  schema object, but in our use case the actual data to query is
>> dynamic,
>>  define JAVA class for each data combination is impossible, we have an
>> idea
>>  that can we make a JSONSchema which accepts a Map>  List> , and each
>>  JSONObject mapped to a row, JSONObject keys will be column names. Is
>> there
>>  anything similar with this?
>> 
>> 
>>  Regards



Re: CyclicMetadataException

2022-02-17 Thread Julian Hyde
Planning is not thread-safe. For a given RelOptCluster, only one thread should 
be planning.

If there are multiple queries (each with their own RelOptCluster) then they can 
safely be planned on separate threads.

It’s possible that there are issues if queries are sharing materialized views. 
The RelNodes that define the materialized views are defined before the query is 
created, and therefore use a different RelOptCluster than the query, and 
potentially the same RelOptCluster as each other. If you think you are running 
into this, let’s dig deeper.

Julian
   

> On Feb 17, 2022, at 6:31 AM, Boyan Kolev  wrote:
> 
> Hello,
> 
> I'm coming across this exception when Calcite prepares queries in parallel.
> I see this happens if the same query is getting planned by two threads at
> the same time,
> but I also have the impression that this may also happen when the two
> queries being planned in parallel are not exactly the same but have common
> fragments of the logical plan.
> 
> Since this is hard to reproduce and debug as it happens rarely in a
> multithreaded setup,
> I'd like to have some workaround of the problem.
> 
> What would be a good practice?
> For example, if the exception is caught and then the planning re-attempted
> (a few times if needed),
> would that significantly decrease the probability of running into that
> exception?
> 
> Thanks in advance for any help!
> 
> --
> Boyan Kolev



Re: Pushing filters with unsafe operations (CAST/DIVISION) below inner joins

2022-02-17 Thread Julian Hyde
You should definitely log a bug, even if there are no plans to fix it. It is a 
violation of the standard, and therefore it is a bug, and therefore we should 
document that it exists.

Can you do some research on the right terminology. You use the term ‘unsafe’. I 
think the PL community uses the term ’strict’ for expressions that must be 
evaluated (and therefore if they throw, the query is sure to abort). We have 
other related concepts floating around in Calcite - dynamic functions, 
deterministic operators - and standardizing terminology is key if we are to 
keep the menagerie in order.

There might be a way to have our cake and eat it too. We could push down the 
division, and if there is a division-by-zero we return a special value such as 
NULL. If the special value is not eliminated by the join then it is promoted to 
a throw. The “cake” here is the performance benefit of pushing down a filter 
(that may never or rarely throw); the “eating it” is safety on the occasion 
that the filter does throw.

Even if that strategy doesn’t work, maybe we could have a flag that says 
whether to push down conditions that might throw.

Julian


> On Feb 17, 2022, at 8:07 AM, Scott Reynolds  wrote:
> 
> Is it feasible to prevent the filter push down in cases like this ( detect
> potential division by zero) ? What other exceptions that could happen?
> 
> Should it only push down some filters and preserve the complex filter above
> the join?
> 
> Regarding a Jira ticket, generating an exception when it should produce
> valid results is gotta be a Jira case IMHO.
> 
> 
> 
> 
> On Thu, Feb 17, 2022, 7:17 AM Stamatis Zampetakis  wrote:
> 
>> Hi all,
>> 
>> Till today, I believed that pushing a filter below an inner join is not
>> strictly related to the operators (equals, plus, minus, cast, division)
>> present in the filter.
>> 
>> However, the query below will return some results if the filter is executed
>> after the join or it will fail with an exception if the filter is pushed
>> below the join.
>> 
>> EMP [EMPNO, NAME, DEPTNO]
>> 0   | Alex | 0
>> 10 | Bob | 1
>> 
>> DEP [DEPTNO]
>> 1
>> 
>> SELECT e.name
>> FROM emp e
>> INNER JOIN dep d ON e.deptno = d.deptno
>> WHERE (10 / e.empno) = 1
>> 
>> It seems that SQL standard (Section 7.4 general rule 1) mandates that WHERE
>> should be applied to the result of FROM so in the case above pushing
>> filters below the join seems to violate the standard.
>> 
>> Currently, in Calcite we are going to push the division, cast, and
>> basically any kind of operator below the join. Many well-known databases
>> e.g., Postgres, do that as well.
>> 
>> Should we log it as a bug and try to do something about it or let it be as
>> it is?
>> 
>> Best,
>> Stamatis
>> 



Re: Identifier starting with number parse fail

2022-02-16 Thread Julian Hyde
+1 Babel is an appropriate place to support this. (It would not be practical to 
implement in parser config, because tokenization needs to be efficient. It is 
better done in a separate parser, which is what Babel has.) 

XuRenhe, Please log a feature request in Jira to support MySQL-style 
identifiers in Babel.

I know that you were talking about unquoted identifiers, but note that 
identifiers can start with numerics in Calcite and standard SQL provided that 
you quote them, e.g.

  SELECT deptno AS “12c3” FROM emp

Julian


> On Feb 16, 2022, at 3:02 AM, xiong duan  wrote:
> 
> I do some search in:
> 
> PostgreSQL: https://www.postgresql.org/docs/9.1/sql-syntax-lexical.html
> > don't support
> SqlServer:
> https://docs.microsoft.com/en-us/sql/relational-databases/databases/database-identifiers?view=sql-server-ver15
> >don't support
> Oracle: https://docs.oracle.com/javadb/10.6.2.1/ref/crefsqlj1003454.html
> ->don't support
> 
> So If we need to support this. The Babel module is appropriate.
> 
> Ruben Q L  于2022年2月16日周三 18:31写道:
> 
>> Hello XuRenhe,
>> 
>> Please be aware that our dev list does not allow images.
>> 
>> Regarding your question, I found this on the SQL standard:
>> 
>> « An  is any character in the Unicode General Category
>> classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
>> NOTE 95 — The Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”,
>> “Lo”, and “Nl” are assigned to Unicode characters that are, respectively,
>> upper-case letters, lower case letters, title-case letters, modifier
>> letters, other letters, and letter numbers. »
>> 
>> In case it helps, I found in Wikipedia a link to an older version of the
>> standard [1] which basically says the same thing (page 85).
>> 
>> To sum up, identifiers must start with a letter, so Calcite is respecting
>> the standard (and it seems MySql deviates from it).
>> 
>> I am not sure if this is something that could (or should) be achieved with
>> the Calcite SqlDialect mechanism; I am not very familiar with it, maybe
>> someone else can provide more information.
>> 
>> Best regards,
>> Ruben
>> 
>> [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
>> 
>> 
>> On Wed, Feb 16, 2022 at 8:18 AM 徐仁和  wrote:
>> 
>>> Hi community:
>>> 
>>> Parsing sql fails, when meeting the identifier starting with number.
>>> 
>>> Here, test case:
>>> // org.apache.calcite.test.SqlToRelConverterTest
>>> @Test void testIdentifierStartWithNumber() {
>>>  final String sql =
>>>  "select 1_c1 from (select deptno as 1_c1 from emp)";
>>>  sql(sql).ok();
>>> }
>>> 
>>> It throws an exception as below:
>>> java.lang.RuntimeException:
>>> org.apache.calcite.sql.parser.SqlParseException: Encountered "1" at line
>> 1,
>>> column 36.
>>> Was expecting one of:
>>> ...
>>> ...
>>> ...
>>> ...
>>> ...
>>> ...
>>> ...
>>> ...
>>> 
>>> at
>>> 
>> org.apache.calcite.sql.parser.SqlParseException.writeReplace(SqlParseException.java:171)
>>> 
>>> *Mysql supports this case, should calcite support it?*
>>> 
>>> *Failed png in Calcite:*
>>> [image: image.png]
>>> 
>>> *Successful png in MySql:*
>>> [image: image.png]
>>> 
>>> 
>>> 
>>> Best XuRenhe
>>> 
>>> 
>> 



Re: Allow Cascades driver invoking "derive" on the nodes produced by "passThrough"

2022-02-15 Thread Julian Hyde
> So the idea of keeping even internal components of Calcite "more public" is 
> rather a good thing than the bad one from my point of view.

This can go two ways.

There is the type of change that I call “drill a hole” where someone requests 
that an implementation detail be made public so that they can use it, but they 
don’t provide much justification, and don’t provide a use case that might be 
useful to others.

The other is where someone converts the internal mechanism into a genuine 
feature, with a use case, documentation, and comprehensive tests that help 
others understand and use the feature.

The “drill a hole” changes are a net burden on the Calcite project because we 
have to maintain the implementation going forward.

I’m not going to make a judgment about whether this proposed change is 
positive. The key thing is to think of it as adding a feature, not just adding 
the word “public” in one or two source files.

Julian



Re: [DISCUSS] Towards Calcite 1.30.0

2022-02-15 Thread Julian Hyde
+1 to release 1.30 soon, to keep up the tempo. Feb 28th sounds good to
me. Thanks for pushing on this, Ruben.

Andrei, Are you available to be RM for 1.30? When is a good date for you?

Julian

On Mon, Feb 14, 2022 at 9:16 AM Ruben Q L  wrote:
>
> Hello,
>
> It has been almost two months since our last release [1], and I think we
> should make an effort to continue keeping the rhythm of one release every
> two months approximately.
>
> If I am not mistaken, the next release managers would be [2]:
> - Andrei Sereda for 1.30.0
> - Liya Fan for 1.31.0
>
> Are you guys still able to act as release managers for these versions?
> Do we have any other volunteers for the subsequent releases?
>
> As usual, according to our Jira dashboard [3] and Github [4], there are
> many pending issues that could / should be part of the release. What do you
> think would be a reasonable date to propose for a RC? Perhaps 28th
> February? I would like to let us have plenty of time to collectively review
> the PR candidates and "unblock" (to some extent) the PR situation (see also
> the recent separate thread on that matter [5]).
>
> Best regards,
> Ruben
>
>
> [1] https://calcite.apache.org/news/2021/12/25/release-1.29.0/
> [2] https://lists.apache.org/thread/kqslh8yqvxbolk74vov3ofw19h90xkp2
> [3]
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333950
> [4] https://github.com/apache/calcite/pulls
> [5] https://lists.apache.org/thread/30pf1o0vlcn7y3bhlcht1wdmvmxyvghn


Re: Failed to import gradle

2022-02-15 Thread Julian Hyde
Thanks for the update!

> On Feb 14, 2022, at 11:46 PM, Aitozi  wrote:
> 
> Hi Julian
> 
>Sorry for the late response, I tried the solution posted by Gravin, The
> IDE can work with calcite project now. The changed files are
> 
>modified:   .idea/vcs.xml
>modified:   build.gradle.kts
>modified:   buildSrc/build.gradle.kts
>modified:   core/build.gradle.kts
>modified:   gradle.properties
>modified:   release/build.gradle.kts
> 
> I want to reproduce it by git stash and reopen the project, but it failed
> to reproduce.
> 
> The error log before is :
> 
> Unresolved reference: implementation :29
> Unresolved reference: implementation :30
> Unresolved reference: testimplementation:32
> Unresolved reference: testImplementation:33
> Unresolved reference: testimplementation:34
> Unresolved reference: testimplementation:35
> Unresolved reference: testimplementation:36
> Unresolved reference: testRuntime Only :38
> Unresolved reference: source Sets > Configure project :cassandra:68
> ...
> 
> So now the problem is magically disappeared. If I can reproduce again, I
> will create a ticket for it.
> 
> Thanks
> Aitozi
> 
> 
> Gavin Ray  于2022年2月9日周三 23:53写道:
> 
>> Submitted a PR and Jira issue for this =)
>> 
>> On Thu, Jan 27, 2022 at 6:44 PM Julian Hyde 
>> wrote:
>> 
>>> Can someone log a bug for this?
>>> 
>>> I don’t run into it personally. (I just click open on the directory. I
>>> never invoke ‘import gradle’.) But people shouldn’t have to edit their
>>> Gradle files.
>>> 
>>> 
>>> 
>>>> On Jan 27, 2022, at 6:55 AM, Gavin Ray  wrote:
>>>> 
>>>> It didn't work for me either.
>>>> 
>>>> I had to remove some of the Gradle tasks, related to checks IIRC.
>>>> What I did that made it work was:
>>>> 
>>>> 1. Modify "gradle.properties" in the root of the repo, adding the
>>> following
>>>> line:
>>>> 
>>>> skipAutostyle=true
>>>> 
>>>> 2. Modify every instance of
>>>>   "val skipAutostyle by props()"
>>>>   to
>>>>  "val skipAutostyle by props(true)"
>>>>   in the Gradle build files
>>>> 
>>>> (This line exists in "build.gradle.kts" and
>> "buildSrc/build.gradle.kts")
>>>> 
>>>> 3. Commented out the CheckStyle task in "core/build.gradle.kts"
>>>> 
>>>> //tasks.withType().configureEach {
>>>> //mustRunAfter(versionClass)
>>>> //mustRunAfter(javaCCMain)
>>>> //mustRunAfter(javaCCTest)
>>>> //}
>>>> 
>>>> 4. Commented out "releaseArtifacts" plugin in
>> "release/build.gradle.kts"
>>>> 
>>>> //releaseArtifacts {
>>>> //artifact(archiveTask)
>>>> //}
>>>> 
>>>> 
>>>> My git status looks like:
>>>> 
>>>> user@MSI:~/projects/calcite$ git status
>>>>   modified:   build.gradle.kts
>>>>   modified:   buildSrc/build.gradle.kts
>>>>   modified:   core/build.gradle.kts
>>>>   modified:   gradle.properties
>>>>   modified:   release/build.gradle.kts
>>>> 
>>>> Hope this helps =)
>>>> 
>>>> On Thu, Jan 27, 2022 at 4:03 AM Thomas Rebele
>> >>> 
>>>> wrote:
>>>> 
>>>>> Hello,
>>>>> 
>>>>> the Calcite mailing list removes pictures. Could you please post the
>>> error
>>>>> message in text form?
>>>>> 
>>>>> Cordialement / Best Regards,
>>>>> *Thomas Rebele, PhD* | R Developer | Germany | www.tibco.com
>>>>> 
>>>>> 
>>>>> On Thu, Jan 27, 2022 at 7:38 AM Aitozi  wrote:
>>>>> 
>>>>>> Hi community:
>>>>>>   I'm new to calcite community and gradle. When I followed the
>>> Howto.md
>>>>>> to import the calcite project, It always failed with the following
>>> error.
>>>>>> After some search in Google and calcite dev mail list, I stiil can't
>>> not
>>>>>> figure it out. Is there anyone can help me out? thanks in advance
>>>>>> 
>>>>>> [image: image.png]
>>>>>> The version of the intellij IDEA is 2021.1.3.
>>>>>> 
>>>>>> Atiozi.
>>>>>> 
>>>>> 
>>> 
>>> 
>> 



Re: Why are nested aggregations illegal? Best alternatives?

2022-02-12 Thread Julian Hyde
Correlated ARRAY sub-query?

> On Feb 12, 2022, at 10:40 AM, Gavin Ray  wrote:
> 
> Apologies for the delay in replying
> 
> This makes things clear and seems obvious now that you point it out.
> Thank you, Justin and Julian =)
> 
> Let me ask another question (if I may) that I am struggling to phrase
> easily.
> 
> So with GraphQL, you might have a query like:
> - "Get houses"
> - "For each house get the user that lives in the house
> - "And for each user get their list of todos"
> 
> The result has to come back such that it's a single object for each row
> ==
> {
>  houses: [{
>address: "123 Main Street",
>users: [{
>  name: "Joe",
>  todos: [{
>description: "Take out trash"
>  }]
>}]
> }
> 
> From a SQL perspective, the logical equivalent would be something like:
> ==
> SELECT
>house.address,
>(somehow nest users + double-nest todos under user)
> FROM
>house
> JOIN
>user ON user.house_id = house.id
>todos ON todos.user_id = user.id
> WHERE
>house.id = 1
> 
> I'm not familiar enough with SQL to have figured out a way to make this
> kind of
> query using operators that are supported across most of the DB's Calcite has
> adapters for.
> 
> Currently what I have done instead, on a tip from Gopalakrishna Holla from
> LinkedIn Coral team who has built GraphQL-on-Calcite, was to break up the
> query
> into individual parts and then do the join in-memory:
> 
> SELECT ... FROM users;
> SELECT ... FROM todos WHERE todos.user_id IN (ResultSet from prev response);
> 
> However, the way I am doing this seems like it's probably very inefficient.
> Because I do a series of nested loops to add the key to each object in the
> parent ResultSet row:
> 
> https://github.com/GavinRay97/GraphQLCalcite/blob/648e0ac4f6810a3c360d13a03e6597c33406de4b/src/main/kotlin/TableDataFetcher.kt#L135-L153
> 
> Is there some better way of doing this?
> I would be eternally grateful for any advice.
> 
> 
> On Thu, Feb 10, 2022 at 3:27 PM Julian Hyde  wrote:
> 
>> Yes, if you want to do multiple layers of aggregation, use CTEs (WITH) or
>> nested sub-queries. For example, the following is I believe valid standard
>> SQL, and actually computes something useful:
>> 
>>  WITH q1 AS
>>   (SELECT deptno, job, AVG(sal) AS avg_sal
>>FROM emp
>>GROUP BY deptno, job)
>>  WITH q2 AS
>>   (SELECT deptno, AVG(avg_sal) AS avg_avg_sal
>>FROM q1
>>GROUP BY deptno)
>>  SELECT AVG(avg_avg_sal)
>>  FROM q2
>>  GROUP BY ()
>> 
>> (You can omit the “GROUP BY ()” line, but I think it makes things clearer.)
>> 
>> Julian
>> 
>> 
>> 
>>> On Feb 10, 2022, at 12:17 PM, Justin Swanhart 
>> wrote:
>>> 
>>> I wish you could unsend emails :)  Answering my own question, no, because
>>> that would return three rows with the average :D
>>> 
>>> On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart 
>> wrote:
>>> 
>>>> Just out of curiosity, is the second level aggregation using AVG in a
>>>> window context?  It the frame is the whole table and it aggregates over
>> it?
>>>> 
>>>> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart 
>>>> wrote:
>>>> 
>>>>> That is really neat about Oracle.
>>>>> 
>>>>> The alternative in general is to use a subquery:
>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>>>> becomes
>>>>> select avg(the_avg)
>>>>> from (select avg(sal) from emp group b deptno) an_alias;
>>>>> 
>>>>> or
>>>>> 
>>>>> with the_cte as (select avg(sal) x from emp group by deptno)
>>>>> select avg(x) from the_cte;
>>>>> 
>>>>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde 
>>>>> wrote:
>>>>> 
>>>>>> Some databases, e.g. Oracle, allow TWO levels of nesting:
>>>>>> 
>>>>>> SELECT avg(sal) FROM emp GROUP BY deptno;
>>>>>> 
>>>>>> AVG(SAL)
>>>>>> 
>>>>>> 1,566.67
>>>>>> 2,175.00
>>>>>> 2,916.65
>>>>>> 
>>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>>>>> 
>>>>>> AVG(SUM(SAL))
>>>>>> =
>>>>>>   9,675
>>>>&g

Re: calcite multi-threading problem

2022-02-11 Thread Julian Hyde
xiaobo,

Your emails are coming through with a lot of HTML escape characters [1]. To 
make them more readable for everyone else, could you perhaps try sending in 
“Plain text” mode?

Julian

[1] https://lists.apache.org/thread/ttrlw4c441br27lppxl77osbgxrv58wn

> On Feb 11, 2022, at 5:36 PM, xiaobo  wrote:
> 
> can you share the solution for the problem, we are using the following 
> libraries, and it seems janino 3.1.6 is only dependent by calcite-core.
> 
> 
>  
> 
> 
> 
> 
>
>
>
> 
> 
> 
> 
>   
> 
> 
> 
> 
> 
> 
>  
> 
> 
>   
> 
> 
>   
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> --Original--
> From: "xiong duan"; Send time:Saturday, Feb 12, 2022 8:43 AM
> To:"dev" 
> Subject: Re: calcite multi-threading problem
> 
> 
> 
> Yes, It is a dependency issue. I have met the same problem. +1 for Dmitry.
> 
> Dmitry Sysolyatin  
>  Actually, I had the same problem with spark. Spark 3.2.1 uses the old
>  version of janino, calcite 1.29 uses the new version of janino and they 
> are
>  not compatible.
>  If I downgrade janino version to version which spark uses I got an error
>  not method found exception
>  If I upgrade the janino version to the version which calcite uses I get 
> the
>  same error but with another method.
> 
>  As a result, I built a separate calcite jar package where I shaded janino
>  dependency
> 
> 
> 
>  On Fri, Feb 11, 2022 at 2:39 PM stanilovsky evgeny <
>  estanilovs...@gridgain.com wrote:
> 
>   for example on a current calcite branch:
>   grep janino gradle.properties
>   janino.version=3.1.6
>  
>  
>yes, we use drools in the same project,nbsp; and drools 
> uses janino
>too, is there a version list of janino which calcite supports?
>   
>   
>   
>   
>---Original---
>From: "Dmitry Sysolyatin"Date: Fri, Feb 11, 2022 18:30 PM
>To: "dev"Subject: Re: calcite multi-threading problem
>   
>   
>Hi!
>   
>  
>  
> Itnbsp;looksnbsp;likenbsp;anbsp;dependencynbsp;issue,nbsp;notnbsp;likenbsp;anbsp;multi-threadingnbsp;issue.nbsp;Most
>   
>  
>  
> likelynbsp;younbsp;havenbsp;somenbsp;dependencynbsp;thatnbsp;usenbsp;anothernbsp;versionnbsp;ofnbsp;janinonbsp;but
>calcitenbsp;requiresnbsp;anothernbsp;one
>   
>   
>  
>  
> Onnbsp;Fri,nbsp;Febnbsp;11,nbsp;2022nbsp;atnbsp;12:18nbsp;PMnbsp;xiaobonbsp;   gt;nbsp;wrote:
>   
>gt;nbsp;Hi,
>gt;
>gt;
>   
>  
>  
> gt;nbsp;wenbsp;opennbsp;anbsp;calcitenbsp;connectionnbsp;withnbsp;ReflectiveSchemanbsp;innbsp;onenbsp;thread,nbsp;and
>   
>  
>  
> gt;nbsp;executenbsp;sqlsnbsp;innbsp;othernbsp;nbsp;threadsnbsp;againstnbsp;thenbsp;connection,nbsp;butnbsp;failednbsp;with
>gt;nbsp;thisnbsp;error
>gt;
>gt;
>gt;
>gt;
>
> gt;nbsp;com.google.common.util.concurrent.ExecutionError:
>gt;nbsp;java.lang.NoSuchMethodError:
>gt;
>   
>  
>  
> gt;nbsp;org.codehaus.commons.compiler.CompilerFactoryFactory.getDefaultCompilerFactory(Ljava/lang/ClassLoader;)
>gt;
>
> gt;nbsp;Lorg/codehaus/commons/compiler/ICompilerFactory;
>gt;
>gt;
>   
>  
>  
> gt;nbsp;donbsp;wenbsp;missnbsp;anythingnbsp;regardingnbsp;tonbsp;multi-threadingnbsp;withnbsp;calcite?
>  
> 



Re: [DISCUSS] Assigning reviewers

2022-02-10 Thread Julian Hyde
I agree that automatic assignment is an imperfect tool. It’s difficult to know 
what functional area (let alone module) a bug or feature belongs to, and even 
then, who is the “owner” of that area. But it’s worth a try.

We have quite a good process going for release managers: people sign up in 
advance for a particular release (which I’d estimates takes between 1 and 3 
days, on average) and they deliver on their commitment.

Could we emulate that for reviewers? Suppose that six committers volunteer to 
review and merge up to six PRs per quarter. When they’ve done their six, we 
stop assigning them new ones. Reviewers can re-assign to another reviewer, but 
if you’re the assigned reviewer the ball is in your court to drive the PR to 
completion if the contributor makes any reasonable changes you ask for.

In addition, we should consider appointing a ‘controller’ who assigns PRs to 
reviewers when they arrive. We could rotate the controller so that no one 
serves for more than a few weeks.

Assigning people to tasks may seem antithetical to the spirit of open source. 
But it’s what effective software development teams do. It’s not creating more 
work, it’s just apportioning that work more fairly, and making things more 
predictable for our the people who depend on us. 

Julian
 

> On Feb 7, 2022, at 1:56 PM, Stamatis Zampetakis  wrote:
> 
> Hello,
> 
> I think the main problem is still the lack of active committers/reviewers
> and not so much the assignment.
> 
> Nevertheless, things would be a bit better if we had people assigned to
> PRs. This could certainly help at least offload some work from Julian and
> possibly sensibilize more people towards the reviewing process.
> 
> We could opt for assigning PRs automatically using filename patterns. This
> can be done rather easily and it's already done in various projects e.g.,
> Hive [1].
> 
> What do you think of putting automatic assignment in place for Calcite?
> Who is willing to put their name in the list? At this point I am not
> expecting that people in the list will review everything they are assigned
> on but maybe they can help out by pointing to other people or simply
> setting up the expectations about the PR getting reviewed.
> 
> Best,
> Stamatis
> 
> [1]
> https://github.com/apache/hive/commit/2bd6a9d63c28e5cb9bcc44115262d565cdc2bb90
> 
> 
> On Sat, Feb 5, 2022 at 5:52 PM Jing Zhang  wrote:
> 
>> Hi, Julian
>> There is no doubt that you are the most prolific contributors on the
>> project.
>> People often hope to hear professional advice from you because you are the
>> Calcite project authority.
>> However people may didn't realize that you are suffering from more and more
>> requests. I want to sincerely apologize because I often disturb you too.
>> I'm really glad to hear your feelings.
>> At the same time, I want to thank you because I got many professional
>> advices from you. The suggestions are very helpful.
>> 
>> Back to this topic, having an efficient mechanism to merge contributors' PR
>> is very important to the long-term development of open source projects.
>> I would like to share my thoughts, hope it helps.
>> 1. It might helpful to know which members are proficient in which modules.
>> For example, introduce each member's familiar module on the website[1].
>> There may be many requests sent to other members.
>> 2. For some modules, perhaps only very few members are familiar. (For
>> example, when it comes to modifying the parser grammar, someone may want to
>> hear from Julian, and when it comes to hints, someone may want to hear from
>> Danny.) Maybe it's just my bias. But if this is the real situation, is it
>> possible to develop more than one members on each module?
>> 3. It could be helpful to assign reviewers for a new pull request.
>> 
>> [1] https://calcite.apache.org/community/#project-members
>> 
>> Best,
>> Jing Zhang
>> 
>> Julian Hyde  于2022年2月4日周五 02:18写道:
>> 
>>> I make many contributions to this project, in the form of code,
>>> answering questions, leading design discussions, and clarifying bugs
>>> and feature requests. I review more changes than any other project
>>> member. My reward is that I am pestered, daily, with people pleading
>>> for me to review their changes.
>>> 
>>> It's moderately annoying for me - I just delete the emails (because I
>>> have 200 other emails to delete before I can start productive work).
>>> But it's awful for both the contributors and the project.
>>> 
>>> Getting PRs reviewed is this project's biggest problem, and has been for
>>> years.
>>> 
>>> Many of you are team leads, engineering managers, directors of
>>> engineering. This is a process problem. Solving problems like this is
>>> what you do. Fix it.
>>> 
>>> Julian
>>> 
>> 



Re: Invalid IntervalSQLType

2022-02-10 Thread Julian Hyde
There are two kinds of intervals (in Calcite and in standard SQL). 
Days-hours-minutes-seconds-millseconds intervals, and year-month intervals. The 
former are represented internally in milliseconds. The latter are represented 
in months.

This is one area where Postgres does things differently from the SQL standard.


> On Feb 10, 2022, at 3:48 AM, Chathura Widanage  
> wrote:
> 
> Thanks, Stamatis. Below is the original SQL query.
> 
> select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty,
> sum(l_extendedprice) as sum_base_price, sum(l_extendedprice*(1-l_discount))
> as sum_disc_price, sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as
> sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price,
> avg(l_discount) as avg_disc, count(*) as count_order from lineitem where
> l_shipdate <= date '1998-12-01' - interval '90' day group by l_returnflag,
> l_linestatus order by l_returnflag, l_linestatus
> 
> Based on the shared line of code, even a month should be represented in
> milis right? But when the below query is transformed it shows months in
> months.
> 
> with revenue (suplier_no, total_revenue) as ( select l_suppkey,
> sum(l_extendedprice * (1-l_discount)) from lineitem where l_shipdate >=
> date '1996-01-01' and l_shipdate < date '1996-01-01' + interval '3' month
> group by l_suppkey ) select s_suppkey, s_name, s_address, s_phone,
> total_revenue from supplier, revenue where s_suppkey = suplier_no and
> total_revenue = ( select max(total_revenue) from revenue ) order by
> s_suppkey
> 
> AND(>=($1, 1993-07-01 00:00:00), <($1, CAST(+(1993-07-01, 3:INTERVAL
> MONTH)):TIMESTAMP(0) NOT NULL))
> 
> Thanks for the pointer Stamatis, I'll see whether there is something to do
> with the RexSimplify/RexExecutor.
> 
> Regards,
> Chathura
> 
> 
> On Thu, Feb 10, 2022 at 5:07 PM Stamatis Zampetakis 
> wrote:
> 
>> Hi Chathura,
>> 
>> It is difficult to reason about correctness without having the actual SQL
>> query at hand.
>> 
>> The fact that you have milliseconds is not by itself a problem and has to
>> do with the way Calcite internally represents intervals (see comment in
>> [1]).
>> 
>> Also from the examples you provided the behavior in 1.29.0 does not seem to
>> be an additional transformation rather than a missing simplification
>> (constant reduction). I am not sure if this is intentional or not but I
>> guess you can have a look at the changes landed around
>> RexSimplify/RexExecutor.
>> 
>> Best,
>> Stamatis
>> 
>> [1]
>> 
>> https://github.com/apache/calcite/blob/812e3e98eae518cf85cd1b6b7f055fb96784a423/core/src/main/java/org/apache/calcite/rex/RexLiteral.java#L357
>> 
>> 
>> On Thu, Feb 10, 2022 at 8:02 AM Chathura Widanage <
>> chathurawidan...@gmail.com> wrote:
>> 
>>> Hi community,
>>> 
>>> I'm comparing two rel expressions generated by calcite 1.25.0 and 1.29.0
>>> and noticed there is an invalid IntervalSQLType plugged into the query.
>>> 
>>> <=($6, 1998-09-02 00:00:00) : Calcite 1.25.0
>>> vs
>>> <=($6, CAST(-(1998-12-01, 777600:INTERVAL DAY)):TIMESTAMP(0) NOT
>> NULL)
>>> :
>>> Calcite 1.29.0
>>> 
>>> 777600 is 90 days in milliseconds, but the IntervalSQLType/value
>>> combination is invalid.
>>> 
>>> Could you please let me know whether this could be a bug and whether
>> there
>>> an option to prevent such transformations at all?
>>> 
>>> Regards,
>>> Chathura
>>> 
>>> PS: This comes on queries from tpch benchmark and invalid conversion is
>>> from tpch-01.
>>> 
>>> I'm seeing similar conversions in other queries, but they seem to be
>>> correct, but feels this transformation is redundant.
>>> 
>>> AND(>=($1, 1993-07-01 00:00:00), <($1, 1993-10-01 00:00:00))
>>> vs
>>> AND(>=($1, 1993-07-01 00:00:00), <($1, CAST(+(1993-07-01, 3:INTERVAL
>>> MONTH)):TIMESTAMP(0) NOT NULL))
>>> 
>> 



Re: Why are nested aggregations illegal? Best alternatives?

2022-02-10 Thread Julian Hyde
Yes, if you want to do multiple layers of aggregation, use CTEs (WITH) or 
nested sub-queries. For example, the following is I believe valid standard SQL, 
and actually computes something useful:

  WITH q1 AS
   (SELECT deptno, job, AVG(sal) AS avg_sal
FROM emp
GROUP BY deptno, job)
  WITH q2 AS
   (SELECT deptno, AVG(avg_sal) AS avg_avg_sal
FROM q1
GROUP BY deptno)
  SELECT AVG(avg_avg_sal)
  FROM q2
  GROUP BY ()

(You can omit the “GROUP BY ()” line, but I think it makes things clearer.)

Julian



> On Feb 10, 2022, at 12:17 PM, Justin Swanhart  wrote:
> 
> I wish you could unsend emails :)  Answering my own question, no, because
> that would return three rows with the average :D
> 
> On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart  wrote:
> 
>> Just out of curiosity, is the second level aggregation using AVG in a
>> window context?  It the frame is the whole table and it aggregates over it?
>> 
>> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart 
>> wrote:
>> 
>>> That is really neat about Oracle.
>>> 
>>> The alternative in general is to use a subquery:
>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>> becomes
>>> select avg(the_avg)
>>> from (select avg(sal) from emp group b deptno) an_alias;
>>> 
>>> or
>>> 
>>> with the_cte as (select avg(sal) x from emp group by deptno)
>>> select avg(x) from the_cte;
>>> 
>>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde 
>>> wrote:
>>> 
>>>> Some databases, e.g. Oracle, allow TWO levels of nesting:
>>>> 
>>>> SELECT avg(sal) FROM emp GROUP BY deptno;
>>>> 
>>>> AVG(SAL)
>>>> 
>>>> 1,566.67
>>>> 2,175.00
>>>> 2,916.65
>>>> 
>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>>> 
>>>> AVG(SUM(SAL))
>>>> =
>>>>9,675
>>>> 
>>>> The first level aggregates by department (returning 3 records), and the
>>>> second level computes the grand total (returning 1 record). But that is an
>>>> exceptional case.
>>>> 
>>>> Generally, any expression in the SELECT or HAVING clause of an aggregate
>>>> query is either ‘before’ or ‘after’ aggregation. Consider
>>>> 
>>>>  SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b
>>>>  FROM t
>>>>  GROUP BY t.x
>>>> 
>>>> The expressions “t.y” and “t.y + 3” occur before aggregation; “t.x”,
>>>> “t.x + 1”, “SUM(t.y + 3)” and “2 + SUM(t.y + 3)” occur after aggregation.
>>>> SQL semantics rely heavily on this stratification. Allowing an extra level
>>>> of aggregation would mess it all up.
>>>> 
>>>> Julian
>>>> 
>>>> 
>>>> 
>>>>> On Feb 10, 2022, at 9:45 AM, Justin Swanhart 
>>>> wrote:
>>>>> 
>>>>> This is a SQL limitation.
>>>>> 
>>>>> mysql> select sum(1);
>>>>> ++
>>>>> | sum(1) |
>>>>> ++
>>>>> |  1 |
>>>>> ++
>>>>> 1 row in set (0.00 sec)
>>>>> 
>>>>> mysql> select sum(sum(1));
>>>>> ERROR  (HY000): Invalid use of group function
>>>>> 
>>>>> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray 
>>>> wrote:
>>>>> 
>>>>>> Went to test this query out and found that it can't be performed:
>>>>>> 
>>>>>> SELECT
>>>>>>   JSON_OBJECT(
>>>>>>   KEY 'users'
>>>>>>   VALUE JSON_ARRAYAGG(
>>>>>>   JSON_OBJECT(
>>>>>>   KEY 'name' VALUE "users"."name",
>>>>>>   KEY 'todos' VALUE JSON_ARRAYAGG(
>>>>>>   JSON_OBJECT(
>>>>>>   KEY 'description' VALUE "todos"."description"
>>>>>>   )
>>>>>>   )
>>>>>>   )
>>>>>>   )
>>>>>>   )
>>>>>> FROM
>>>>>>   "users"
>>>>>> LEFT OUTER JOIN
>>>>>>   "todos" ON "users"."id" = "todos"."user_id";
>>>>>> 
>>>>>> Checking the source, seems this is a blanket policy, not a
>>>>>> datasource-specific thing.
>>>>>> From a functional perspective, it doesn't feel like it's much
>>>> different
>>>>>> from JOINs
>>>>>> But I don't understand relational theory or DB functionality in the
>>>> least,
>>>>>> so I'm not fit to judge.
>>>>>> 
>>>>>> Just curious why Calcite doesn't allow this
>>>>>> 
>>>> 
>>>> 



Re: ARRAY_CONCAT does not work

2022-02-10 Thread Julian Hyde
Let’s continue conversation in 
https://issues.apache.org/jira/browse/CALCITE-4999 
. I already pushed back 
there on the idea that ARRAY_CONCAT’s type derivation was wrong, and Dmitry 
conceded.


> On Feb 10, 2022, at 5:08 AM, Michael Mior  wrote:
> 
> Exactly. I understand the problem is not CHAR(1) vs CHAR(7), but the record
> type. That is the point I was trying to make.
> --
> Michael Mior
> mm...@apache.org
> 
> 
> Le jeu. 10 févr. 2022 à 07:47, Dmitry Sysolyatin 
> a écrit :
> 
>> Michael, the problem is not because CHAR(1) and CHAR(7). Calcite can derive
>> common type in this case = CHAR(7) and all will work ok.
>> 
>> The problem is that one type is [ ARRAY] and another [
>> ARRAY]. I see two options for resolving this problem:
>> 
>> 1. Allow casting scalar type to RecordType with one field. I described it
>> inside https://issues.apache.org/jira/browse/CALCITE-4999
>> 2. Modify `ARRAY` function in the way that it will return ARRAY of scalar
>> if subquery is used as argument. Like Postgres does -
>> 
>> https://www.postgresql.org/docs/14/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS
>> 
>> On Thu, Feb 10, 2022 at 2:02 PM Michael Mior  wrote:
>> 
>>> The two types in your example are incompatible. One is an array of
>> CHAR(1).
>>> The other is an array of records, each with a single CHAR(7) field.
>>> --
>>> Michael Mior
>>> mm...@apache.org
>>> 
>>> 
>>> Le ven. 4 févr. 2022 à 11:27, Dmitry Sysolyatin >> 
>>> a
>>> écrit :
>>> 
 So, the previous case started to work. But I faced with another issue.
>>> This
 query does not work:
 
 ```
 "SELECT ARRAY_CONCAT(ARRAY['1', '2'], array(select 'toast.' || x from
 unnest(ARRAY['1','2']) x))"
 ```
 
 ```
 java.lang.IllegalArgumentException: Cannot infer return type for
 ARRAY_CONCAT; operand types: [CHAR(1) ARRAY, RecordType(CHAR(7) EXPR$0)
 ARRAY]
 ```
 
 I created an issue -
>> https://issues.apache.org/jira/browse/CALCITE-4999
 and
 PR - https://github.com/apache/calcite/pull/2712
 
 
 On Fri, Feb 4, 2022 at 2:51 PM Dmitry Sysolyatin <
>>> dm.sysolya...@gmail.com>
 wrote:
 
> Actually,
> 
>>> `SqlParser.Config.DEFAULT.withConformance(SqlConformanceEnum.BIG_QUERY)`
> helped, thanks !
> I will try to understand why it doesn't work with BABEL conformance
> 
> On Fri, Feb 4, 2022 at 2:35 PM Dmitry Sysolyatin <
 dm.sysolya...@gmail.com>
> wrote:
> 
>> How SqlParser config can help? If I understood correctly, parser
>>> config
>> affects only parser.parse stage. But code fails at the validation
>>> phase
>> 
>> Also, I would like to use ARRAY_CONCAT function with postgres
>> dialect.
 My
>> parser config at the moment:
>> 
>> ```
>> val parserConfig: SqlParser.Config =
>> PostgresqlSqlDialect.DEFAULT
>> .configureParser(SqlParser
>> .config()
>> // It is needed in order to parse PG "!~" operator
>> .withParserFactory(SqlBabelParserImpl.FACTORY)
>> )
>> .withConformance(SqlConformanceEnum.BABEL)
>> ```
>> 
>> On Fri, Feb 4, 2022 at 2:26 PM Thomas Rebele
>>>  
>> wrote:
>> 
>>> Hello,
>>> 
>>> SqlParserTest might help you. It has some checks related to
>>> BIG_QUERY.
>>> Maybe a
>>> 
>>> 
 
>>> 
>> .parserConfig(SqlParser.Config.DEFAULT.withConformance(SqlConformanceEnum.BIG_QUERY))
>>> could fix the problem.
>>> 
>>> Cordialement / Best Regards,
>>> *Thomas Rebele, PhD* | R Developer | Germany | www.tibco.com
>>> 
>>> 
>>> On Fri, Feb 4, 2022 at 12:57 PM Dmitry Sysolyatin <
>>> dm.sysolya...@gmail.com>
>>> wrote:
>>> 
 Hi!
 I have a problem with ARRAY_CONCAT operator.
 
 The following code failed with the exception ''No match found for
>>> function
 signature array_concat(, )".
 ```
 val config = Frameworks.newConfigBuilder()
 .parserConfig(parserConfig)
 .defaultSchema(CalciteSchema.createRootSchema(false,
>>> false).plus())
  .operatorTable(SqlOperatorTables.chain(
  SqlStdOperatorTable.instance(),
 
 SqlLibraryOperatorTableFactory.INSTANCE.getOperatorTable(
SqlLibrary.BIG_QUERY
   )))
 .programs(Programs.standard())
 .build()
 val planner = Frameworks.getPlanner(config)
 
 val query = "SELECT ARRAY_CONCAT(ARRAY[1, 2], ARRAY[2, 3])"
 val parsedSqlNode = planner.parse(query)
 planner.validate(parsedSqlNode)
 ```
 
 Am I doing something wrong or it is a bug?
 
>>> 
>> 
 
>>> 
>> 



Re: Why are nested aggregations illegal? Best alternatives?

2022-02-10 Thread Julian Hyde
Some databases, e.g. Oracle, allow TWO levels of nesting:

 SELECT avg(sal) FROM emp GROUP BY deptno;

 AVG(SAL)
 
 1,566.67
 2,175.00
 2,916.65

SELECT avg(avg(sal)) FROM emp GROUP BY deptno;

AVG(SUM(SAL))
=
9,675

The first level aggregates by department (returning 3 records), and the second 
level computes the grand total (returning 1 record). But that is an exceptional 
case.

Generally, any expression in the SELECT or HAVING clause of an aggregate query 
is either ‘before’ or ‘after’ aggregation. Consider

  SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b 
  FROM t
  GROUP BY t.x

The expressions “t.y” and “t.y + 3” occur before aggregation; “t.x”, “t.x + 1”, 
“SUM(t.y + 3)” and “2 + SUM(t.y + 3)” occur after aggregation. SQL semantics 
rely heavily on this stratification. Allowing an extra level of aggregation 
would mess it all up.

Julian



> On Feb 10, 2022, at 9:45 AM, Justin Swanhart  wrote:
> 
> This is a SQL limitation.
> 
> mysql> select sum(1);
> ++
> | sum(1) |
> ++
> |  1 |
> ++
> 1 row in set (0.00 sec)
> 
> mysql> select sum(sum(1));
> ERROR  (HY000): Invalid use of group function
> 
> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray  wrote:
> 
>> Went to test this query out and found that it can't be performed:
>> 
>> SELECT
>>JSON_OBJECT(
>>KEY 'users'
>>VALUE JSON_ARRAYAGG(
>>JSON_OBJECT(
>>KEY 'name' VALUE "users"."name",
>>KEY 'todos' VALUE JSON_ARRAYAGG(
>>JSON_OBJECT(
>>KEY 'description' VALUE "todos"."description"
>>)
>>)
>>)
>>)
>>)
>> FROM
>>"users"
>> LEFT OUTER JOIN
>>"todos" ON "users"."id" = "todos"."user_id";
>> 
>> Checking the source, seems this is a blanket policy, not a
>> datasource-specific thing.
>> From a functional perspective, it doesn't feel like it's much different
>> from JOINs
>> But I don't understand relational theory or DB functionality in the least,
>> so I'm not fit to judge.
>> 
>> Just curious why Calcite doesn't allow this
>> 



Re: Prevent RexSimplify from simplifying operators to SEARCH

2022-02-08 Thread Julian Hyde
It is not possible today, but it is a reasonable feature request.

One caveat. If conversion to SEARCH is disabled then many simplifications will 
not occur.

Julian


> On Feb 8, 2022, at 12:38 AM, Chathura  wrote:
> 
> Hello all,
> 
> Can we configure RexSimplify to exclude certain operators such as SEARCH? The 
> requirement to prevent the builtin rules and custom rules from converting 
> "t1.age>=1 AND t1.age<5" to AND(<=($0, 1), >($0, 5)) and then convert it to 
> SEARCH during the RexSimplify#simplifyXX call.
> 
> Regards,
> Chathura
> 



Re: Using TIMESTAMPDIFF() with RelBuilder

2022-02-04 Thread Julian Hyde
We don’t really want TIMESTAMPDIFF to work because it’s not standard (although 
it is exists in various dialects).

Datetime minus is preferred. It’s more powerful because it accepts an interval 
value. I haven’t tried that either, but I imagine you would create an interval 
literal and then pass the result to the ‘-‘ operator.

Combining convertlets with RelBuilder is an interesting idea. If you can get 
that working without adding too much cruft to RelBuilder we should consider it. 
(I have a very low bar for cruft in RelBuilder because it’s massive - 5,000 
lines long - and, as a builder for the core Rel and Rex languages, there should 
only be one way to do things.)

Julian



> On Feb 2, 2022, at 3:07 PM, Thomas D'Silva  wrote:
> 
> Hi,
> 
> I get an error while using TIMESTAMPDIFF with RelBuilder. For this example
> code
> 
>  @Test void testRun2() throws Exception {
>// Equivalent SQL:
>//   SELECT TIMESTAMP(SECOND, HIREDATE, TIMESTAMP'1970-01-01') FROM EMP
>final RelBuilder builder = RelBuilder.create(config().build());
>RelNode root =
>builder.scan("EMP")
>.project(
>builder.getRexBuilder().makeCall(
>SqlStdOperatorTable.TIMESTAMP_DIFF,
>builder.getRexBuilder().makeFlag(TimeUnit.SECOND),
>builder.field("HIREDATE"),
> 
> builder.getRexBuilder().makeTimestampLiteral(TimestampString.fromMillisSinceEpoch(0),
> 0)
>)
>)
>.build();
> 
>try (PreparedStatement preparedStatement = RelRunners.run(root)) {
>  String s = CalciteAssert.toString(preparedStatement.executeQuery());
>  final String result = "";
>  assertThat(s, is(result));
>}
>  }
> 
> I get an exception
> 
> Suppressed: java.lang.RuntimeException: cannot translate call
> TIMESTAMPDIFF($t8, $t4, $t9)
> at
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitCall(RexToLixTranslator.java:1157)
> at
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitCall(RexToLixTranslator.java:98)
> 
> When a sql string is executed it appears that the TIMESTAMPDIFF function is
> converted to a difference of two timestamps (
> https://github.com/apache/calcite/blob/a03586c26e1888daebabb271b603fd2871d6a359/core/src/main/java/org/apache/calcite/sql2rel/StandardConvertletTable.java#L1853
> ).
> 
> Is there a way to use the TIMESTAMPDIFF() with RelBuilder or would I have
> to do a similar translation as is done in TimestampDiffConvertlet?
> 
> Thank you,
> Thomas



[DISCUSS] Assigning reviewers

2022-02-03 Thread Julian Hyde
I make many contributions to this project, in the form of code,
answering questions, leading design discussions, and clarifying bugs
and feature requests. I review more changes than any other project
member. My reward is that I am pestered, daily, with people pleading
for me to review their changes.

It's moderately annoying for me - I just delete the emails (because I
have 200 other emails to delete before I can start productive work).
But it's awful for both the contributors and the project.

Getting PRs reviewed is this project's biggest problem, and has been for years.

Many of you are team leads, engineering managers, directors of
engineering. This is a process problem. Solving problems like this is
what you do. Fix it.

Julian


Re: MaterializedViewRule is not applied for queries on aggregate views that truncate timestamp

2022-02-03 Thread Julian Hyde
These kinds of roll ups are very common. They have simple mathematical 
properties. There is a simplifying abstraction to be found here. Please find it 
and use it. 

Julian

> On Feb 2, 2022, at 7:47 PM, Thomas D'Silva  wrote:
> 
> Hi Stamatis,
> 
> Thanks for your response. Your suggestion of having use case specific rules
> that translate time ranges that align on the rollup of the materialized
> view makes sense.
> However if the time range doesn't align for eg. if we have a view that
> rolls up to the minute and the query predicate is "WHERE  ts >=
> timestamp'2018-01-01 00:01:30' AND ts < timestamp'2018-01-01 00:05:00'" we
> wouldn't be able to rewrite it to apply the FLOOR() on the left side of the
> comparators.
> In this case I was trying to use the MaterializedViewRule's union rewriting
> feature to query the view for the range "ts >= timestamp'2018-01-01
> 00:02:00' AND ts < timestamp'2018-01-01 00:05:00'" and query the table for
> the range "ts >= timestamp'2018-01-01 00:01:30' AND ts <
> timestamp'2018-01-01 00:02:00'". I would appreciate any suggestions on how
> to do this.
> 
> Thank you,
> Thomas
> 
> 
>> On Wed, Feb 2, 2022 at 3:28 AM Stamatis Zampetakis 
>> wrote:
>> 
>> Hi Thomas,
>> 
>> I haven't looked into the implementation details of the rule but I assume
>> the rewriting does not kick in because there is no easy way to relate the
>> "ts" column that is used in the query with the "floor(ts to minute)" column
>> that is materialized.
>> 
>> I assume that if your query is/was similar to the one below you would get a
>> rewriting.
>> 
>> SELECT   floor(ts to minute), count(*)
>> FROM events
>> WHERE   floor(ts to minute) >= timestamp'2018-01-01 00:01:00' AND floor(ts
>> to minute) < timestamp'2018-01-01 00:05:00'
>> GROUP BY eventid, floor(ts TO minute)
>> 
>> So far I don't see a limitation/problem with the MaterializedViewRule but
>> rather use-case specific rules you imply they always hold.
>> 
>> If I understand correctly you say that the following are kind of
>> equivalent:
>> 
>> WHERE ts >= timestamp'2018-01-01 00:01:00' AND ts <
>> timestamp'2018-01-01 00:05:00'
>> WHERE floor(ts to minute) >= timestamp'2018-01-01 00:01:00' AND floor(ts to
>> minute) < timestamp'2018-01-01 00:05:00'
>> WHERE floor(ts to minute) >= floor(timestamp'2018-01-01 00:01:00' to
>> minute) AND floor(ts to minute) < floor(timestamp'2018-01-01 00:05:00' to
>> minute)
>> 
>> and I think they are based on the given timestamp literals.
>> 
>> Maybe instead of changing the rules performing the view rewriting what you
>> should do is introduce other use-case specific rules that can be applied to
>> the input queries and bring them to the desired form.
>> 
>> Best,
>> Stamatis
>> 
>> On Fri, Jan 28, 2022 at 10:12 PM Thomas D'Silva >> 
>> wrote:
>> 
>>> Xurenhe,
>>> 
>>> Thank you for the detailed response and pointers. The use case you
>>> describe for OLAP streaming queries is similar to what our team is trying
>>> to accomplish.
>>> We use calcite to query data stored in kudu [1]. We use kafka streams to
>>> maintain materialized views that contain pre-aggregated data that rolls
>>> data up to the daily or hourly granularity. One of the problems we face
>> is
>>> when a user submits a query where the time range does not align to the
>> day
>>> or hour boundary.  In this case we are unable to use the materialized
>> views
>>> and have to query the raw event which is not performant because the table
>>> contains many more rows compared to the view.
>>> The reason I wanted to modify the MaterializedViewRule is to use the
>> "Union
>>> rewriting with aggregate" ability [2] this would allow us to use both the
>>> view and the table to answer such queries in the most efficient manner. I
>>> looked at the code in SubstitutionVisitor but I don't think there are
>>> existing rules there that do union rewriting with predicates.
>>> 
>>> The following is an example of the query we are trying to optimize. For
>> the
>>> materialized view
>>> SELECT   eventid,  floor(ts to minute), count(*) as cnt
>>> FROM events
>>> GROUP BY eventid, floor(ts TO minute)
>>> 
>>> and query
>>> SELECT floor(ts to minute), count(*)
>>> FROM events
>>> WHEREts >= timestamp'2018-01-01 00:01:30' AND ts <
>> timestamp'2018-01-01
>>> 00:05:00'
>>> GROUP BY eventid, floor(ts TO minute)
>>> 
>>> We would like to generate a plan that queries the table for the time
>> range
>>> [2018-01-01 00:01:30, 2018-01-01 00:02:00) and view for the time range
>>> [2018-01-01 00:02:-0, 2018-01-01 00:05:00) similar to
>>> 
>>> EnumerableCalc(expr#0..2=[{inputs}], EXPR$0=[$t1], EXPR$1=[$t2])
>>>  EnumerableAggregate(group=[{0, 1}], EXPR$1=[$SUM0($2)])
>>>EnumerableUnion(all=[true])
>>>  EnumerableAggregate(group=[{0, 1}], EXPR$1=[COUNT()])
>>>EnumerableCalc(expr#0..1=[{inputs}], expr#2=[FLAG(MINUTE)],
>>> expr#3=[FLOOR($t1, $t2)], expr#4=[Sarg[[2018-01-01 00:01:30..2018-01-01
>>> 00:02:00)]], expr#5=[SEARCH($t1, $t4)], 

Re: Simplification of "x != x"

2022-02-03 Thread Julian Hyde
Using SQL syntax (which is less confusing, since we are talking about shortcut 
evaluation)

  UNKNOWN AND (x IS NULL)

is equivalent to

 (x IS NULL) AND UNKNOWN

Just write out the truth table. If you get different results, are you perhaps 
incorrectly using some kind of shortcut evaluation?

Julian

> On Feb 3, 2022, at 08:29, Abhishek Agarwal  
> wrote:
> 
> Hello,
> While debugging a druid query that involves an operation such as "A <> A",
> I found out that in RexSimplify, such an expression is converted to "null
> && isnull(A)". That seems wrong since it will always evaluate to null even
> if A is not null in some row. The right conversion should be "isnull(A) &&
> null" that will return "false" is A is not null and "null" (aka unknown) is
> A is null.
> 
> is there a way to change this behavior without code change?


Re: Understanding UDF annotations

2022-02-02 Thread Julian Hyde
Yes, the UDF annotations are intended to allow better optimizations.

I’m sure there are bugs. If a query with a UDF is not being optimized, log a 
bug. The fix is probably at the level of a particular RelOptRule, and therefore 
the test case would be in RelOptRulesTest.

But I don’t think that each rule needs to look for annotations. The rules 
should* use methods on the RexNode (expression). If the expression is (or 
contains) a call to a UDF then the RexCall to that UDF will use the annotations.

* By ’should’ I mean ‘in an ideal world’, not ‘I believe that it is currently 
the case that…’.

I see that ReduceExpressionsRule has an inner class ReducibleExprLocator which 
uses calls such as SqlOperator.isDynamicFunction(). That functionality should 
(again, in an ideal world) be moved out of ReduceExpressionsRule in a way that 
other code can use it. I don’t know how.

For years I have been begging for someone to take ownership of this issue. This 
would include rigorous definitions of the terms ‘dynamic’ and ‘deterministic’. 
See https://issues.apache.org/jira/browse/CALCITE-4424 
 and 
https://issues.apache.org/jira/browse/CALCITE-2823 
. Perhaps also ‘pure’ and 
’strict’ (which are accepted terms in the programming language theory 
community).

Julian



> On Feb 1, 2022, at 8:49 AM, X L  wrote:
> 
> Hi,
> 
> I was wondering if my understanding about UDFs' annotations in
> Calcite is correct. For instance, I notice there is a `@Deterministic`
> label to specify a deterministic UDF. Then, relevant query rewrite rules
> (e.g., ReduceExpressionsRule) need to check the existence of this label
> (e.g., isDeterministic()) in order to decide whether the rule is applicable
> (without changing the query semantics).
> I do see that is a way to leverage annotations when optimizing queries that
> contain UDFs, and it seems to require adding such annotation checks in
> almost every query rewrite rules.
> 
> Thanks and Regards,
> Xinyu



Re: Getting child Expressions on latest version of Calcite.

2022-02-02 Thread Julian Hyde
You don’t need to override RexShuttle.apply(List) (and in fact I don’t 
know whether it will be called). Create your own shuttle and override the 
methods for the sub-classes of RexNode that are of interest, e.g. the following 
prints all input refs

  RelNode r;
  r.accept(
  new RexShuttle() {
@Override RexNode visitInputRef(RexInputRef ref) {
  System.out.println(“input ref “ + ref);
}
  });

One problem of that approach is that it doesn’t give you the top-level 
expression or expressions, it goes deep.

If you know you are looking at a particular kind of operator (e.g. Project, 
Filter, Join, Sort), call the particular method that returns the expressions 
(Project.getProjects(), Filter.getCondition(), etc.).

Julian


> On Feb 2, 2022, at 7:09 AM, Chathura Widanage  
> wrote:
> 
> Hi,
> 
> I'm upgrading a set of custom rules from 1.23.0 to the latest. We have a
> custom rule that uses org.apache.calcite.rel.RelNode#getChildExps and seems
> it's deprecated in latest versions. Although it suggests to use
> org.apache.calcite.rel.RelNode#accept(org.apache.calcite.rex.RexShuttle), I
> couldn't figureout a way to get a List out since
> org.apache.calcite.rex.RexShuttle#apply(java.util.List) is final. Could
> you please let me know how I can replace
> org.apache.calcite.rel.RelNode#getChildExps?
> 
> Regards,
> Chathura



Re: Question about creating a SqlBasicCall

2022-01-29 Thread Julian Hyde
Logically, your operator simply has three arguments. How the call with those 
arguments is converted back into SQL - whether it generates a comma or a space 
between arguments 1 and 2 - is a matter of syntax. It looks like your operator 
is unparsing calls using the default operator syntax, which uses commas, and 
you should override how it unparses itself.  

> On Jan 29, 2022, at 1:28 AM, Zhe Hu  wrote:
> 
> Hi, community.
> 
> 
> I have one small question about how to create a SqlBasicCall in Parser.jj 
> correctly.
> Take TimestampAddFunctionCall as an example, we create it by:
>SqlStdOperatorTable.TIMESTAMP_ADD.createCall(s.end(this), args)
> As a result, call like “TIMESTAMPADD(arg0, arg1, arg2)” will be formed.
> 
> 
> Here is my question:
> How does the SqlParser assemble the operands with COMMA correctly?
> Since the result is not “TIMESTAMPADD(arg0 arg1, arg2)”, which loses the 
> former COMMA; or “TIMESTAMPADD(arg0 arg1, arg2)”、“TIMESTAMPADD(arg0 arg1 
> arg2)”.
> 
> 
> I’ve noticed that SqlParserPos is involved, but I’m not sure if it’s relevant.
> 
> 
> Any feedback would be awesome.
> 
> 
> Best,
> ZheHu



Re: Calcite can't generate code for one of the query

2022-01-28 Thread Julian Hyde
I guess you’re using ReflectiveSchema. I don’t think we support columns of 
(Java) type Character or char. You should convert relkind to a String and I 
think things will be better.

> On Jan 28, 2022, at 4:44 AM, Dmitry Sysolyatin  
> wrote:
> 
> I found a way how to make the query work. But I don't understand the reason:
> If `relkind` is JavaType(char) (primitive type) then the query does not work
> If `relkind` is JavaType(java.lang.Character) then the query works.
> 
> 
> On Fri, Jan 28, 2022 at 12:59 PM Dmitry Sysolyatin 
> wrote:
> 
>> Hi!
>> 
>> I am implementing a wrapper over calcite in order to use it like Postgres
>> server. But I have a problem with one of a query that Postgres driver sends
>> to the server.
>> 
>> The query:
>> 
>> SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, c.relname AS
>> TABLE_NAME,  CASE n.nspname ~ '^pg_' OR n.nspname = 'information_schema'
>> WHEN true THEN CASE  WHEN n.nspname = 'pg_catalog' OR n.nspname =
>> 'information_schema' THEN CASE c.relkind   WHEN 'r' THEN 'SYSTEM TABLE'
>> WHEN 'v' THEN 'SYSTEM VIEW'   WHEN 'i' THEN 'SYSTEM INDEX'   ELSE NULL
>> END  WHEN n.nspname = 'pg_toast' THEN CASE c.relkind   WHEN 'r' THEN
>> 'SYSTEM TOAST TABLE'   WHEN 'i' THEN 'SYSTEM TOAST INDEX'   ELSE NULL   END
>> ELSE CASE c.relkind   WHEN 'r' THEN 'TEMPORARY TABLE'   WHEN 'p' THEN
>> 'TEMPORARY TABLE'   WHEN 'i' THEN 'TEMPORARY INDEX'   WHEN 'S' THEN
>> 'TEMPORARY SEQUENCE'   WHEN 'v' THEN 'TEMPORARY VIEW'   ELSE NULL   END
>> END  WHEN false THEN CASE c.relkind  WHEN 'r' THEN 'TABLE'  WHEN 'p' THEN
>> 'PARTITIONED TABLE'  WHEN 'i' THEN 'INDEX'  WHEN 'S' THEN 'SEQUENCE'  WHEN
>> 'v' THEN 'VIEW'  WHEN 'c' THEN 'TYPE'  WHEN 'f' THEN 'FOREIGN TABLE'  WHEN
>> 'm' THEN 'MATERIALIZED VIEW'  ELSE NULL  END  ELSE NULL  END  AS
>> TABLE_TYPE, d.description AS REMARKS,  '' as TYPE_CAT, '' as TYPE_SCHEM, ''
>> as TYPE_NAME, '' AS SELF_REFERENCING_COL_NAME, '' AS REF_GENERATION  FROM
>> pg_catalog.pg_namespace n, pg_catalog.pg_class c  LEFT JOIN
>> pg_catalog.pg_description d ON (c.oid = d.objoid AND d.objsubid = 0)  LEFT
>> JOIN pg_catalog.pg_class dc ON (d.classoid=dc.oid AND
>> dc.relname='pg_class')  LEFT JOIN pg_catalog.pg_namespace dn ON
>> (dn.oid=dc.relnamespace AND dn.nspname='pg_catalog')  WHERE c.relnamespace
>> = n.oid  ORDER BY TABLE_TYPE,TABLE_SCHEM,TABLE_NAME
>> ```
>> 
>> The problem is that calcite can't generate code for WHEN CASE block (The
>> query works if remove WHEN CASE block).
>> 
>> ```
>> Caused by: java.lang.NoSuchMethodException:
>> org.apache.calcite.runtime.SqlFunctions.toChar(java.lang.Object)
>> ```
>> 
>> I wrote down the full exception description inside gist:
>> https://gist.github.com/dssysolyatin/168c97f6033dd68b96822ce0ab48f84c
>> 
>> Can someone point me to how to resolve this issue?
>> 



Re: Failed to import gradle

2022-01-27 Thread Julian Hyde
Can someone log a bug for this?

I don’t run into it personally. (I just click open on the directory. I never 
invoke ‘import gradle’.) But people shouldn’t have to edit their Gradle files.



> On Jan 27, 2022, at 6:55 AM, Gavin Ray  wrote:
> 
> It didn't work for me either.
> 
> I had to remove some of the Gradle tasks, related to checks IIRC.
> What I did that made it work was:
> 
> 1. Modify "gradle.properties" in the root of the repo, adding the following
> line:
> 
> skipAutostyle=true
> 
> 2. Modify every instance of
>"val skipAutostyle by props()"
>to
>   "val skipAutostyle by props(true)"
>in the Gradle build files
> 
> (This line exists in "build.gradle.kts" and "buildSrc/build.gradle.kts")
> 
> 3. Commented out the CheckStyle task in "core/build.gradle.kts"
> 
> //tasks.withType().configureEach {
> //mustRunAfter(versionClass)
> //mustRunAfter(javaCCMain)
> //mustRunAfter(javaCCTest)
> //}
> 
> 4. Commented out "releaseArtifacts" plugin in "release/build.gradle.kts"
> 
> //releaseArtifacts {
> //artifact(archiveTask)
> //}
> 
> 
> My git status looks like:
> 
> user@MSI:~/projects/calcite$ git status
>modified:   build.gradle.kts
>modified:   buildSrc/build.gradle.kts
>modified:   core/build.gradle.kts
>modified:   gradle.properties
>modified:   release/build.gradle.kts
> 
> Hope this helps =)
> 
> On Thu, Jan 27, 2022 at 4:03 AM Thomas Rebele 
> wrote:
> 
>> Hello,
>> 
>> the Calcite mailing list removes pictures. Could you please post the error
>> message in text form?
>> 
>> Cordialement / Best Regards,
>> *Thomas Rebele, PhD* | R Developer | Germany | www.tibco.com
>> 
>> 
>> On Thu, Jan 27, 2022 at 7:38 AM Aitozi  wrote:
>> 
>>> Hi community:
>>>I'm new to calcite community and gradle. When I followed the Howto.md
>>> to import the calcite project, It always failed with the following error.
>>> After some search in Google and calcite dev mail list, I stiil can't not
>>> figure it out. Is there anyone can help me out? thanks in advance
>>> 
>>> [image: image.png]
>>> The version of the intellij IDEA is 2021.1.3.
>>> 
>>> Atiozi.
>>> 
>> 



Re: How to handle the fact that different databases have different ideas of what a "db"/"schema" are?

2022-01-27 Thread Julian Hyde
Let’s not worry about the names, and say that some DBs have two namespace 
levels and others have just one.

Calcite’s word for a namespace is ’schema’. Calcite schemas are arranged in a 
hierarchy, like a filesystem, so there is no preferred depth. Any schema can 
contain both tables and (sub)schemas. So you can easily built a one- or 
two-level namespace structure, or whatever you want.

Calcite’s catalog has a single ‘root schema’ (analogous to the root directory, 
‘/‘ in file systems), and you can get to anything else from there.

In JDBC parlance, a a level 1 namespace is called ‘catalog’, and a level 2 
namespace is a ’schema’. If a DB has a one-level namespace then catalog will be 
null, or the empty string, or something.

If you’re running an Avatica JDBC server backed by a particular Calcite root 
schema, and you want your database to look like a one-level or two-level 
database, we probably don’t make it particularly easy.

Julian


> On Jan 27, 2022, at 7:25 AM, Gavin Ray  wrote:
> 
> My RDBMS experience is nearly exclusively Postgres
> While working on this project, I've made the assumption that the structure
> of a database is:
> 
> Database -> Schema -> Table
> 
> It turns out that this isn't accurate. In MySQL for instance, "Schema" is
> an alias for "DB".
> From the below StackOverflow answer, it seems like this is all over the
> place:
> 
> https://stackoverflow.com/a/7944489/13485494
> 
> I have a "CalciteSchemaManager" object which has a "rootSchema" to which
> all datasources are attached
> This "rootSchema" is used to generate the GraphQL API and types
> 
> It seems like I have two options, and I'm not sure which is a better design:
> 
> 1. Force all datasources to conform to (Database -> Schema -> Table)
> 
> This means that adding a new MySQL database, would generate ("mysql_db" ->
> "root" (fake schema) -> "some_table")
> Adding a CSV schema too, would be something like ("csv_datasource" ->
> "root" -> "some_csv_file")
> 
> 2. Have an irregular data shape. Datasources can be of arbitrary sub-schema
> depth.
> 
> Example Postgres: ("pg_db_1" -> "public" -> "user")
> Example MySQL:   ("mysql_db_1" -> "user")
> Example CSV: ("some_csv_file") or maybe ("csv_source_1" -> "some_csv_file")
> 
> What do you folks think I ought to do?
> Thank you =)



Re: [DISCUSS] Refactoring tests

2022-01-25 Thread Julian Hyde
I think sub-classing and parameterizing tests are both valid approaches. I 
agree that during review of 4952, parameterizing looked like the best option. 
But you had to add metadataConfig as a a parameter to lots of methods, which 
caused a massive conflict with my changes, and paving over your changes was the 
right approach.

Parameterized tests are probably best when the all parameter values are 
“similar". But they wouldn’t work well if, say, the parameter is a particular 
parser implementation and one parser implementation lives in a project that 
Calcite cannot even see.

My ’test fixture’ approach works well for that scenario because in the 
dependent project you would sub-class the test to provide a modified fixture 
(with, say, a different parser and perhaps some other changes to configuration) 
and then the tests would all run on that fixture.

A compromise approach might be for each test method to accept a fixture as the 
one and only argument. Then you could change other aspects of the fixture 
without having to refactor every test method.

I did ensure that the same tests were being run before and after rebase. The 
biggest concern was whether the MetadataConfig is being set correctly in the 
tests. I had to create MetadataConfig.NOP and use it in 3 tests because 
otherwise the cluster’s metadata was being overwritten mid-way through the 
test. Different overloads of MetadataConfig.applyMetadata were called at 
different points within the test, and they would conflict. The NOP was a way to 
neutralize the conflicting calls. I’m not proud of that workaround, but I 
believe the necessary tests get run.

Julian


> On Jan 25, 2022, at 3:34 PM, Jacques Nadeau  wrote:
> 
>> 
>> In your fix for https://issues.apache.org/jira/browse/CALCITE-4952 <
>> https://issues.apache.org/jira/browse/CALCITE-4952> you made
>> RelMetadataTest a parameterized test, so that each case could be called
>> with two metadata providers. I undid that, and made it into a test with a
>> protected fixture() method, so that a sub-class can run the same tests with
>> a different provider. I believe that the test coverage is the same, albeit
>> via different means.
> 
> 
> Hmm. I actually started with the subclass pattern but after feedback during
> review, Stamatis and I agreed that subclassing was an anti-pattern and thus
> I went through the "fun" process of refactoring to use parameterization.
> 
> In terms of confirming that the change to RelMetadataTest didn't disrupt
> coverage, any chance you ran a diff on test output pre/post to confirm that
> at least all the same test cases are running? (That's a really hard giant
> file diff to compare given all the refactoring.)
> 
> 
> 
> On Sat, Jan 22, 2022 at 9:41 PM Julian Hyde  wrote:
> 
>> Gavin, Thanks for the kind words.
>> 
>> All, I have now squashed and rebased onto latest master. The squashed
>> commit is
>> https://github.com/julianhyde/calcite/commit/cb59231a72e23be260b21670012af33c47c2610e.
>> I plan to merge to master on Monday.
>> 
>> Jacques, You may wish to carefully review my changes to RelMetadataTest.
>> In your fix for https://issues.apache.org/jira/browse/CALCITE-4952 <
>> https://issues.apache.org/jira/browse/CALCITE-4952> you made
>> RelMetadataTest a parameterized test, so that each case could be called
>> with two metadata providers. I undid that, and made it into a test with a
>> protected fixture() method, so that a sub-class can run the same tests with
>> a different provider. I believe that the test coverage is the same, albeit
>> via different means.
>> 
>> Julian
>> 
>> 
>>> On Jan 22, 2022, at 9:53 AM, Gavin Ray  wrote:
>>> 
>>> Thank you for doing this, after reading the overview these changes seem
>>> like they will make a number of things easier related to testing.
>>> Super useful too when you're trying to start building something with
>>> Calcite but don't know it well enough to figure out how to put all the
>>> pieces together yourself.
>>> 
>>> On Fri, Jan 21, 2022 at 3:57 PM Julian Hyde 
>> wrote:
>>> 
>>>> If it helps you review, I have created a ’summary’ document with a
>>>> description of the changes. It will become the commit message when I
>>>> squash, rebase, and merge to master.
>>>> 
>>>> 
>>>> 
>> https://github.com/julianhyde/calcite/blob/4885-test-fixtures/summary-of-calcite-4885.md
>>>> <
>>>> 
>> https://github.com/julianhyde/calcite/blob/4885-test-fixtures/summary-of-calcite-4885.md
>>> 
>>>> 
>>>> 
>>>> Julian
>>>> 
>>>

Re: Apache Calcite - Usage of Querable table/Querable/RawQueryable/etc interfaces

2022-01-25 Thread Julian Hyde
This reminds me that there’s an opportunity to make it easier to write 
adapters. There is a continuum of backends for adapters, from simple (such as 
CSV, which supports project and maybe filter) to powerful (such as 
Elasticsearch, which supports full relational algebra and then some). I am 
thinking about the latter case.

Let’s suppose we are writing an adapter for a database Foo which is 
approximately as powerful as SQL. Typically we write such adapters by creating 
a Foo convention, then a FooToEnumerableConverter, and then write rules to push 
each relational operator down through the membrane.

(Elasticsearch, Cassandra and Druid are all in this category, and all use the 
basic ‘membrane’ principle, but it is worrying that their adapters are so 
dissimilar.)

If Foo has ten relational operators, you need to write ten RelNode classes and 
ten rules. That is a considerable amount of code, which ends up looking like 
boilerplate. I wonder whether we can remove the need to create those 
Foo-specific classes and rules.  Could people just write a FooQuery relational 
expression that had a sequence of lightweight relational operations inside it?

Note that in MutableRel we have lightweight equivalents to RelNodes, for 
example MutableProject [1], and they could be repurposed.

The adapter would still have some hard work to do: translating RexNode to their 
expression language (splitting or rejecting expressions that they can’t 
handle), figuring out which sequences of relational operations are valid, and 
translating the FooQuery. But if we reduce the boilerplate, the adapter author 
can focus on the hard parts.

Julian

[1] 
https://github.com/apache/calcite/blob/4bc916619fd286b2c0cc4d5c653c96a68801d74e/core/src/main/java/org/apache/calcite/rel/mutable/MutableProject.java#L34

> On Jan 23, 2022, at 12:28 PM, Stamatis Zampetakis  wrote:
> 
> To the best of my knowledge there is no document getting into details about
> the Queryable interface etc. The best way to approach this as Gavin
> mentioned is to see how it is used in the project and outside of it.
> 
> One thing that I wanted to mention is that you can build a system using
> Calcite without using these interfaces at all. If you consider for instance
> the druid module in Calcite I think it doesn't rely on Queryable at all.
> 
> Best,
> Stamatis
> 
> 
> On Wed, Jan 19, 2022 at 4:00 PM Gavin Ray  wrote:
> 
>> My apologies, I didn't realize that the updated search wasn't public now
>> (it opened for me a few days ago)
>> 
>> Regular GH links:
>> https://github.com/search?q=rawqueryable+language%3AJava=code
>> 
>> Searching "CalciteConnection" across all repos, but removing
>> "apache/calcite" from the results:
>> 
>> https://github.com/search?q=CalciteConnection+-repo%3Aapache%2Fcalcite=code
>> 
>> 
>> 
>> On Tue, Jan 18, 2022 at 12:17 PM M Singh 
>> wrote:
>> 
>>> Thanks Gavin for your pointers.  I am in the waiting list for the search
>>> tool you mentioned.  I will continue to search on google in the
>> meanwhile.
>>> Mans
>>> 
>>> 
>>>On Tuesday, January 18, 2022, 08:44:31 AM EST, Gavin Ray <
>>> ray.gavi...@gmail.com> wrote:
>>> 
>>> Not to be dismissive, but one thing I've done that has been helpful for
>>> me,
>>> is to search all public repos for usage/examples of the thing I am trying
>>> to understand.
>>> For example, this "RawQueryable" does not appear to be utilized in any
>>> user-facing code on Github, so it may not be useful for you:
>>> 
>>> 
>> https://cs.github.com/?q=rawqueryable%20language%3AJava=All%20repos=
>>> 
>>> However, take something like "CalciteConnection" or "RelNode", try to
>>> remove the Calcite repo itself, and you'll get a lot of hits and usage
>>> examples:
>>> 
>>> 
>> https://cs.github.com/?scopeName=All+repos==CalciteConnection+%28NOT+path%3A%22calcite%2F%22%29
>>> 
>>> (I find that searching Scala repos usually gives more relevant results,
>> due
>>> to lots of Java results being duplicates)
>>> 
>>> This isn't nearly as good as getting a direct answer, but sometimes you
>>> might be able to find what you're looking for
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Jan 18, 2022 at 5:39 AM M Singh 
>>> wrote:
>>> 
 Hey Folks:
 Just wanted to see if you have any pointers to help me understand the
 queryable concepts.
 Thanks
   On Friday, January 14, 2022, 09:09:06 AM EST, M Singh
  wrote:
 
 Hi Folks:
 A few newbie questions -
 1. I wanted to find out what are the benefits of a queryable table (
 
>>> 
>> https://calcite.apache.org/javadocAggregate/org/apache/calcite/schema/QueryableTable.html
>> )
>>> and
 what are the pros and cons of using it.
 2. What is the role of Querable interface (
 
>>> 
>> https://calcite.apache.org/javadocAggregate/org/apache/calcite/linq4j/Queryable.html
>>> )
 ?3. There is a getExpression methods in queryabletable (
 
>>> 
>> https://calcite.apache.org/javadocAggregate/org/apache/calcite/linq4j/tree/Expression.html
>>> ),

Re: Question on how to integrate Apache IoTDB into Calcite

2022-01-25 Thread Julian Hyde
+1 what Stamatis said. Queryable is for compatibility with LINQ. If you want to 
build an adapter that supports push-down, you will likely use FilterableTable 
for simple adapters, TranslatableTable for more complex adapters. In neither 
case will you need to deal with Queryable.

Stamatis laid out best practices in his excellent BOSS tutorial: 
https://www.youtube.com/watch?v=meI0W12f_nw 
. (I am co-presenter in other 
parts of the tutorial, but all credit for the adapters material goes to 
Stamatis.)

Julian



> On Jan 23, 2022, at 12:46 PM, Stamatis Zampetakis  wrote:
> 
> Hi Julian,
> 
> I don't think there is an easy way to understand the Queryable interface
> unless you check how it is used. I don't see it as something that you need
> to implement no matter what but more as a convenience API that will
> facilitate the integration with the Enumerable operators (if you rely on
> them). Even in that case it could be possible to skip it.
> 
> There are many ways to push a filter down into the underlying engine and I
> think the Calcite code base has already quite a few examples on how this
> can be done (JdbcAdapter, ElasticSearch, Druid, etc). There is no one
> option that is best in all cases. Using one you may (e.g., FilterableTable)
> write more concise code and using another (e.g., custom rules + custom
> operators) may lead to more powerful optimizations or an easier extensible
> system.
> 
> Regarding materialized views there are many built in things in Calcite. The
> best place to start would probably be the website [1]
> 
> The bindable convention uses interpretation most of the time but it also
> involves code generation in some parts (e.g., BindableFilter). The
> Enumerable convention is more widely used in the wild so I would say it is
> a more stable and better option to begin with. Afterwards you may need to
> invest in building in-house operators to solve some invonveniences of
> Calcite built-in conventions.
> 
> Best,
> Stamatis
> 
> [1] https://calcite.apache.org/docs/materialized_views.html
> 
> On Fri, Jan 21, 2022 at 1:21 PM Julian Feinauer <
> j.feina...@pragmaticminds.de> wrote:
> 
>> Hi all,
>> 
>> in the last weeks I worked on Integrating the Apache IoTDB Project with
>> Calcite.
>> This covers two possible scenarios. One, to use Apache IoTDB as an Adapter
>> in Apache Calcite (like MongoDB, Cassandra, et al) and on the other hand we
>> are looking at using Calcites Query Optimizer to introduce indexing into
>> the IoTDB server (the IoTDB Server builds a RelNode / Tree and passes it to
>> the planner, after planning the resulting RelNode is then processed further
>> by the IoTDB Server, executed and returned).
>> 
>> I looked a lot on the other Adapters and how they are implemented and have
>> some questions:
>> 
>> One rather general question is about the Queryable<> Interface. I tried to
>> look up all the docs (also in Linq) but still not fully understand it. From
>> my understanding it is like a Enumerable<> but it has a “native” way to
>> already to things like ordering or filtering. So if I have a Queryable<>
>> which implements a custom Filter an automated “Push Down” can be done by
>> the framework without a Rule or code generation.
>> 
>> One important requirement for us in IoTDB is to do the query pushdown to
>> the TableScan (which is done implicitly in the current server but is first
>> explicit in the RelNode that we generate).
>> So whats the best way to “merge” a LogicalFilter and a IoTDBTableScan to a
>> “filtered” scan?
>> Is the right way to return a QueryableTable as TableScan and the Planner
>> will take care by generating the call to ‘.filter(…)’.
>> The same applies to ordering.
>> 
>> Another question that is important for us is the usage of “Materialized
>> Views” or other “Indexes”.
>> As we handle basically always timeseries in most cases the only suitable
>> index is a “Materialized View” on parts of the time series which we can use
>> to replace parts of the Relational Tree to avoid IO and computation for
>> parts that are already precomputed.
>> 
>> Is there already an existing support for that in Calcite or would we just
>> write custom Rules for our cases?
>> 
>> My last question is about the Callable TraitDef. So far I only used
>> Enumerable Convention which results in  Code generation (which has an
>> impact on the query latency). Am I right in assuming that the Binable
>> Convention is somehow similar to the Enumerable Convention with the only
>> difference that it does not do code generation but interpretation?
>> And to potentially use both (depending on whatever switch we set) we just
>> have to provide Converter Rules for both?
>> What would you use in a Server setup? Always Enumerable?
>> 
>> Thanks already for any responses or hints!
>> Julian F
>> 



Re: [DISCUSS] Refactoring tests

2022-01-24 Thread Julian Hyde
I have merged my PR, and marked 
https://issues.apache.org/jira/browse/CALCITE-4885 
<https://issues.apache.org/jira/browse/CALCITE-4885> fixed.

If my changes make it difficult to rebase, I apologize. If you need explanation 
for the changes I made, check the (very detailed) commit message, browse the 
un-squashed commits in 
https://github.com/julianhyde/calcite/tree/4885-test-fixtures.0 
<https://github.com/julianhyde/calcite/tree/4885-test-fixtures.0>, or ask for 
my help on this thread.

Julian


> On Jan 22, 2022, at 9:41 PM, Julian Hyde  wrote:
> 
> Gavin, Thanks for the kind words.
> 
> All, I have now squashed and rebased onto latest master. The squashed commit 
> is 
> https://github.com/julianhyde/calcite/commit/cb59231a72e23be260b21670012af33c47c2610e
>  
> <https://github.com/julianhyde/calcite/commit/cb59231a72e23be260b21670012af33c47c2610e>.
>  I plan to merge to master on Monday.
> 
> Jacques, You may wish to carefully review my changes to RelMetadataTest. In 
> your fix for https://issues.apache.org/jira/browse/CALCITE-4952 
> <https://issues.apache.org/jira/browse/CALCITE-4952> you made RelMetadataTest 
> a parameterized test, so that each case could be called with two metadata 
> providers. I undid that, and made it into a test with a protected fixture() 
> method, so that a sub-class can run the same tests with a different provider. 
> I believe that the test coverage is the same, albeit via different means.
> 
> Julian
> 
> 
>> On Jan 22, 2022, at 9:53 AM, Gavin Ray > <mailto:ray.gavi...@gmail.com>> wrote:
>> 
>> Thank you for doing this, after reading the overview these changes seem
>> like they will make a number of things easier related to testing.
>> Super useful too when you're trying to start building something with
>> Calcite but don't know it well enough to figure out how to put all the
>> pieces together yourself.
>> 
>> On Fri, Jan 21, 2022 at 3:57 PM Julian Hyde > <mailto:jhyde.apa...@gmail.com>> wrote:
>> 
>>> If it helps you review, I have created a ’summary’ document with a
>>> description of the changes. It will become the commit message when I
>>> squash, rebase, and merge to master.
>>> 
>>> 
>>> https://github.com/julianhyde/calcite/blob/4885-test-fixtures/summary-of-calcite-4885.md
>>>  
>>> <https://github.com/julianhyde/calcite/blob/4885-test-fixtures/summary-of-calcite-4885.md>
>>> <
>>> https://github.com/julianhyde/calcite/blob/4885-test-fixtures/summary-of-calcite-4885.md>
>>> 
>>> 
>>> Julian
>>> 
>>> 
>>>> On Jan 19, 2022, at 9:08 PM, Jacques Nadeau  wrote:
>>>> 
>>>> Unfortunately, the last minute attendance of the meetup today threw my
>>>> schedule off and I won't be able to look at this for at least a few days.
>>>> Don't feel obligated to hold up for me.
>>>> 
>>>> On Wed, Jan 19, 2022, 9:04 AM Jacques Nadeau  wrote:
>>>> 
>>>>> FYI, I'm trying to do a thorough review today (as much as possible with
>>>>> patch this size).
>>>>> 
>>>>> On Wed, Jan 19, 2022, 4:36 AM Michael Mior  wrote:
>>>>> 
>>>>>> Thanks for doing this Julian! I haven't been able to do a detailed
>>> review,
>>>>>> but from my skim of the PR, this looks like a nice improvement. I think
>>>>>> it's always been a bit difficult for new Calcite developers to write
>>> good
>>>>>> tests, especially for new modules. So anything that helps that
>>> experience
>>>>>> is very welcome.
>>>>>> 
>>>>>> --
>>>>>> Michael Mior
>>>>>> mm...@apache.org
>>>>>> 
>>>>>> 
>>>>>> Le mer. 17 nov. 2021 à 01:15, Vladimir Sitnikov <
>>>>>> sitnikov.vladi...@gmail.com>
>>>>>> a écrit :
>>>>>> 
>>>>>>> Jacques>This sounds like it will mean we will need to make
>>> calcite-core
>>>>>>> test artifacts available
>>>>>>> 
>>>>>>> Test artifacts publication is yet another anti-pattern just like "base
>>>>>> test
>>>>>>> class".
>>>>>>> This change has been discussed:
>>>>>>> https://lists.apache.org/thread/fz96p94h016p11g777otqntjxg2oxgh1
>>>>>>> 
>>>>>>> If you want to depend on a class from tests, consider moving it to
>>>>>> /testkit
>>>>>>> module:
>>>>>>> 
>>>>>>> 
>>>>>> 
>>> https://github.com/apache/calcite/tree/0899e6c157632ba1c5369a942cfe2be15fb4ed9f/testkit
>>>>>>> 
>>>>>>> Jacques>We should think about the rules around Kotlin
>>>>>>> 
>>>>>>> What happens in calcite-core/tests stays in calcite-core/tests :)
>>>>>>> 
>>>>>>> It is reasonable to assume that testkit module would have
>>> dependencies,
>>>>>>> and testkit would provide API that is usable from Java and other JVM
>>>>>>> languages.
>>>>>>> 
>>>>>>> In that regard, Kotlin dependency in testkit is not much different
>>> from
>>>>>>> Quidem or commons-lang3.
>>>>>>> Consumers might use Quidem if it fits just like they could use Kotlin
>>>>>> if it
>>>>>>> fits.
>>>>>>> 
>>>>>>> Vladimir
>>>>>>> 
>>>>>> 
>>>>> 
>>> 
>>> 
> 



Re: [DISCUSS] New community section for articles/blogs/papers

2022-01-24 Thread Julian Hyde
I love the idea of community content. Some of these projects will be the next 
generation of Calcite, and the authors our next PMC members. 

I share people’s concerns about the list items going obsolete, or the getting 
too long. How about moving older items to the 
https://calcite.apache.org/community/#more-talks 
 section, keeping only the 5 
or 6 most recent, and removing obsolete items entirely?

Also, any thoughts on using News for short articles about interesting projects? 
https://calcite.apache.org/news/  News 
articles are easy to write (just one .md file) and the web site generator 
naturally puts recent articles at the top of the page.

Julian

> On Jan 24, 2022, at 7:34 AM, Alessandro Solimando 
>  wrote:
> 
> +1 on the proposal, I share Ruben's POV also, but I think we can revise the
> content from time to time and deprecate/remove links when they get stale.
> 
> Best regards,
> Alessandro
> 
> On Mon, 24 Jan 2022 at 16:29, Zhe Hu  wrote:
> 
>> It’s definitely a terrific proposal, especially for those new comers and
>> people who want to share something useful with the community.
>> 
>> 
>> I totally agree with what Jing Zhan and Ruben Q L mentioned, those useful
>> stuff has higher requirements for authors and reviewers. What’s more, for
>> each release, the RM might need to take a little while to briefly check if
>> those materials are still available.
>> 
>> 
>> Still, +1 for this.
>> 
>> 
>> Best,
>> ZheHu
>> 
>> 
>> On 01/24/2022 17:47,Ruben Q L wrote:
>> Thanks for the proposal Stamatis.
>> 
>> I agree that such a section could be very helpful, especially for new
>> comers. As you suggest, perhaps a new bullet "external resources" (or
>> something like that) under the community page?
>> 
>> I also share the concerns that Jing Zhang expressed. Moreover, there's a
>> risk that over the time some of the links that we share may become outdated
>> or obsolete.
>> 
>> Best,
>> Ruben
>> 
>> 
>> 
>> 
>> On Mon, Jan 24, 2022 at 3:20 AM Jing Zhang  wrote:
>> 
>> Hi Stamatis,
>> 
>> Thanks for driving this discussion.
>> 
>> Big +1 on the idea. They are undoubtedly very helpful for newcomers to
>> Calcite. I have heard a lot of requests in the dev mailing lists to get
>> more started documentation.
>> 
>> At the same time, we need take careful to decide what contents to put in
>> this section. Because the content introduced on the community page means
>> that the content has been recognized by the community and is more
>> authoritative.
>> 
>> Best,
>> Jing Zhang
>> 
>> 
>> 
>> Stamatis Zampetakis  于2022年1月24日周一 05:19写道:
>> 
>> Hi all,
>> 
>> Taking the recent email of Gavin [1] (sharing his article on GraphQL to
>> SQL
>> conversion) as a motive, I wanted to see what people think of creating a
>> new section in the website for putting this and other useful stuff such
>> as
>> Michael's notebooks, Querifylabs blogs [2], etc.
>> 
>> We could have one or more new bullets under the community page [3]. WDYT?
>> 
>> Best,
>> Stamatis
>> 
>> [1] https://lists.apache.org/thread/3kgwsg18bglszfm51ngy1m58fnh9dd2c
>> [2] https://www.querifylabs.com/blog
>> [3] https://calcite.apache.org/community/
>> 
>> 
>> 



Re: [DISCUSS] Refactoring tests

2022-01-22 Thread Julian Hyde
Gavin, Thanks for the kind words.

All, I have now squashed and rebased onto latest master. The squashed commit is 
https://github.com/julianhyde/calcite/commit/cb59231a72e23be260b21670012af33c47c2610e.
 I plan to merge to master on Monday.

Jacques, You may wish to carefully review my changes to RelMetadataTest. In 
your fix for https://issues.apache.org/jira/browse/CALCITE-4952 
<https://issues.apache.org/jira/browse/CALCITE-4952> you made RelMetadataTest a 
parameterized test, so that each case could be called with two metadata 
providers. I undid that, and made it into a test with a protected fixture() 
method, so that a sub-class can run the same tests with a different provider. I 
believe that the test coverage is the same, albeit via different means.

Julian


> On Jan 22, 2022, at 9:53 AM, Gavin Ray  wrote:
> 
> Thank you for doing this, after reading the overview these changes seem
> like they will make a number of things easier related to testing.
> Super useful too when you're trying to start building something with
> Calcite but don't know it well enough to figure out how to put all the
> pieces together yourself.
> 
> On Fri, Jan 21, 2022 at 3:57 PM Julian Hyde  wrote:
> 
>> If it helps you review, I have created a ’summary’ document with a
>> description of the changes. It will become the commit message when I
>> squash, rebase, and merge to master.
>> 
>> 
>> https://github.com/julianhyde/calcite/blob/4885-test-fixtures/summary-of-calcite-4885.md
>> <
>> https://github.com/julianhyde/calcite/blob/4885-test-fixtures/summary-of-calcite-4885.md>
>> 
>> 
>> Julian
>> 
>> 
>>> On Jan 19, 2022, at 9:08 PM, Jacques Nadeau  wrote:
>>> 
>>> Unfortunately, the last minute attendance of the meetup today threw my
>>> schedule off and I won't be able to look at this for at least a few days.
>>> Don't feel obligated to hold up for me.
>>> 
>>> On Wed, Jan 19, 2022, 9:04 AM Jacques Nadeau  wrote:
>>> 
>>>> FYI, I'm trying to do a thorough review today (as much as possible with
>>>> patch this size).
>>>> 
>>>> On Wed, Jan 19, 2022, 4:36 AM Michael Mior  wrote:
>>>> 
>>>>> Thanks for doing this Julian! I haven't been able to do a detailed
>> review,
>>>>> but from my skim of the PR, this looks like a nice improvement. I think
>>>>> it's always been a bit difficult for new Calcite developers to write
>> good
>>>>> tests, especially for new modules. So anything that helps that
>> experience
>>>>> is very welcome.
>>>>> 
>>>>> --
>>>>> Michael Mior
>>>>> mm...@apache.org
>>>>> 
>>>>> 
>>>>> Le mer. 17 nov. 2021 à 01:15, Vladimir Sitnikov <
>>>>> sitnikov.vladi...@gmail.com>
>>>>> a écrit :
>>>>> 
>>>>>> Jacques>This sounds like it will mean we will need to make
>> calcite-core
>>>>>> test artifacts available
>>>>>> 
>>>>>> Test artifacts publication is yet another anti-pattern just like "base
>>>>> test
>>>>>> class".
>>>>>> This change has been discussed:
>>>>>> https://lists.apache.org/thread/fz96p94h016p11g777otqntjxg2oxgh1
>>>>>> 
>>>>>> If you want to depend on a class from tests, consider moving it to
>>>>> /testkit
>>>>>> module:
>>>>>> 
>>>>>> 
>>>>> 
>> https://github.com/apache/calcite/tree/0899e6c157632ba1c5369a942cfe2be15fb4ed9f/testkit
>>>>>> 
>>>>>> Jacques>We should think about the rules around Kotlin
>>>>>> 
>>>>>> What happens in calcite-core/tests stays in calcite-core/tests :)
>>>>>> 
>>>>>> It is reasonable to assume that testkit module would have
>> dependencies,
>>>>>> and testkit would provide API that is usable from Java and other JVM
>>>>>> languages.
>>>>>> 
>>>>>> In that regard, Kotlin dependency in testkit is not much different
>> from
>>>>>> Quidem or commons-lang3.
>>>>>> Consumers might use Quidem if it fits just like they could use Kotlin
>>>>> if it
>>>>>> fits.
>>>>>> 
>>>>>> Vladimir
>>>>>> 
>>>>> 
>>>> 
>> 
>> 



Re: [DISCUSS] Refactoring tests

2022-01-21 Thread Julian Hyde
If it helps you review, I have created a ’summary’ document with a description 
of the changes. It will become the commit message when I squash, rebase, and 
merge to master.

https://github.com/julianhyde/calcite/blob/4885-test-fixtures/summary-of-calcite-4885.md
 

 

Julian


> On Jan 19, 2022, at 9:08 PM, Jacques Nadeau  wrote:
> 
> Unfortunately, the last minute attendance of the meetup today threw my
> schedule off and I won't be able to look at this for at least a few days.
> Don't feel obligated to hold up for me.
> 
> On Wed, Jan 19, 2022, 9:04 AM Jacques Nadeau  wrote:
> 
>> FYI, I'm trying to do a thorough review today (as much as possible with
>> patch this size).
>> 
>> On Wed, Jan 19, 2022, 4:36 AM Michael Mior  wrote:
>> 
>>> Thanks for doing this Julian! I haven't been able to do a detailed review,
>>> but from my skim of the PR, this looks like a nice improvement. I think
>>> it's always been a bit difficult for new Calcite developers to write good
>>> tests, especially for new modules. So anything that helps that experience
>>> is very welcome.
>>> 
>>> --
>>> Michael Mior
>>> mm...@apache.org
>>> 
>>> 
>>> Le mer. 17 nov. 2021 à 01:15, Vladimir Sitnikov <
>>> sitnikov.vladi...@gmail.com>
>>> a écrit :
>>> 
 Jacques>This sounds like it will mean we will need to make calcite-core
 test artifacts available
 
 Test artifacts publication is yet another anti-pattern just like "base
>>> test
 class".
 This change has been discussed:
 https://lists.apache.org/thread/fz96p94h016p11g777otqntjxg2oxgh1
 
 If you want to depend on a class from tests, consider moving it to
>>> /testkit
 module:
 
 
>>> https://github.com/apache/calcite/tree/0899e6c157632ba1c5369a942cfe2be15fb4ed9f/testkit
 
 Jacques>We should think about the rules around Kotlin
 
 What happens in calcite-core/tests stays in calcite-core/tests :)
 
 It is reasonable to assume that testkit module would have dependencies,
 and testkit would provide API that is usable from Java and other JVM
 languages.
 
 In that regard, Kotlin dependency in testkit is not much different from
 Quidem or commons-lang3.
 Consumers might use Quidem if it fits just like they could use Kotlin
>>> if it
 fits.
 
 Vladimir
 
>>> 
>> 



Re: Parsing multiple SQL statements using Calcite

2022-01-21 Thread Julian Hyde
You are right. We support parsing multiple statements [1] but we don’t support 
multi-statement validation, multi-statement planning or multi-statement 
execution.

Julian


[1] https://issues.apache.org/jira/browse/CALCITE-2453 
.


> On Jan 20, 2022, at 8:23 PM, Laksh Singla  wrote:
> 
> Hi,
> 
> I was wondering if there's a way to parse multiple SQL statements in
> Calcite. In the reference grammar, there is a "statementList" production
> rule which allows for parsing multiple statements. However, in the
> Calcite's code, I didn't find any usage of the method "SqlNodeList
> parseStmtList" which seems to utilize the production rule. I do see that
> the method is public, so is there another way to utilize it (apart from
> directly calling `planner.parse(sqlString)`.
> 
> Thanks and Regards
> Laksh



Re: How to write code if the intent is to land it in Calcite eventually?

2022-01-20 Thread Julian Hyde
There is one other front-end language (besides SQL) in Calcite: the ‘piglet' 
module implements Pig Latin (which is the language spoken by Apache Pig). I 
think the other languages, such as GraphQL, should each be their own module.

Calcite has many adapters (i.e. back-ends), and one-module-per-adapter has been 
working well.  

I’m not going to express an opinion about Kotlin vs Java 17 vs Java 8.


> On Jan 20, 2022, at 12:25 PM, Gavin Ray  wrote:
> 
> Stamatis made a remark during the meetup about integrating the GraphQL
> library as a module in core.
> Something like GqlToRelConverter, similar to SqlToRelConverter.
> 
> I hadn't thought of this, it's a brilliant idea.
> I would like to shift my development into practices that could eventually
> be accepted upstream.
> 
> Is it correct to assume that this means I'll need to swap Kotlin for Java?
> Are there any organizational best-practices, conventions or other things to
> keep in mind?
> 
> If I must use Java, can I use JDK17 to take advantage of things like
> records, sealed types and "switch" pattern matching?
> 
> Thank you =)



Re: Help with SqlAdvisor and autocomplete please!

2022-01-20 Thread Julian Hyde
There’s an existing test for this, and it passes. And I added some more in 
https://github.com/julianhyde/calcite/commit/831318278e0ecccf814b82eb8a944544874275a6
 
.
 How is your case (or environment) different from these?

> On Jan 20, 2022, at 2:40 AM, mark  wrote:
> 
> Hello,
> Please can I ask for some help or guidance with SqlAdvisor and completion?
> My problem is that I can get completion working for table names, but not
> column names.
> 
>  "select a.mgr from ^stuff a"
> 
> gives me good results:  CATALOG.SALES.EMP, CATALOG.SALES, etc, but
> 
>  "select a.^ from sales.emp a"
> 
> gives me only "*". See
> https://github.com/mnuttall/sql-testing/blob/main/src/main/java/test/AdvisorBuilder.java
> for how I'm constructing my SqlAdvisor and
> https://github.com/mnuttall/sql-testing/blob/main/src/test/java/test/TestCompletion.java
> for some simple test code.
> 
> Can anyone tell me - should I be able to get column name suggestions from
> org.apache.calcite.sql.advise.SqlAdvisor.getCompletionHints() for a string
> like "select a.^ from sales.emp a" ? Can anyone see what's wrong with my
> SqlAdvisor construction? Very many thanks in advance - I'm a bit stuck : /
> 
> Regards,
> 
> Mark



Re: New rule for Converting UNION ALL with same inputs but different filters to single input with OR FILTER

2022-01-20 Thread Julian Hyde
Process logic sounds great. Please log a Jira case.

There’s a lot here, so it’s probably wise to split into a simple PR that just 
does the simple case (UNION ALL, no Project) and extend later.

> On Jan 20, 2022, at 7:07 AM, Yanjing Wang  wrote:
> 
> Thanks Julian and Justin.
> 
> What do you think the rule should be called? UnionFilterTransposeRule,
>> perhaps?
>> 
> SetOpFilterMergeRule?
> 
> Maybe that problem does not occur when applied to UNION than when applied
>> to UNION ALL.
> 
> Yes, This is very important.
> 
> There would seem to be analogous rules for INTERSECT (combine the
>> conditions using AND) and EXCEPT (combine the conditions using AND NOT).
>> 
> Excellent extensions, all the three operators process logic are:
> 
> For UNION:
> New Filter = left Filter *OR* right Filter.
> 
> For INTERSECT:
> New Filter = left Filter *AND* right Filter.
> 
> For EXCEPT:
> If left Filter, right Filter have no overlap, transform to left child tree,
> Otherwise
> New Filter = left Filter *AND NOT *right Filter.
> 
> For UNION ALL:
> Add prerequisites:
> left Filter, right Filter must have no overlap.
> 
> For INTERSECT ALL:
> Add prerequisites:
> If left Filter, right Filter have no overlap, transform to empty values.
> Otherwise
> the rule can't be applied.
> 
> For EXCEPT ALL:
> same for  EXCEPT.
> 
> 
> work for N-way Union, not just 2-way Union.
>> 
> Yes, I will add tests for this.
> 
> And I think you should make it work whether or not a Project is present.
> 
> Ok, It seems I need construct several different operand match trees for
> plan.
> 
> Hi Julian, what do you think of the above process logic?
> 
> Julian Hyde mailto:jhyde.apa...@gmail.com>> 
> 于2022年1月20日周四 10:18写道:
> 
>> Justin,
>> 
>> For planning table or index scans, I would recommend using a single
>> TableScan with a Filter that uses a Sarg, rather than using multiple
>> TableScans connected by a Union. So I think this rule will be useful.
>> 
>> But I do agree that this proposed rule is not a “no brainer”. It may not
>> do what people want/expect in all cases, and therefore it probably should
>> not be enabled it by default.
>> 
>> Julian
>> 
>> 
>> 
>> 
>> 
>>> On Jan 19, 2022, at 3:38 PM, Justin Swanhart 
>> wrote:
>>> 
>>> Hi,
>>> 
>>> Note that this will negate the optimization that one usually is looking
>> for
>>> when writing such queries:
>>> 
>>> Select * from TAB where a = 1
>>> UNION ALL
>>> Select * from TAB where b = 1
>>> 
>>> In a database with indexes (most databases) this will allow indexes to be
>>> used on both the a column and the b column.
>>> Databases with bitmap indexes or without indexes would benefit from the
>>> rule.
>>> 
>>> On Wed, Jan 19, 2022 at 4:32 PM Julian Hyde > <mailto:jhyde.apa...@gmail.com <mailto:jhyde.apa...@gmail.com>>> wrote:
>>> 
>>>> Can you log a Jira case for this?
>>>> 
>>>> I think you should make your rule work for N-way Union, not just 2-way
>>>> Union. And I think you should make it work whether or not a Project is
>>>> present.
>>>> 
>>>>> On Jan 19, 2022, at 1:25 PM, Julian Hyde >>>> <mailto:jhyde.apa...@gmail.com>>
>> wrote:
>>>>> 
>>>>> It sounds useful.
>>>>> 
>>>>> What do you think the rule should be called? UnionFilterTransposeRule,
>>>> perhaps?
>>>>> 
>>>>> A challenge when writing the rule will be to ensure that all of the
>>>> inputs to the Union are the same. The Volcano framework is not very
>> good at
>>>> that.
>>>>> 
>>>>> You should be careful of the case that the conditions overlap. For
>>>> example, the rewrite
>>>>> 
>>>>> SELECT * FROM Emp WHERE deptno < 30
>>>>> UNION ALL
>>>>> SELECT * FROM Emp WHERE deptno IN (25, 35, 45)
>>>>> 
>>>>> to
>>>>> 
>>>>> SELECT * FROM Emp WHERE deptno < 30 OR deptno IN (25, 35, 45)
>>>>> 
>>>>> Is not valid, because rows with deptno = 25 will appear twice in the
>>>> first query, once in the second. Maybe that problem does not occur when
>>>> applied to UNION than when applied to UNION ALL.
>>>>> 
>>>>> There would seem to be analogous rules for INTERSECT (c

Re: [ANNOUNCE] New Calcite PMC chair: Ruben Q L

2022-01-19 Thread Julian Hyde
Congratulations, Ruben, and good luck!

Haisheng, Thank you for serving as Chair.

Julian

On Wed, Jan 19, 2022 at 6:29 PM Haisheng Yuan  wrote:
>
> Calcite community members,
>
> I am pleased to announce that we have a new PMC chair and VP as per our
> tradition of rotating the chair once a year. I have resigned, and Ruben Q L
> was duly elected by the PMC and approved unanimously by the Board.
>
> Please join me in congratulating Ruben!
>
> Best,
> Haisheng Yuan


Re: New rule for Converting UNION ALL with same inputs but different filters to single input with OR FILTER

2022-01-19 Thread Julian Hyde
Justin,

For planning table or index scans, I would recommend using a single TableScan 
with a Filter that uses a Sarg, rather than using multiple TableScans connected 
by a Union. So I think this rule will be useful.

But I do agree that this proposed rule is not a “no brainer”. It may not do 
what people want/expect in all cases, and therefore it probably should not be 
enabled it by default.

Julian





> On Jan 19, 2022, at 3:38 PM, Justin Swanhart  wrote:
> 
> Hi,
> 
> Note that this will negate the optimization that one usually is looking for
> when writing such queries:
> 
> Select * from TAB where a = 1
> UNION ALL
> Select * from TAB where b = 1
> 
> In a database with indexes (most databases) this will allow indexes to be
> used on both the a column and the b column.
> Databases with bitmap indexes or without indexes would benefit from the
> rule.
> 
> On Wed, Jan 19, 2022 at 4:32 PM Julian Hyde  <mailto:jhyde.apa...@gmail.com>> wrote:
> 
>> Can you log a Jira case for this?
>> 
>> I think you should make your rule work for N-way Union, not just 2-way
>> Union. And I think you should make it work whether or not a Project is
>> present.
>> 
>>> On Jan 19, 2022, at 1:25 PM, Julian Hyde  wrote:
>>> 
>>> It sounds useful.
>>> 
>>> What do you think the rule should be called? UnionFilterTransposeRule,
>> perhaps?
>>> 
>>> A challenge when writing the rule will be to ensure that all of the
>> inputs to the Union are the same. The Volcano framework is not very good at
>> that.
>>> 
>>> You should be careful of the case that the conditions overlap. For
>> example, the rewrite
>>> 
>>>  SELECT * FROM Emp WHERE deptno < 30
>>>  UNION ALL
>>>  SELECT * FROM Emp WHERE deptno IN (25, 35, 45)
>>> 
>>> to
>>> 
>>>  SELECT * FROM Emp WHERE deptno < 30 OR deptno IN (25, 35, 45)
>>> 
>>> Is not valid, because rows with deptno = 25 will appear twice in the
>> first query, once in the second. Maybe that problem does not occur when
>> applied to UNION than when applied to UNION ALL.
>>> 
>>> There would seem to be analogous rules for INTERSECT (combine the
>> conditions using AND) and EXCEPT (combine the conditions using AND NOT).
>> Perhaps one rule could cover all set operations (see
>> FilterSetOpTransposeRule).
>>> 
>>> Julian
>>> 
>>> 
>>> 
>>>> On Jan 19, 2022, at 2:38 AM, Yanjing Wang > <mailto:zhuangzixiao...@gmail.com <mailto:zhuangzixiao...@gmail.com>>> wrote:
>>>> 
>>>> A simple example is converting SELECT a, b FROM t WHERE c = 1 UNION ALL
>> SELECT a, b FROM t WHERE c = 2 to SELECT a, b FROM t WHERE c in (1, 2)
>>>> 
>>>> Yanjing Wang mailto:zhuangzixiao...@gmail.com> 
>>>> > zhuangzixiao...@gmail.com <mailto:zhuangzixiao...@gmail.com>>> 于2022年1月19日周三 
>> 18:35写道:
>>>> Hi, community
>>>> 
>>>> Here I recommend a new rule for converting UNION ALL sub plan to a
>> single input with an OR filter, the following is its conversion diagram.
>>>> 
>>>> 
>>>> The conversion prerequisites are
>>>> 1. left filter range has no intersection with right filter range.
>>>> 2. Project and Input Sub Tree must be identical.
>>>> 
>>>> The rule will be used when Input Sub Tree is a computing-intensive or
>> large IO operation.
>>>> 
>>>> I don't know whether the community supports it or not, any suggestions
>> will be appreciated.



Re: New rule for Converting UNION ALL with same inputs but different filters to single input with OR FILTER

2022-01-19 Thread Julian Hyde
Can you log a Jira case for this?

I think you should make your rule work for N-way Union, not just 2-way Union. 
And I think you should make it work whether or not a Project is present.

> On Jan 19, 2022, at 1:25 PM, Julian Hyde  wrote:
> 
> It sounds useful.
> 
> What do you think the rule should be called? UnionFilterTransposeRule, 
> perhaps?
> 
> A challenge when writing the rule will be to ensure that all of the inputs to 
> the Union are the same. The Volcano framework is not very good at that.
> 
> You should be careful of the case that the conditions overlap. For example, 
> the rewrite
> 
>   SELECT * FROM Emp WHERE deptno < 30
>   UNION ALL
>   SELECT * FROM Emp WHERE deptno IN (25, 35, 45)
> 
> to
> 
>   SELECT * FROM Emp WHERE deptno < 30 OR deptno IN (25, 35, 45) 
> 
> Is not valid, because rows with deptno = 25 will appear twice in the first 
> query, once in the second. Maybe that problem does not occur when applied to 
> UNION than when applied to UNION ALL.
> 
> There would seem to be analogous rules for INTERSECT (combine the conditions 
> using AND) and EXCEPT (combine the conditions using AND NOT). Perhaps one 
> rule could cover all set operations (see FilterSetOpTransposeRule).
> 
> Julian
> 
> 
> 
>> On Jan 19, 2022, at 2:38 AM, Yanjing Wang > <mailto:zhuangzixiao...@gmail.com>> wrote:
>> 
>> A simple example is converting SELECT a, b FROM t WHERE c = 1 UNION ALL 
>> SELECT a, b FROM t WHERE c = 2 to SELECT a, b FROM t WHERE c in (1, 2)
>> 
>> Yanjing Wang mailto:zhuangzixiao...@gmail.com>> 
>> 于2022年1月19日周三 18:35写道:
>> Hi, community
>> 
>> Here I recommend a new rule for converting UNION ALL sub plan to a single 
>> input with an OR filter, the following is its conversion diagram.
>> 
>>  
>> The conversion prerequisites are 
>> 1. left filter range has no intersection with right filter range.
>> 2. Project and Input Sub Tree must be identical.
>> 
>> The rule will be used when Input Sub Tree is a computing-intensive or large 
>> IO operation.
>> 
>> I don't know whether the community supports it or not, any suggestions will 
>> be appreciated.
>> 
>> 
> 



Re: New rule for Converting UNION ALL with same inputs but different filters to single input with OR FILTER

2022-01-19 Thread Julian Hyde
It sounds useful.

What do you think the rule should be called? UnionFilterTransposeRule, perhaps?

A challenge when writing the rule will be to ensure that all of the inputs to 
the Union are the same. The Volcano framework is not very good at that.

You should be careful of the case that the conditions overlap. For example, the 
rewrite

  SELECT * FROM Emp WHERE deptno < 30
  UNION ALL
  SELECT * FROM Emp WHERE deptno IN (25, 35, 45)

to

  SELECT * FROM Emp WHERE deptno < 30 OR deptno IN (25, 35, 45) 

Is not valid, because rows with deptno = 25 will appear twice in the first 
query, once in the second. Maybe that problem does not occur when applied to 
UNION than when applied to UNION ALL.

There would seem to be analogous rules for INTERSECT (combine the conditions 
using AND) and EXCEPT (combine the conditions using AND NOT). Perhaps one rule 
could cover all set operations (see FilterSetOpTransposeRule).

Julian



> On Jan 19, 2022, at 2:38 AM, Yanjing Wang  wrote:
> 
> A simple example is converting SELECT a, b FROM t WHERE c = 1 UNION ALL 
> SELECT a, b FROM t WHERE c = 2 to SELECT a, b FROM t WHERE c in (1, 2)
> 
> Yanjing Wang mailto:zhuangzixiao...@gmail.com>> 
> 于2022年1月19日周三 18:35写道:
> Hi, community
> 
> Here I recommend a new rule for converting UNION ALL sub plan to a single 
> input with an OR filter, the following is its conversion diagram.
> 
>  
> The conversion prerequisites are 
> 1. left filter range has no intersection with right filter range.
> 2. Project and Input Sub Tree must be identical.
> 
> The rule will be used when Input Sub Tree is a computing-intensive or large 
> IO operation.
> 
> I don't know whether the community supports it or not, any suggestions will 
> be appreciated.
> 
> 



Re: [DISCUSS] Apache Calcite Online Meetup January 2022

2022-01-19 Thread Julian Hyde
For what it’s worth, I had trouble logging into Meetup.com (from both an iPhone 
and a macOS laptop) and so missed the meetup.

> On Jan 19, 2022, at 9:08 AM, Jacques Nadeau  wrote:
> 
> Shoot, totally missed that this was using meetup.com. Figured the date/time
> would be discussed here. In the future would be helpful to also mention
> time/date on the mailing list.
> 
> On Wed, Jan 19, 2022, 8:43 AM Stamatis Zampetakis  wrote:
> 
>> Forgo the link to the event:
>> https://www.meetup.com/Apache-Calcite/events/282836907/
>> 
>> On Wed, Jan 19, 2022 at 5:42 PM Stamatis Zampetakis 
>> wrote:
>> 
>>> Small reminder: the meetup starts in ~15 minutes.
>>> 
>>> Best,
>>> Stamatis
>>> 
>>> On Wed, Jan 12, 2022 at 11:30 PM Stamatis Zampetakis 
>>> wrote:
>>> 
 Hi,
 
 I read the previous emails about the GraphQL project and it looks
 very interesting.
 @Gavin: Please send us a title, abstract, and duration of the talk.
 Don't hesitate to take more time if you need to. I think we still have a
 bit of margin.
 
 With this we can freeze the agenda for this meetup. If there is more
 interest we can immediately start discussing the next one.
 
 Although I appear as an organizer in the Apache Calcite Meetup group,
 this does not mean I am the only one deciding about the agenda or when a
 meetup should happen.
 If other committers/PMC members want to take a more active role in the
 organization I would be more than happy to add them as co-organizers in
>> the
 group.
 Apart from that feedback & suggestions from anyone are always welcome.
 
 Best,
 Stamatis
 
 On Wed, Jan 12, 2022 at 8:36 PM Eugen Stan 
 wrote:
 
> I would love to see GraphQL over Calcite for sure.
> 
> On 12.01.2022 17:37, Gavin Ray wrote:
>> If anyone is interested, I can spend 5-10 minutes briefly presenting
> the
>> work I've been doing on automatic GraphQL API generation from Calcite
> data
>> sources.
>> GraphQL is still somewhat niche though, so I'll hold off on asking to
> speak
>> about it unless there are folks attending who are interested/using
>> it.
> 
> 
> --
> Eugen Stan
> 
> +40770 941 271  / https://www.netdava.com
 
 
>> 



Re: Projection for SELECT COUNT(*)

2022-01-18 Thread Julian Hyde
As Stamatis said, we don’t have a consistent policy on zero-length records. But 
in that thread I logged https://issues.apache.org/jira/browse/CALCITE-4597 
 to clarify the situation. 
It would be great if someone worked on it.

I see Viliam’s point that it makes physical optimization easier if there is an 
explicit Project telling you which columns (if any) need to be read from the 
TableScan. AggregateExtractProjectRule [1] may make it easier to accomplish 
this. But in the usual case, when this rule is not enabled, I don’t think we 
should create a Project.

Julian

[1] 
https://github.com/apache/calcite/blob/d70583c4a8013f878457f82df6dffddd71875900/core/src/main/java/org/apache/calcite/rel/rules/AggregateExtractProjectRule.java#L53
 

 

> On Jan 15, 2022, at 2:07 PM, Stamatis Zampetakis  wrote:
> 
> Hi Viliam,
> 
> I don't see a problem with the current plan. It seems correct and more
> intuitive than the one with the DUMMY projection.
> 
> LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
>LogicalTableScan(table=foo)
> 
> The code you cited in SqlToRelConverter seems an attempt to handle empty
> records/tuples that we are not handling very well in general [1].
> Doesn't seem related to performance as the use-case you mentioned.
> 
> Best,
> Stamatis
> 
> [1] https://lists.apache.org/thread/dtsz159x4nk3l9b3topgykqpsml024tv
> 
> On Fri, Jan 14, 2022 at 12:57 PM Viliam Durina  wrote:
> 
>> I noticed this two pieces of code:
>> 
>> 1. in SqlToRelConverter:
>> 
>>  if (preExprs.size() == 0) {
>>// Special case for COUNT(*), where we can end up with no inputs
>>// at all.  The rest of the system doesn't like 0-tuples, so we
>>// select a dummy constant here.
>>final RexNode zero = rexBuilder.makeExactLiteral(BigDecimal.ZERO);
>>preExprs = ImmutableList.of(Pair.of(zero, null));
>>  }
>> 
>> 2. in RelBuilder:
>> 
>>   // Some parts of the system can't handle rows with zero fields, so
>>  // pretend that one field is used.
>>  if (fieldsUsed.isEmpty()) {
>>r = ((Project) r).getInput();
>>  }
>> 
>> They run in this order, and the 2nd overrides the former. The end result is
>> that for query `SELECT COUNT(*) FROM foo`, the result of sql-to-rel
>> conversion is:
>> 
>> LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
>>LogicalTableScan(table=foo)
>> 
>> instead of:
>> 
>> LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
>>  LogicalProject(DUMMY=[0])
>>LogicalTableScan(table=foo)
>> 
>> In our implementation we push the projection to table scan. Without the
>> project, we fetch full rows, even though the aggregation uses no row.
>> 
>> The code was introduced in
>> https://issues.apache.org/jira/browse/CALCITE-3763, but maybe it was
>> broken
>> later.
>> 
>> Do you think this is an issue?
>> 
>> Viliam
>> 
>> --
>> This message contains confidential information and is intended only for
>> the
>> individuals named. If you are not the named addressee you should not
>> disseminate, distribute or copy this e-mail. Please notify the sender
>> immediately by e-mail if you have received this e-mail by mistake and
>> delete this e-mail from your system. E-mail transmission cannot be
>> guaranteed to be secure or error-free as information could be intercepted,
>> corrupted, lost, destroyed, arrive late or incomplete, or contain viruses.
>> The sender therefore does not accept liability for any errors or omissions
>> in the contents of this message, which arise as a result of e-mail
>> transmission. If verification is required, please request a hard-copy
>> version. -Hazelcast
>> 



Re: [DISCUSS] Refactoring tests

2022-01-18 Thread Julian Hyde
I haven’t heard from anyone.

I plan to merge https://issues.apache.org/jira/browse/CALCITE-4986 
<https://issues.apache.org/jira/browse/CALCITE-4986> "Make HepProgram 
thread-safe” in the next hour or two, then I will squash/rebase/merge 
https://issues.apache.org/jira/browse/CALCITE-4885 
<https://issues.apache.org/jira/browse/CALCITE-4885> "Fluent test fixtures” 
tomorrow.

They’re both significant changes, review welcome.

Julian



> On Jan 12, 2022, at 10:11 AM, Julian Hyde  wrote:
> 
> After a month or so of work I have a PR for review. See 
> https://github.com/apache/calcite/pull/2685/ 
> <https://github.com/apache/calcite/pull/2685/>.
> 
> The PR is huge (80 commits, 127 files changed, 20k lines added/removed) so 
> may be hard to digest. I recommend that you read the commit log to understand 
> the various refactorings. Note that SqlOperatorBaseTest both moved (from main 
> to testkit) and changed name (to SqlOperatorTest) so diffing it might require 
> some ingenuity.
> 
> At this point I am especially interested in high-level comments (Does the 
> test/fixture/tester/factory split work? Are there any other candidates for 
> this refactoring?). If you have a lot of low-level comments (e.g. spelling 
> mistakes) just send me a PR.
> 
> It’s probably several days before I will squash, rebase and consider merging 
> to main.
> 
> Julian
>  
> 
>> On Nov 22, 2021, at 4:12 AM, Stamatis Zampetakis > <mailto:zabe...@gmail.com>> wrote:
>> 
>> Hello,
>> 
>> I generally agree with the problems and goals set by CALCITE-4885 so
>> definitely worth pushing this forward.
>> 
>> I also like the ideas for being compatible with JUnit5 extensions etc., but
>> I guess it could be something to address later one and does not need to be
>> part of CALCITE-4885.
>> If and when somebody comes with a concrete proposal we can evaluate this.
>> 
>> Same reasoning goes for AssertJ; it may be beneficial for the project but
>> let's evaluate it separately.
>> 
>> If I understand well, both AssertJ and extensions via JUnit5 may come in
>> conflict with the new APIs exposed by CALCITE-4885 so it would be nice if
>> we can advance all these in the same release.
>> 
>> Last I wanted to mention that although people consuming Calcite releases
>> may not care much about big changes in the testing code, those who maintain
>> forks of the main repo will have a few more challenges to address when they
>> cherry-pick changes.
>> This is not an argument to push back the feature but just an additional
>> element for completing the picture.
>> 
>> Best,
>> Stamatis
>> 
>> On Wed, Nov 17, 2021 at 7:15 AM Vladimir Sitnikov <
>> sitnikov.vladi...@gmail.com <mailto:sitnikov.vladi...@gmail.com>> wrote:
>> 
>>> Jacques>This sounds like it will mean we will need to make calcite-core
>>> test artifacts available
>>> 
>>> Test artifacts publication is yet another anti-pattern just like "base test
>>> class".
>>> This change has been discussed:
>>> https://lists.apache.org/thread/fz96p94h016p11g777otqntjxg2oxgh1 
>>> <https://lists.apache.org/thread/fz96p94h016p11g777otqntjxg2oxgh1>
>>> 
>>> If you want to depend on a class from tests, consider moving it to /testkit
>>> module:
>>> 
>>> https://github.com/apache/calcite/tree/0899e6c157632ba1c5369a942cfe2be15fb4ed9f/testkit
>>> 
>>> Jacques>We should think about the rules around Kotlin
>>> 
>>> What happens in calcite-core/tests stays in calcite-core/tests :)
>>> 
>>> It is reasonable to assume that testkit module would have dependencies,
>>> and testkit would provide API that is usable from Java and other JVM
>>> languages.
>>> 
>>> In that regard, Kotlin dependency in testkit is not much different from
>>> Quidem or commons-lang3.
>>> Consumers might use Quidem if it fits just like they could use Kotlin if it
>>> fits.
>>> 
>>> Vladimir
>>> 
> 



Re: [DISCUSS] RepeatUnion improvements

2022-01-17 Thread Julian Hyde
Thanks for the heads up. There was a good reason that we marked RepeatUnion 
experimental: so that we can make these improvements without anyone 
complaining. I welcome these changes evolving RepeatUnion into a more useful 
feature.

By the way, in Morel I am having some deep discussions about the best way to 
surface ‘recursive’ or ‘iterative’ or ‘fixed point’ queries should be surfaced 
in a relational/functional language [1]. Morel’s goal is to execute programs 
both locally and via Calcite relational algebra, so Morel will at some point 
use RepeatUnion.

Julian

[1] https://github.com/julianhyde/morel/issues/81 
   

> On Jan 17, 2022, at 8:49 AM, Ruben Q L  wrote:
> 
> Hello everyone,
> 
> I am not sure if anyone else (apart from my downstream project) is using
> RepeatUnion and associated operators to build recursive unions
> (experimental feature implemented via [1]), but just in case: I am
> considering a patch to fix some known issues ([2] & [3]) on the current
> implementation.
> This patch might potentially break the current contract of this operator,
> so if you think that you may be impacted by this change, please do not
> hesitate to take a look at the PR [4] and contribute to the discussion in
> there.
> 
> Thanks and best regards,
> Ruben
> 
> [1] https://issues.apache.org/jira/browse/CALCITE-2812
> [2] https://issues.apache.org/jira/browse/CALCITE-3673
> [3] https://issues.apache.org/jira/browse/CALCITE-4054
> [4] https://github.com/apache/calcite/pull/2690



Re: [DISCUSS] Writing good summaries for Jira cases

2022-01-14 Thread Julian Hyde
ets
>>>> 
>>>> Viliam
>>>> 
>>>> On Wed, 12 Jan 2022 at 20:38, Vladimir Ozerov 
>>> wrote:
>>>> 
>>>>> Hi Julian,
>>>>> 
>>>>> In my opinion, both ways work well. People tend to think differently.
>>>> Some
>>>>> prefer symptoms, others - the root cause. I personally prefer the
>>> latter
>>>>> for the following reason. If I face a problem, I first try to debug
>> it
>>> on
>>>>> my own. The result of the analysis is usually some questionable
>>> behavior
>>>> in
>>>>> a specific part of the code. Once you find the problematic place, you
>>> can
>>>>> run a search in JIRA or Git log (class name, feature name, etc) and
>>> check
>>>>> whether somebody else faced a similar issue. The description
>> "Incorrect
>>>>> plan ..." is less likely to help me than more concrete "In
>>>>> SubstitutionVisitor ...". Especially, given that a single root cause
>>> may
>>>>> manifest in several ways. But I would like to stress out - it is a
>>> matter
>>>>> of personal habits and previous experience, not something that I
>>>>> expect others to follow.
>>>>> 
>>>>> In the past, I worked on the Apache Ignite project. We had a number
>> of
>>>>> contribution rules, such as "put a comma here", "set the proper
>>> component
>>>>> there", "write the comment in that way", etc. I was the one who
>>> actively
>>>>> enforced this for a may years, because it gave the feeling that
>>>> everything
>>>>> is "put in order". Eventually, I came to the conclusion that this
>> does
>>>> more
>>>>> harm than good, because I regularly observed confusion and
>>>> dissatisfaction
>>>>> of the new contributors (and Apache Ignite community is far less
>>> diverse
>>>>> and active than in Apache Calcite), as they were forced to change
>> their
>>>>> natural way of thinking or past habits to engage with the community.
>>>>> 
>>>>> Regards,
>>>>> Vladimir.
>>>>> 
>>>>> ср, 12 янв. 2022 г. в 21:42, Julian Hyde :
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> Can we discuss how we write summaries for Jira cases? In my opinion
>>>> it’s
>>>>>> really important, because summaries become commit messages, and
>>> commit
>>>>>> messages become release notes, which is how most people figure out
>>> what
>>>>> is
>>>>>> in Calcite. I spend a lot of my time working with people to write
>>> good
>>>>>> summaries.
>>>>>> 
>>>>>> I’d like some feedback on whether this approach is useful. And to
>> try
>>>> to
>>>>>> teach people how to do it for themselves.
>>>>>> 
>>>>>> Consider this case
>>> https://issues.apache.org/jira/browse/CALCITE-4983
>>>> <
>>>>>> https://issues.apache.org/jira/browse/CALCITE-4983>. (I chose a
>>> still
>>>>>> current case because it doesn’t yet have an ‘answer’.)
>>>>>> 
>>>>>> The current summary is
>>>>>> 
>>>>>>> In SubstitutionVisitor's unifyAggregates, if Aggregate has
>>>>>>> grouping sets, we need to handle the condition needs to pull up.
>>>>>> 
>>>>>> It describes the cause but it doesn’t describe the problem (or the
>>>>>> symptoms the user sees).
>>>>>> 
>>>>>> If you take your car into your mechanic the cause is ‘Leaky gasket
>>>>> results
>>>>>> in oil dripping onto hot manifold’ but the problem is ‘Smoke comes
>>> from
>>>>>> hood when engine gets hot’. Do you agree that the second
>> description
>>> is
>>>>>> much more useful?
>>>>>> 
>>>>>> In this case, the author came up with an example:
>>>>>> 
>>>>>>> Here is an example:
>>>>>>> 
>>>>>>> sql: select empid, deptno from emps group by grouping sets
>> ((empid,
>>>>>> deptno),(empid))
>>>>>>> mv: select empid, count(distinct deptno) from emps where
>> empid>100
>>>>>>>  group by grouping sets ((empid, deptno), (empid))
>>>>>>> 
>>>>>>> the result plan is:
>>>>>>> 
>>>>>>>  LogicalCalc(expr#0..2=[{inputs}], deptno=[$t1], EXPR$1=[$t2])
>>>>>>>LogicalAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}]],
>>>>>> EXPR$1=[COUNT(DISTINCT $1)])
>>>>>>>  EnumerableTableScan(table=[[hr, MV0]])
>>>>>>> 
>>>>>>> We can see that this plan doesn't handle the condition empid>100
>>>>>> 
>>>>>> I think it’s a great example. I especially like the last line,
>> where
>>>> the
>>>>>> author pointed out what was wrong. I suggest the following summary:
>>>>>> 
>>>>>>> Incorrect plan for query that has GROUPING SETS and WHERE
>>>>>> 
>>>>>> Do you think the summary is more useful? Can it be improved?
>>>>>> 
>>>>>> Julian
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> --
>>>> This message contains confidential information and is intended only for
>>>> the
>>>> individuals named. If you are not the named addressee you should not
>>>> disseminate, distribute or copy this e-mail. Please notify the sender
>>>> immediately by e-mail if you have received this e-mail by mistake and
>>>> delete this e-mail from your system. E-mail transmission cannot be
>>>> guaranteed to be secure or error-free as information could be
>>> intercepted,
>>>> corrupted, lost, destroyed, arrive late or incomplete, or contain
>>> viruses.
>>>> The sender therefore does not accept liability for any errors or
>>> omissions
>>>> in the contents of this message, which arise as a result of e-mail
>>>> transmission. If verification is required, please request a hard-copy
>>>> version. -Hazelcast
>>>> 
>>> 
>> 



[jira] [Created] (CALCITE-4986) Make HepProgram thread-safe

2022-01-12 Thread Julian Hyde (Jira)
Julian Hyde created CALCITE-4986:


 Summary: Make HepProgram thread-safe
 Key: CALCITE-4986
 URL: https://issues.apache.org/jira/browse/CALCITE-4986
 Project: Calcite
  Issue Type: Bug
Reporter: Julian Hyde


{{HepProgram}} uses mutable fields for its working state and is therefore not 
thread-safe. If two threads are planning queries simultaneously and are using 
the same {{HepProgram}} instances they might conflict with each other. I 
suspect that this happens several times in Calcite's standard set of rules.

This change would move the mutable state out of {{HepProgram}}, so that 
{{HepProgram}} is fully immutable and therefore thread-safe.

It would also modernize some of the internals; for instance, {{class 
HepInstruction.MatchLimit}} has a field {{int limit}} that could be made final 
and assigned in the constructor. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[DISCUSS] Writing good summaries for Jira cases

2022-01-12 Thread Julian Hyde
Hi all,

Can we discuss how we write summaries for Jira cases? In my opinion it’s really 
important, because summaries become commit messages, and commit messages become 
release notes, which is how most people figure out what is in Calcite. I spend 
a lot of my time working with people to write good summaries.

I’d like some feedback on whether this approach is useful. And to try to teach 
people how to do it for themselves.

Consider this case https://issues.apache.org/jira/browse/CALCITE-4983 
. (I chose a still current 
case because it doesn’t yet have an ‘answer’.)

The current summary is

>  In SubstitutionVisitor's unifyAggregates, if Aggregate has
>  grouping sets, we need to handle the condition needs to pull up.

It describes the cause but it doesn’t describe the problem (or the symptoms the 
user sees).

If you take your car into your mechanic the cause is ‘Leaky gasket results in 
oil dripping onto hot manifold’ but the problem is ‘Smoke comes from hood when 
engine gets hot’. Do you agree that the second description is much more useful?

In this case, the author came up with an example:

> Here is an example:
>
> sql: select empid, deptno from emps group by grouping sets ((empid, 
> deptno),(empid))
> mv: select empid, count(distinct deptno) from emps where empid>100
>   group by grouping sets ((empid, deptno), (empid)) 
>
> the result plan is:
>
>   LogicalCalc(expr#0..2=[{inputs}], deptno=[$t1], EXPR$1=[$t2])
> LogicalAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}]], 
> EXPR$1=[COUNT(DISTINCT $1)])
>   EnumerableTableScan(table=[[hr, MV0]])  
>
> We can see that this plan doesn't handle the condition empid>100

I think it’s a great example. I especially like the last line, where the author 
pointed out what was wrong. I suggest the following summary:

> Incorrect plan for query that has GROUPING SETS and WHERE

Do you think the summary is more useful? Can it be improved?

Julian





Re: [DISCUSS] Refactoring tests

2022-01-12 Thread Julian Hyde
After a month or so of work I have a PR for review. See 
https://github.com/apache/calcite/pull/2685/ 
.

The PR is huge (80 commits, 127 files changed, 20k lines added/removed) so may 
be hard to digest. I recommend that you read the commit log to understand the 
various refactorings. Note that SqlOperatorBaseTest both moved (from main to 
testkit) and changed name (to SqlOperatorTest) so diffing it might require some 
ingenuity.

At this point I am especially interested in high-level comments (Does the 
test/fixture/tester/factory split work? Are there any other candidates for this 
refactoring?). If you have a lot of low-level comments (e.g. spelling mistakes) 
just send me a PR.

It’s probably several days before I will squash, rebase and consider merging to 
main.

Julian
 

> On Nov 22, 2021, at 4:12 AM, Stamatis Zampetakis  wrote:
> 
> Hello,
> 
> I generally agree with the problems and goals set by CALCITE-4885 so
> definitely worth pushing this forward.
> 
> I also like the ideas for being compatible with JUnit5 extensions etc., but
> I guess it could be something to address later one and does not need to be
> part of CALCITE-4885.
> If and when somebody comes with a concrete proposal we can evaluate this.
> 
> Same reasoning goes for AssertJ; it may be beneficial for the project but
> let's evaluate it separately.
> 
> If I understand well, both AssertJ and extensions via JUnit5 may come in
> conflict with the new APIs exposed by CALCITE-4885 so it would be nice if
> we can advance all these in the same release.
> 
> Last I wanted to mention that although people consuming Calcite releases
> may not care much about big changes in the testing code, those who maintain
> forks of the main repo will have a few more challenges to address when they
> cherry-pick changes.
> This is not an argument to push back the feature but just an additional
> element for completing the picture.
> 
> Best,
> Stamatis
> 
> On Wed, Nov 17, 2021 at 7:15 AM Vladimir Sitnikov <
> sitnikov.vladi...@gmail.com> wrote:
> 
>> Jacques>This sounds like it will mean we will need to make calcite-core
>> test artifacts available
>> 
>> Test artifacts publication is yet another anti-pattern just like "base test
>> class".
>> This change has been discussed:
>> https://lists.apache.org/thread/fz96p94h016p11g777otqntjxg2oxgh1
>> 
>> If you want to depend on a class from tests, consider moving it to /testkit
>> module:
>> 
>> https://github.com/apache/calcite/tree/0899e6c157632ba1c5369a942cfe2be15fb4ed9f/testkit
>> 
>> Jacques>We should think about the rules around Kotlin
>> 
>> What happens in calcite-core/tests stays in calcite-core/tests :)
>> 
>> It is reasonable to assume that testkit module would have dependencies,
>> and testkit would provide API that is usable from Java and other JVM
>> languages.
>> 
>> In that regard, Kotlin dependency in testkit is not much different from
>> Quidem or commons-lang3.
>> Consumers might use Quidem if it fits just like they could use Kotlin if it
>> fits.
>> 
>> Vladimir
>> 



Re: [DISCUSS] Apache Calcite Online Meetup January 2022

2022-01-12 Thread Julian Hyde
I’d love to hear about your GraphQL implementation. (If the organizers think 
there is enough time in the meetup.)

I don’t think of GraphQL as “niche” at all. It’s a key tool for many, many app 
developers. It’s just that, despite the “QL” in the name, it’s not a 
conventional “query language”.

Julian

> On Jan 12, 2022, at 07:37, Gavin Ray  wrote:
> 
> If anyone is interested, I can spend 5-10 minutes briefly presenting the
> work I've been doing on automatic GraphQL API generation from Calcite data
> sources.
> GraphQL is still somewhat niche though, so I'll hold off on asking to speak
> about it unless there are folks attending who are interested/using it.
> 
> 
> 
>> On Tue, Jan 11, 2022 at 4:13 PM Ioan Eugen Stan  wrote:
>> 
>> Hello Stamatis,
>> 
>> I would also like to share my work.
>> 
>> Title: calcite-clj - First steps using Calcite with Clojure
>> 
>> Abstract: A quick introduction on using Calcite with Clojure language.
>> I'll go through an simple example on how to use in memory table.
>> The example will cover REPL driven development.
>> If I have time I will showcase Apache OFBiz entity engine on top of Apache
>> Calcite and how to query OFBIz xml files (OFBiz data exports) using the
>> same engine (ongoing work).
>> 
>> Duration: 15 minutes
>> 
>> 


Re: Different behavior bewteen '>' and '='

2022-01-10 Thread Julian Hyde
Yes, this is by design.I believe that the SQL standard set the rules.

It’s not that surprising that ‘=‘ has different behavior than ordering-based 
comparisons such as ‘>’. Consider: given a DATE value d, and a TIMESTAMP value 
t, it is reasonable to ask ‘is t > d?’ but less reasonable to ask ‘does t = d?'

> On Jan 10, 2022, at 6:35 PM, Zou Dan  wrote:
> 
> Hi community,
> I recently ran into a problem that when we disable type coercion by 
> SqlValidator#setEnableTypeCoercion(false),
> there will be two different behaviors between '>' and '=':
> 1. '>' between character and numeric (e.g. '1' > 1), the character will be 
> implicitly converted to numeric
> 2. '=' between character and numeric (e.g. '1' = 1), the character will `not` 
> be implicitly converted to numeric
> I find the reason is that the SqlOperandTypeChecker.Consistency for 
> SqlStdOperatorTable.GREATER_THAN is `COMPARE` while
> SqlStdOperatorTable.EQUALS is `LEAST_RESTRICTIVE`.
> Is this by design?



<    2   3   4   5   6   7   8   9   10   11   >