[jira] [Created] (CALCITE-6280) The Jetty's version number leak occurred while using the query sever
Vaibhav Joshi created CALCITE-6280: -- Summary: The Jetty's version number leak occurred while using the query sever Key: CALCITE-6280 URL: https://issues.apache.org/jira/browse/CALCITE-6280 Project: Calcite Issue Type: Bug Components: avatica Reporter: Vaibhav Joshi Assignee: Vaibhav Joshi Unauthorised access to HTTP server using curl returns the Jerry server version. See sample response below {code:java} Error 401 Unauthorized HTTP ERROR 401 Unauthorized URI:/ STATUS:401 MESSAGE:Unauthorized SERVLET:- https://eclipse.org/jetty;>Powered by Jetty:// 9.4.44.v20210927 {code} For security reason, it's not advisable to return server version in the response. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [DISCUSS] Ensuring that Calcite is consistent with other SQL systems
I think it is difficult and impossible to make calcite function results completely consistent with other SQL systems. In practice, when calculations are pushed down to different data source adapters for execution, there is no guarantee that the final results will be completely consistent in calcite. I think, for the built-in functions defined in `SqlStdOperatorTable`, we can choose to follow one of the definitive SQL systems, such as Oracle. For the engine-related extension functions in `SqlLibraryOperators`, we can follow the behavior related to the specific engine. If an already defined function behaves inconsistently with the new engine, we can add a new Function to the definition, such as the `CONCAT` function. At the same time, I agree it is a great suggestion to add execution tests of different engines in Quidem, which is very effective in verifying the engine execution behavior. Cancai Cai 于2024年2月26日周一 09:59写道: > Thank you very much to the calcite community for raising these questions. > This is what I have been doubting. I am very sorry that this doubt has been > discussed for so long. > > Maybe we also need to consider another issue, that is, the database version > issue. Versions like mysql and postgres are very stable, but components > like spark still seem to have function bugs. exists, then how should we > consider version issues? > > I don't know what I can do. Recently I am sorting out some documents about > the use of some functions of mysql and postgres in calcite. I don't know if > this is helpful. > > Best wishes, > Cancai Cai >
Re: [DISCUSS] Ensuring that Calcite is consistent with other SQL systems
Thank you very much to the calcite community for raising these questions. This is what I have been doubting. I am very sorry that this doubt has been discussed for so long. Maybe we also need to consider another issue, that is, the database version issue. Versions like mysql and postgres are very stable, but components like spark still seem to have function bugs. exists, then how should we consider version issues? I don't know what I can do. Recently I am sorting out some documents about the use of some functions of mysql and postgres in calcite. I don't know if this is helpful. Best wishes, Cancai Cai
Re: [DISCUSS] Ensuring that Calcite is consistent with other SQL systems
SQL logic test has conditionals. So we could encode each database response (including the one from Calcite) On Sun, Feb 25, 2024, 3:33 PM Julian Hyde wrote: > Does SQL Logic Test have an opinion about what SQRT(-1) should return? > If Calcite is pretending to be Oracle, it should return NaN. If it is > pretending to be MySQL it should return NULL. AFAICT, SQL Logic Test > would regard (at most) one of those responses as correct. If it even > covers SQRT. > > On Sun, Feb 25, 2024 at 12:25 PM Guillaume Masse > wrote: > > > > Write test with SQL logic test > > > > https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki > > > > Then run those tests against each database. (This technique is called > > oracle testing) > > > > > > On Sun, Feb 25, 2024, 3:13 PM Julian Hyde wrote: > > > > > TL;DR: We need to add software engineering processes to ensure that > > > Calcite is consistent with other SQL systems. What should those be? > > > > > > There has been a lot of activity recently adding functions such as > > > LOG2 [1] and ensuring that existing functions such as LOG and SQRT [2] > > > are consistent with other implementations. Another case was about > > > whether Calcite should allow casting a VARCHAR to a VARBINARY [3]. > > > > > > These discussions have been time-consuming. For example, the LOG2 PR > > > received almost a hundred comments, and that is a function that should > > > be simple to specify and implement. If a DB did not have LOG2, instead > > > of LOG2(n) I would just write LOG(n) / LOG(2) and it would work. > > > > > > In one of the discussions a statement was made that "SQL rules say > > > that SQRT should return NULL for negative values". I checked Oracle -- > > > the definitive SQL implementation for many years -- and its doc says > > > that it returns NaN. > > > > > > In another incident[4], the TO_CHAR function was removed from the > > > MySQL library and then added back when someone pointed out that it is > > > present in MariaDB, and our MySQL library covers MariaDB. > > > > > > We just can't keep doing this. We need to quickly and efficiently > > > arrive at the correct specification for when Calcite is attempting to > > > emulate other SQL systems, and then we need to stay with that > > > specification. I propose that we use software engineering. > > > > > > One idea is to create a Quidem script that can be run against MySQL, > > > and passes, and when run against Calcite-pretending-to-be-MySQL, it > > > also passes. Anyone adding to that script must test it against a MySQL > > > instance before merging to main. And similarly for other SQL dialects. > > > > > > As we encounter underspecified areas - e.g. the behavior of SQRT when > > > applied to -0 - we can resolve them by adding to that script. > > > > > > Does anyone have any ideas for how we can solve this using automated > > > tools, backed by changes to our process, rather than by endless > > > discussions? > > > > > > (By the way, this discussion relates to libraries (the "fun" > > > parameter), conformance and our parser settings (the "lex" parameter, > > > quoted and unquoted casing, etc.). We could have a similar discussion > > > about dialects - the SQL generated by the JDBC adapter - but please > > > let's have that discussion in another thread.) > > > > > > Julian > > > > > > [1] https://issues.apache.org/jira/browse/CALCITE-6224 > > > [2] https://issues.apache.org/jira/browse/CALCITE-5638 > > > [3] https://issues.apache.org/jira/browse/CALCITE-6210 > > > [4] https://issues.apache.org/jira/browse/CALCITE-6222 > > > >
Re: [DISCUSS] Ensuring that Calcite is consistent with other SQL systems
I think leveraging testContainers ( https://java.testcontainers.org/quickstart/junit_4_quickstart/) could automate the process effectively. TestContainers provides modules for different databases such as MySQL, Oracle ( https://github.com/testcontainers/testcontainers-java/tree/main/modules), among others. I haven't fully explored its capabilities, so I am not sure about the creation time for each container etc. Nonetheless, I believe it might be a potential candidate for thorough testing and comparison across diverse databases. On Sun, Feb 25, 2024 at 12:33 PM Julian Hyde wrote: > Does SQL Logic Test have an opinion about what SQRT(-1) should return? > If Calcite is pretending to be Oracle, it should return NaN. If it is > pretending to be MySQL it should return NULL. AFAICT, SQL Logic Test > would regard (at most) one of those responses as correct. If it even > covers SQRT. > > On Sun, Feb 25, 2024 at 12:25 PM Guillaume Masse > wrote: > > > > Write test with SQL logic test > > > > https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki > > > > Then run those tests against each database. (This technique is called > > oracle testing) > > > > > > On Sun, Feb 25, 2024, 3:13 PM Julian Hyde wrote: > > > > > TL;DR: We need to add software engineering processes to ensure that > > > Calcite is consistent with other SQL systems. What should those be? > > > > > > There has been a lot of activity recently adding functions such as > > > LOG2 [1] and ensuring that existing functions such as LOG and SQRT [2] > > > are consistent with other implementations. Another case was about > > > whether Calcite should allow casting a VARCHAR to a VARBINARY [3]. > > > > > > These discussions have been time-consuming. For example, the LOG2 PR > > > received almost a hundred comments, and that is a function that should > > > be simple to specify and implement. If a DB did not have LOG2, instead > > > of LOG2(n) I would just write LOG(n) / LOG(2) and it would work. > > > > > > In one of the discussions a statement was made that "SQL rules say > > > that SQRT should return NULL for negative values". I checked Oracle -- > > > the definitive SQL implementation for many years -- and its doc says > > > that it returns NaN. > > > > > > In another incident[4], the TO_CHAR function was removed from the > > > MySQL library and then added back when someone pointed out that it is > > > present in MariaDB, and our MySQL library covers MariaDB. > > > > > > We just can't keep doing this. We need to quickly and efficiently > > > arrive at the correct specification for when Calcite is attempting to > > > emulate other SQL systems, and then we need to stay with that > > > specification. I propose that we use software engineering. > > > > > > One idea is to create a Quidem script that can be run against MySQL, > > > and passes, and when run against Calcite-pretending-to-be-MySQL, it > > > also passes. Anyone adding to that script must test it against a MySQL > > > instance before merging to main. And similarly for other SQL dialects. > > > > > > As we encounter underspecified areas - e.g. the behavior of SQRT when > > > applied to -0 - we can resolve them by adding to that script. > > > > > > Does anyone have any ideas for how we can solve this using automated > > > tools, backed by changes to our process, rather than by endless > > > discussions? > > > > > > (By the way, this discussion relates to libraries (the "fun" > > > parameter), conformance and our parser settings (the "lex" parameter, > > > quoted and unquoted casing, etc.). We could have a similar discussion > > > about dialects - the SQL generated by the JDBC adapter - but please > > > let's have that discussion in another thread.) > > > > > > Julian > > > > > > [1] https://issues.apache.org/jira/browse/CALCITE-6224 > > > [2] https://issues.apache.org/jira/browse/CALCITE-5638 > > > [3] https://issues.apache.org/jira/browse/CALCITE-6210 > > > [4] https://issues.apache.org/jira/browse/CALCITE-6222 > > > >
Re: [DISCUSS] Ensuring that Calcite is consistent with other SQL systems
Does SQL Logic Test have an opinion about what SQRT(-1) should return? If Calcite is pretending to be Oracle, it should return NaN. If it is pretending to be MySQL it should return NULL. AFAICT, SQL Logic Test would regard (at most) one of those responses as correct. If it even covers SQRT. On Sun, Feb 25, 2024 at 12:25 PM Guillaume Masse wrote: > > Write test with SQL logic test > > https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki > > Then run those tests against each database. (This technique is called > oracle testing) > > > On Sun, Feb 25, 2024, 3:13 PM Julian Hyde wrote: > > > TL;DR: We need to add software engineering processes to ensure that > > Calcite is consistent with other SQL systems. What should those be? > > > > There has been a lot of activity recently adding functions such as > > LOG2 [1] and ensuring that existing functions such as LOG and SQRT [2] > > are consistent with other implementations. Another case was about > > whether Calcite should allow casting a VARCHAR to a VARBINARY [3]. > > > > These discussions have been time-consuming. For example, the LOG2 PR > > received almost a hundred comments, and that is a function that should > > be simple to specify and implement. If a DB did not have LOG2, instead > > of LOG2(n) I would just write LOG(n) / LOG(2) and it would work. > > > > In one of the discussions a statement was made that "SQL rules say > > that SQRT should return NULL for negative values". I checked Oracle -- > > the definitive SQL implementation for many years -- and its doc says > > that it returns NaN. > > > > In another incident[4], the TO_CHAR function was removed from the > > MySQL library and then added back when someone pointed out that it is > > present in MariaDB, and our MySQL library covers MariaDB. > > > > We just can't keep doing this. We need to quickly and efficiently > > arrive at the correct specification for when Calcite is attempting to > > emulate other SQL systems, and then we need to stay with that > > specification. I propose that we use software engineering. > > > > One idea is to create a Quidem script that can be run against MySQL, > > and passes, and when run against Calcite-pretending-to-be-MySQL, it > > also passes. Anyone adding to that script must test it against a MySQL > > instance before merging to main. And similarly for other SQL dialects. > > > > As we encounter underspecified areas - e.g. the behavior of SQRT when > > applied to -0 - we can resolve them by adding to that script. > > > > Does anyone have any ideas for how we can solve this using automated > > tools, backed by changes to our process, rather than by endless > > discussions? > > > > (By the way, this discussion relates to libraries (the "fun" > > parameter), conformance and our parser settings (the "lex" parameter, > > quoted and unquoted casing, etc.). We could have a similar discussion > > about dialects - the SQL generated by the JDBC adapter - but please > > let's have that discussion in another thread.) > > > > Julian > > > > [1] https://issues.apache.org/jira/browse/CALCITE-6224 > > [2] https://issues.apache.org/jira/browse/CALCITE-5638 > > [3] https://issues.apache.org/jira/browse/CALCITE-6210 > > [4] https://issues.apache.org/jira/browse/CALCITE-6222 > >
Re: [DISCUSS] Ensuring that Calcite is consistent with other SQL systems
Write test with SQL logic test https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki Then run those tests against each database. (This technique is called oracle testing) On Sun, Feb 25, 2024, 3:13 PM Julian Hyde wrote: > TL;DR: We need to add software engineering processes to ensure that > Calcite is consistent with other SQL systems. What should those be? > > There has been a lot of activity recently adding functions such as > LOG2 [1] and ensuring that existing functions such as LOG and SQRT [2] > are consistent with other implementations. Another case was about > whether Calcite should allow casting a VARCHAR to a VARBINARY [3]. > > These discussions have been time-consuming. For example, the LOG2 PR > received almost a hundred comments, and that is a function that should > be simple to specify and implement. If a DB did not have LOG2, instead > of LOG2(n) I would just write LOG(n) / LOG(2) and it would work. > > In one of the discussions a statement was made that "SQL rules say > that SQRT should return NULL for negative values". I checked Oracle -- > the definitive SQL implementation for many years -- and its doc says > that it returns NaN. > > In another incident[4], the TO_CHAR function was removed from the > MySQL library and then added back when someone pointed out that it is > present in MariaDB, and our MySQL library covers MariaDB. > > We just can't keep doing this. We need to quickly and efficiently > arrive at the correct specification for when Calcite is attempting to > emulate other SQL systems, and then we need to stay with that > specification. I propose that we use software engineering. > > One idea is to create a Quidem script that can be run against MySQL, > and passes, and when run against Calcite-pretending-to-be-MySQL, it > also passes. Anyone adding to that script must test it against a MySQL > instance before merging to main. And similarly for other SQL dialects. > > As we encounter underspecified areas - e.g. the behavior of SQRT when > applied to -0 - we can resolve them by adding to that script. > > Does anyone have any ideas for how we can solve this using automated > tools, backed by changes to our process, rather than by endless > discussions? > > (By the way, this discussion relates to libraries (the "fun" > parameter), conformance and our parser settings (the "lex" parameter, > quoted and unquoted casing, etc.). We could have a similar discussion > about dialects - the SQL generated by the JDBC adapter - but please > let's have that discussion in another thread.) > > Julian > > [1] https://issues.apache.org/jira/browse/CALCITE-6224 > [2] https://issues.apache.org/jira/browse/CALCITE-5638 > [3] https://issues.apache.org/jira/browse/CALCITE-6210 > [4] https://issues.apache.org/jira/browse/CALCITE-6222 >
[DISCUSS] Ensuring that Calcite is consistent with other SQL systems
TL;DR: We need to add software engineering processes to ensure that Calcite is consistent with other SQL systems. What should those be? There has been a lot of activity recently adding functions such as LOG2 [1] and ensuring that existing functions such as LOG and SQRT [2] are consistent with other implementations. Another case was about whether Calcite should allow casting a VARCHAR to a VARBINARY [3]. These discussions have been time-consuming. For example, the LOG2 PR received almost a hundred comments, and that is a function that should be simple to specify and implement. If a DB did not have LOG2, instead of LOG2(n) I would just write LOG(n) / LOG(2) and it would work. In one of the discussions a statement was made that "SQL rules say that SQRT should return NULL for negative values". I checked Oracle -- the definitive SQL implementation for many years -- and its doc says that it returns NaN. In another incident[4], the TO_CHAR function was removed from the MySQL library and then added back when someone pointed out that it is present in MariaDB, and our MySQL library covers MariaDB. We just can't keep doing this. We need to quickly and efficiently arrive at the correct specification for when Calcite is attempting to emulate other SQL systems, and then we need to stay with that specification. I propose that we use software engineering. One idea is to create a Quidem script that can be run against MySQL, and passes, and when run against Calcite-pretending-to-be-MySQL, it also passes. Anyone adding to that script must test it against a MySQL instance before merging to main. And similarly for other SQL dialects. As we encounter underspecified areas - e.g. the behavior of SQRT when applied to -0 - we can resolve them by adding to that script. Does anyone have any ideas for how we can solve this using automated tools, backed by changes to our process, rather than by endless discussions? (By the way, this discussion relates to libraries (the "fun" parameter), conformance and our parser settings (the "lex" parameter, quoted and unquoted casing, etc.). We could have a similar discussion about dialects - the SQL generated by the JDBC adapter - but please let's have that discussion in another thread.) Julian [1] https://issues.apache.org/jira/browse/CALCITE-6224 [2] https://issues.apache.org/jira/browse/CALCITE-5638 [3] https://issues.apache.org/jira/browse/CALCITE-6210 [4] https://issues.apache.org/jira/browse/CALCITE-6222
Re: Reviewing blog repository
so cool , it will be very useful for beginners and users. thank you very much Best regards, Cancai cai Zhengqiang Duan 于2024年2月25日周日 19:08写道: > @Jiajun, thank you very much for putting together such a great study > material. > > Best regards, > Zhengqiang > > Jiajun Xie 于2024年2月25日周日 17:04写道: > > > > Hello all, > > > > When I read blogs about Calcite, > > I always want to replicate other people's examples on my own. > > > > Constructing examples requires a lot of pre-work, > > which can be time-consuming for newcomers. > > > > So I built a repository that combines code and blogs. > > Everyone can read blogs and debug code directly without any additional > > preparation work. > > > > Now I have the first chapter: > > https://github.com/JiajunBernoulli/calcite-notes/tree/cbo > > Welcome to comment on this PR: > > https://github.com/JiajunBernoulli/calcite-notes/pull/1 >
Re: Reviewing blog repository
@Jiajun, thank you very much for putting together such a great study material. Best regards, Zhengqiang Jiajun Xie 于2024年2月25日周日 17:04写道: > > Hello all, > > When I read blogs about Calcite, > I always want to replicate other people's examples on my own. > > Constructing examples requires a lot of pre-work, > which can be time-consuming for newcomers. > > So I built a repository that combines code and blogs. > Everyone can read blogs and debug code directly without any additional > preparation work. > > Now I have the first chapter: > https://github.com/JiajunBernoulli/calcite-notes/tree/cbo > Welcome to comment on this PR: > https://github.com/JiajunBernoulli/calcite-notes/pull/1
Re: Reviewing blog repository
If this work can be completed, it will be very useful for beginners and users. Jiajun Xie 于2024年2月25日周日 17:04写道: > Hello all, > > When I read blogs about Calcite, > I always want to replicate other people's examples on my own. > > Constructing examples requires a lot of pre-work, > which can be time-consuming for newcomers. > > So I built a repository that combines code and blogs. > Everyone can read blogs and debug code directly without any additional > preparation work. > > Now I have the first chapter: > https://github.com/JiajunBernoulli/calcite-notes/tree/cbo > Welcome to comment on this PR: > https://github.com/JiajunBernoulli/calcite-notes/pull/1 >
Reviewing blog repository
Hello all, When I read blogs about Calcite, I always want to replicate other people's examples on my own. Constructing examples requires a lot of pre-work, which can be time-consuming for newcomers. So I built a repository that combines code and blogs. Everyone can read blogs and debug code directly without any additional preparation work. Now I have the first chapter: https://github.com/JiajunBernoulli/calcite-notes/tree/cbo Welcome to comment on this PR: https://github.com/JiajunBernoulli/calcite-notes/pull/1