Re: Towards Avatica-Go 5.2.0

2022-10-07 Thread Julian Hyde
+1

Thanks Francis.

> On Oct 7, 2022, at 1:55 AM, Francis Chuang  wrote:
> 
> Hey everyone,
> 
> I think it's a good time to release Avatica-Go 5.2.0 as 5.1.0 was released 
> around 7 months ago. This should be a straight-forward release and I'll be 
> making rc0 available for voting early next week.
> 
> If you have any questions or would like to get a certain change in the 
> release, please let me know.
> 
> Francis



Re: Pushing filters with unsafe operations (CAST/DIVISION) below inner joins

2022-10-07 Thread Julian Hyde
Thanks for logging this. I agree that it’s not high priority. I added some 
commentary about related bugs.

> On Oct 7, 2022, at 9:07 AM, Stamatis Zampetakis  wrote:
> 
> I just logged https://issues.apache.org/jira/browse/CALCITE-5315 but I am
> not planning to work on this sometime soon.
> 
> Best,
> Stamatis
> 
> On Fri, Feb 18, 2022 at 10:56 PM Stamatis Zampetakis 
> wrote:
> 
>> Thank you all for the feedback.
>> 
>> I will do a small research about the proper term to use for functions that
>> might throw / never throw and log a JIRA to move the discussion further.
>> 
>> @Viliam : Note that the problem you mention, although similar, is not
>> exactly the same as the one I brought up now.
>> 
>> The reordering of predicates in the WHERE clause has been brought up quite
>> a few times in the dev list and our stance [1] is that since the standard
>> leaves this decision to the implementor people should not rely on this (or
>> put the appropriate guards).
>> 
>> The case here is a bit different, at least the way I read the standard,
>> cause it defines the following:
>> 
>> "If all optional clauses are omitted, then the result of the > expression> is the same as the result of the
>> . Otherwise, each specified clause is applied to the result
>> of the previously specified clause
>> and the result of the  is the result of the application
>> of the last specified clause."
>> 
>> and one of the optional clauses mentioned in the previous paragraph is the
>> . There seems to be a clearly defined order between the > clause>, which includes inner joins, and the .
>> 
>> Best,
>> Stamatis
>> 
>> [1] https://lists.apache.org/thread/mq44cnrohz19hh10btms126vbcoxl50w
>> 
>> On Fri, Feb 18, 2022 at 9:58 AM Viliam Durina 
>> wrote:
>> 
>>> I have observed this issue years ago in well-known databases. My case was
>>> much simpler:
>>> 
>>> data
>>> recordType:int  value:text
>>> -   --
>>> 0   1
>>> 1   a
>>> 
>>> SELECT *
>>> FROM data
>>> WHERE recordType='1' AND CAST(value AS INT)<10
>>> 
>>> 
>>> SQL is declarative, and unlike procedural languages, it doesn't prescribe
>>> short-circuit evaluation of the WHERE clause, or any specific evaluation
>>> order. If it was prescribed, the query would be perfectly safe. But
>>> prescribing the evaluation order would rule out many optimizations, or
>>> make
>>> them much harder, such as this half-null-half-error value.
>>> 
>>> For example, reordering additions might or might not lead to overflow:
>>> TINYINTs `100 + 100 - 90`, evaluated in this order, overflow, but `100 -
>>> 90
>>> + 100` don't - imagine each value comes from a different table and we
>>> reorder the joins. Perhaps result of TINYINT addition can be SMALLINT, but
>>> what if they are BIGINTs?
>>> 
>>> My understanding was that any expression must not fail in any plausible
>>> execution plan. Therefore if I parse the column `value`, it must succeed
>>> for every possible scanned row. In my specific case I ended up
>>> implementing
>>> a custom TO_NUMBER function that returns null on parsing error, and that
>>> null never appeared in the result because of the other condition.
>>> 
>>> Viliam
>>> 
>>> On Thu, 17 Feb 2022 at 20:08, Julian Hyde  wrote:
>>> 
 You should definitely log a bug, even if there are no plans to fix it.
>>> It
 is a violation of the standard, and therefore it is a bug, and
>>> therefore we
 should document that it exists.
 
 Can you do some research on the right terminology. You use the term
 ‘unsafe’. I think the PL community uses the term ’strict’ for
>>> expressions
 that must be evaluated (and therefore if they throw, the query is sure
>>> to
 abort). We have other related concepts floating around in Calcite -
>>> dynamic
 functions, deterministic operators - and standardizing terminology is
>>> key
 if we are to keep the menagerie in order.
 
 There might be a way to have our cake and eat it too. We could push down
 the division, and if there is a division-by-zero we return a special
>>> value
 such as NULL. If the special value is not eliminated by the join then
>>> it is
 promoted to a throw. The “cake” here is the performance benefit of
>>> pushing
 down a filter (that may never or rarely throw); the “eating it” is
>>> safety
 on the occasion that the filter does throw.
 
 Even if that strategy doesn’t work, maybe we could have a flag that says
 whether to push down conditions that might throw.
 
 Julian
 
 
> On Feb 17, 2022, at 8:07 AM, Scott Reynolds 
 wrote:
> 
> Is it feasible to prevent the filter push down in cases like this (
 detect
> potential division by zero) ? What other exceptions that could happen?
> 
> Should it only push down some filters and preserve the complex filter
 above
> the join?
> 
> Regarding a Jira ticket, generating an exception when it should
>>> produce
> valid 

JDBC and Hint Strategies

2022-10-07 Thread 42-0x2a
Hello

I'm developing an adapter for my custom data source. My adapter uses hints to 
tailor query calls to my underlying data source. Did I understand it correctly 
that hints have to be defined "outside" of an adapter, i.e., at the location 
where the connection is defined?

I'm currently wrapping a JDBC connection with a hook for hint strategies. Is 
that the correct way to do that? It looks quite odd and wrong, especially since 
hooks should only be used for debugging and testing according to the 
documentation. I would rather expect hint strategies to be defined in my 
adapter directly (or the model.json or as a property along with the connection).

Thanks for your help!

try (Hook.Closeable closeable = 
Hook.SQL2REL_CONVERTER_CONFIG_BUILDER.addThread((Holder
 configHolder) -> {
HintStrategyTable strategies = HintStrategyTable.builder()
.hintStrategy("index", (hint, rel) -> true)
.build();
configHolder.accept(config -> config.withHintStrategyTable(strategies));
})) {
String sql = """
SELECT *
FROM test /*+ index(A) */""";

final Properties info = new Properties();
info.put("model", 
requireNonNull(getClass().getResource("/model.json")).getPath());

final Connection connection = DriverManager.getConnection("jdbc:calcite:", 
info);

final Statement statement = connection.createStatement();
final ResultSet rs = statement.executeQuery(sql);

...
}



Re: Pushing filters with unsafe operations (CAST/DIVISION) below inner joins

2022-10-07 Thread Stamatis Zampetakis
I just logged https://issues.apache.org/jira/browse/CALCITE-5315 but I am
not planning to work on this sometime soon.

Best,
Stamatis

On Fri, Feb 18, 2022 at 10:56 PM Stamatis Zampetakis 
wrote:

> Thank you all for the feedback.
>
> I will do a small research about the proper term to use for functions that
> might throw / never throw and log a JIRA to move the discussion further.
>
> @Viliam : Note that the problem you mention, although similar, is not
> exactly the same as the one I brought up now.
>
> The reordering of predicates in the WHERE clause has been brought up quite
> a few times in the dev list and our stance [1] is that since the standard
> leaves this decision to the implementor people should not rely on this (or
> put the appropriate guards).
>
> The case here is a bit different, at least the way I read the standard,
> cause it defines the following:
>
> "If all optional clauses are omitted, then the result of the  expression> is the same as the result of the
> . Otherwise, each specified clause is applied to the result
> of the previously specified clause
> and the result of the  is the result of the application
> of the last specified clause."
>
> and one of the optional clauses mentioned in the previous paragraph is the
> . There seems to be a clearly defined order between the  clause>, which includes inner joins, and the .
>
> Best,
> Stamatis
>
> [1] https://lists.apache.org/thread/mq44cnrohz19hh10btms126vbcoxl50w
>
> On Fri, Feb 18, 2022 at 9:58 AM Viliam Durina 
> wrote:
>
>> I have observed this issue years ago in well-known databases. My case was
>> much simpler:
>>
>> data
>> recordType:int  value:text
>> -   --
>> 0   1
>> 1   a
>>
>> SELECT *
>> FROM data
>> WHERE recordType='1' AND CAST(value AS INT)<10
>>
>>
>> SQL is declarative, and unlike procedural languages, it doesn't prescribe
>> short-circuit evaluation of the WHERE clause, or any specific evaluation
>> order. If it was prescribed, the query would be perfectly safe. But
>> prescribing the evaluation order would rule out many optimizations, or
>> make
>> them much harder, such as this half-null-half-error value.
>>
>> For example, reordering additions might or might not lead to overflow:
>> TINYINTs `100 + 100 - 90`, evaluated in this order, overflow, but `100 -
>> 90
>> + 100` don't - imagine each value comes from a different table and we
>> reorder the joins. Perhaps result of TINYINT addition can be SMALLINT, but
>> what if they are BIGINTs?
>>
>> My understanding was that any expression must not fail in any plausible
>> execution plan. Therefore if I parse the column `value`, it must succeed
>> for every possible scanned row. In my specific case I ended up
>> implementing
>> a custom TO_NUMBER function that returns null on parsing error, and that
>> null never appeared in the result because of the other condition.
>>
>> Viliam
>>
>> On Thu, 17 Feb 2022 at 20:08, Julian Hyde  wrote:
>>
>> > You should definitely log a bug, even if there are no plans to fix it.
>> It
>> > is a violation of the standard, and therefore it is a bug, and
>> therefore we
>> > should document that it exists.
>> >
>> > Can you do some research on the right terminology. You use the term
>> > ‘unsafe’. I think the PL community uses the term ’strict’ for
>> expressions
>> > that must be evaluated (and therefore if they throw, the query is sure
>> to
>> > abort). We have other related concepts floating around in Calcite -
>> dynamic
>> > functions, deterministic operators - and standardizing terminology is
>> key
>> > if we are to keep the menagerie in order.
>> >
>> > There might be a way to have our cake and eat it too. We could push down
>> > the division, and if there is a division-by-zero we return a special
>> value
>> > such as NULL. If the special value is not eliminated by the join then
>> it is
>> > promoted to a throw. The “cake” here is the performance benefit of
>> pushing
>> > down a filter (that may never or rarely throw); the “eating it” is
>> safety
>> > on the occasion that the filter does throw.
>> >
>> > Even if that strategy doesn’t work, maybe we could have a flag that says
>> > whether to push down conditions that might throw.
>> >
>> > Julian
>> >
>> >
>> > > On Feb 17, 2022, at 8:07 AM, Scott Reynolds 
>> > wrote:
>> > >
>> > > Is it feasible to prevent the filter push down in cases like this (
>> > detect
>> > > potential division by zero) ? What other exceptions that could happen?
>> > >
>> > > Should it only push down some filters and preserve the complex filter
>> > above
>> > > the join?
>> > >
>> > > Regarding a Jira ticket, generating an exception when it should
>> produce
>> > > valid results is gotta be a Jira case IMHO.
>> > >
>> > >
>> > >
>> > >
>> > > On Thu, Feb 17, 2022, 7:17 AM Stamatis Zampetakis 
>> > wrote:
>> > >
>> > >> Hi all,
>> > >>
>> > >> Till today, I believed that pushing a filter below an inner join is
>> not
>> > >> strictly related to the operators 

[jira] [Created] (CALCITE-5315) Error when pushing filters with operations that may throw (CAST/DIVISION) below joins

2022-10-07 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created CALCITE-5315:


 Summary: Error when pushing filters with operations that may throw 
(CAST/DIVISION) below joins
 Key: CALCITE-5315
 URL: https://issues.apache.org/jira/browse/CALCITE-5315
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.32.0
Reporter: Stamatis Zampetakis


+Steps to reproduce:+

{code:sql}
CREATE TABLE emp (empno INT, name VARCHAR, deptno INT);
INSERT INTO emp VALUES (0, 'Alex', 0);
INSERT INTO emp VALUES (10, 'Bob', 1);

CREATE TABLE dept (deptno INT);
INSERT INTO dept VALUES (1);

SELECT e.name
FROM emp e
INNER JOIN dept d ON e.deptno = d.deptno
WHERE (10 / e.empno) = 1
{code}

*Expected output:*
Bob

*Actual output:*
ERROR:  division by zero

The error is caused when the filter condition in the WHERE clause is evaluated 
before the join. Filter push-down is a very common and powerful
optimization but when there are operators in the WHERE clause that may throw
(such as division, cast, etc) this optimization is unsafe and can lead to 
runtime errors.

The SQL standard (Section 7.4 general rule 1) mandates that WHERE should be 
applied to the result of FROM so in the case above pushing filters below the 
join seems to violate the standard.

+Citing the standard:+
"If all optional clauses are omitted, then the result of the  is the same as the result of the
. Otherwise, each specified clause is applied to the result of
the previously specified clause
and the result of the  is the result of the application of
the last specified clause."

One of the optional clauses mentioned in the previous paragraph is the
. There seems to be a clearly defined order between the , which includes inner joins, and the .

Note that this problem is *not* the same as the evaluation order of predicates 
in the WHERE clause, which is implementation specific. This is about evaluation 
order of WHERE clause and FROM clause that is not implementation specific.

Original discussion: 
https://lists.apache.org/thread/cp7h28k1yfxv421q12y1wopbwgrzdzrx



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5314) Prune empty parts of a query by exploiting stats/metadata

2022-10-07 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created CALCITE-5314:


 Summary: Prune empty parts of a query by exploiting stats/metadata
 Key: CALCITE-5314
 URL: https://issues.apache.org/jira/browse/CALCITE-5314
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Stamatis Zampetakis


Currently there is a collection of rules (i.e., 
[PruneEmptyRules|https://github.com/apache/calcite/blob/4ef9ffe0d4afb80fe95e66d4d9c9e6f5939f70a8/core/src/main/java/org/apache/calcite/rel/rules/PruneEmptyRules.java])
 which remove sections of a query plan when it does not produce any rows.

At the moment, the removal requires an empty {{Values}} operator to be in the 
plan in order to take effect. However, there are cases where queries involve 
empty relations/tables and in that case we could remove parts of the plan as 
well.

The information if a relation is empty or not can be derived from metadata. For 
example, we could check if the  
[RelMdMaxRowCount|https://github.com/apache/calcite/blob/4ef9ffe0d4afb80fe95e66d4d9c9e6f5939f70a8/core/src/main/java/org/apache/calcite/rel/metadata/RelMdMaxRowCount.java]
 returns zero to derive that the relation is empty.

Implementation wise there are various alternatives to consider:
 * modify the existing PruneEmptyRule to consider metadata;
 * add new pruning rules relying exclusively on metadata;
 * add new rules checking metadata and turning the empty relation to an empty 
{{Values}} which can then be handled by existing rules




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] restrict JDK test matrix

2022-10-07 Thread Alessandro Solimando
Hello everyone,
I have prepared a PR, if anyone feels like taking a look at it, it's here:
https://github.com/apache/calcite/pull/2928

If there are no feedback/objections I will merge it early next week.

Thanks again for sharing your thoughts around this!

Best regards,
Alessandro

On Tue, 4 Oct 2022 at 12:04, Alessandro Solimando <
alessandro.solima...@gmail.com> wrote:

> Julian, Stamatis,
> thanks for your input!
>
> Since it seems that there is consensus around the topic, I have logged
> CALCITE-5306 .
>
> As soon as I have a PR ready, I will reply to this thread too, in order to
> collect opinions and feedback.
>
> Best regards,
> Alessandro
>
> On Tue, 4 Oct 2022 at 11:12, Stamatis Zampetakis 
> wrote:
>
>> Hello,
>>
>> It is not recommended to use EOL software so dropping those JDK versions
>> from the test matrix makes sense. Anyways it wouldn't be surprising if
>> Jenkins, Travis, Github Actions, etc., remove those EOL versions as well
>> at
>> some point.
>>
>> Best,
>> Stamatis
>>
>> On Mon, Oct 3, 2022 at 3:15 PM Julian Hyde 
>> wrote:
>>
>> > It makes sense to only test on 8, 11, 17 and the latest. Testing on
>> other
>> > versions is going to waste time checking on false negatives. I don’t
>> > remember whether there’s ever been an issue on, say, 15, that wasn’t
>> also
>> > present in 11 or 17.
>> >
>> > Maybe it’s a distinction without a difference, but I think we should
>> still
>> > support the full range of JDK versions.  If I submit a change that
>> breaks
>> > the build on JDK 13, you should tell me and I should fix it. I don’t use
>> > sdkman and can create a JDK 13 environment easily enough from the JDK’s
>> > binary tarball.
>> >
>> > Julian
>> >
>> > > On Oct 3, 2022, at 5:38 AM, Alessandro Solimando <
>> > alessandro.solima...@gmail.com> wrote:
>> > >
>> > > Hello everyone,
>> > > I was checking a build failure
>> > > 
>> > related to
>> > > JDK15 and I wanted to try it locally, however I can't do it via sdkman
>> > >  (a "multi-platform software manager") as JDK is
>> not
>> > > anymore available. This is not the first time, and it makes review
>> tasks
>> > > complicated sometimes (in this specific case it seems an ENV issue,
>> but
>> > > that's not the point here).
>> > >
>> > > I wanted to discuss with you if we really want to keep those "recent
>> but
>> > > EOL" versions or not in our test matrix.
>> > >
>> > > I know that JDK8 is EOL too, but lots of projects are still based on
>> it
>> > and
>> > > it's sadly running in PROD in many places for the same reason. In my
>> > (maybe
>> > > limited) experience, those who upgraded to newer versions (> 11),
>> aren't
>> > > likely to get stuck at, say, 15 and can't move to 17. Is my assumption
>> > > correct in your experience?
>> > >
>> > > In my sdkman on MacOS I only see JDK 8, 11, 17, 20, 21, 22, and I
>> > strongly
>> > > suspect they are following some criteria based on LTS/EOL versions.
>> > >
>> > > Shall we try to do something similar for Calcite and remove
>> non-LTS+EOL
>> > > versions higher than 11?
>> > >
>> > > Best regards,
>> > > Alessandro
>> >
>>
>


[jira] [Created] (CALCITE-5313) Release Avatica-Go 5.2.0

2022-10-07 Thread Francis Chuang (Jira)
Francis Chuang created CALCITE-5313:
---

 Summary: Release Avatica-Go 5.2.0
 Key: CALCITE-5313
 URL: https://issues.apache.org/jira/browse/CALCITE-5313
 Project: Calcite
  Issue Type: Task
  Components: avatica-go
Affects Versions: avatica-go-5.2.0
Reporter: Francis Chuang
Assignee: Francis Chuang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Towards Avatica-Go 5.2.0

2022-10-07 Thread Francis Chuang

Hey everyone,

I think it's a good time to release Avatica-Go 5.2.0 as 5.1.0 was 
released around 7 months ago. This should be a straight-forward release 
and I'll be making rc0 available for voting early next week.


If you have any questions or would like to get a certain change in the 
release, please let me know.


Francis


[jira] [Created] (CALCITE-5312) Replace http digest auth library

2022-10-07 Thread Francis Chuang (Jira)
Francis Chuang created CALCITE-5312:
---

 Summary: Replace http digest auth library
 Key: CALCITE-5312
 URL: https://issues.apache.org/jira/browse/CALCITE-5312
 Project: Calcite
  Issue Type: New Feature
  Components: avatica-go
Reporter: Francis Chuang
Assignee: Francis Chuang
 Fix For: avatica-go-5.2.0


We currently use github.com/xinsnake/go-http-digest-auth-client, but the 
library has been archived and is no longer maintained. github.com/icholy/digest 
is a good replacement.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)