I just logged https://issues.apache.org/jira/browse/CALCITE-5315 but I am
not planning to work on this sometime soon.

Best,
Stamatis

On Fri, Feb 18, 2022 at 10:56 PM Stamatis Zampetakis <zabe...@gmail.com>
wrote:

> Thank you all for the feedback.
>
> I will do a small research about the proper term to use for functions that
> might throw / never throw and log a JIRA to move the discussion further.
>
> @Viliam : Note that the problem you mention, although similar, is not
> exactly the same as the one I brought up now.
>
> The reordering of predicates in the WHERE clause has been brought up quite
> a few times in the dev list and our stance [1] is that since the standard
> leaves this decision to the implementor people should not rely on this (or
> put the appropriate guards).
>
> The case here is a bit different, at least the way I read the standard,
> cause it defines the following:
>
> "If all optional clauses are omitted, then the result of the <table
> expression> is the same as the result of the
> <from clause>. Otherwise, each specified clause is applied to the result
> of the previously specified clause
> and the result of the <table expression> is the result of the application
> of the last specified clause."
>
> and one of the optional clauses mentioned in the previous paragraph is the
> <where clause>. There seems to be a clearly defined order between the <from
> clause>, which includes inner joins, and the <where clause>.
>
> Best,
> Stamatis
>
> [1] https://lists.apache.org/thread/mq44cnrohz19hh10btms126vbcoxl50w
>
> On Fri, Feb 18, 2022 at 9:58 AM Viliam Durina <vil...@hazelcast.com>
> wrote:
>
>> I have observed this issue years ago in well-known databases. My case was
>> much simpler:
>>
>> data
>> recordType:int  value:text
>> -----           ------
>> 0               1
>> 1               a
>>
>> SELECT *
>> FROM data
>> WHERE recordType='1' AND CAST(value AS INT)<10
>>
>>
>> SQL is declarative, and unlike procedural languages, it doesn't prescribe
>> short-circuit evaluation of the WHERE clause, or any specific evaluation
>> order. If it was prescribed, the query would be perfectly safe. But
>> prescribing the evaluation order would rule out many optimizations, or
>> make
>> them much harder, such as this half-null-half-error value.
>>
>> For example, reordering additions might or might not lead to overflow:
>> TINYINTs `100 + 100 - 90`, evaluated in this order, overflow, but `100 -
>> 90
>> + 100` don't - imagine each value comes from a different table and we
>> reorder the joins. Perhaps result of TINYINT addition can be SMALLINT, but
>> what if they are BIGINTs?
>>
>> My understanding was that any expression must not fail in any plausible
>> execution plan. Therefore if I parse the column `value`, it must succeed
>> for every possible scanned row. In my specific case I ended up
>> implementing
>> a custom TO_NUMBER function that returns null on parsing error, and that
>> null never appeared in the result because of the other condition.
>>
>> Viliam
>>
>> On Thu, 17 Feb 2022 at 20:08, Julian Hyde <jhyde.apa...@gmail.com> wrote:
>>
>> > You should definitely log a bug, even if there are no plans to fix it.
>> It
>> > is a violation of the standard, and therefore it is a bug, and
>> therefore we
>> > should document that it exists.
>> >
>> > Can you do some research on the right terminology. You use the term
>> > ‘unsafe’. I think the PL community uses the term ’strict’ for
>> expressions
>> > that must be evaluated (and therefore if they throw, the query is sure
>> to
>> > abort). We have other related concepts floating around in Calcite -
>> dynamic
>> > functions, deterministic operators - and standardizing terminology is
>> key
>> > if we are to keep the menagerie in order.
>> >
>> > There might be a way to have our cake and eat it too. We could push down
>> > the division, and if there is a division-by-zero we return a special
>> value
>> > such as NULL. If the special value is not eliminated by the join then
>> it is
>> > promoted to a throw. The “cake” here is the performance benefit of
>> pushing
>> > down a filter (that may never or rarely throw); the “eating it” is
>> safety
>> > on the occasion that the filter does throw.
>> >
>> > Even if that strategy doesn’t work, maybe we could have a flag that says
>> > whether to push down conditions that might throw.
>> >
>> > Julian
>> >
>> >
>> > > On Feb 17, 2022, at 8:07 AM, Scott Reynolds <sdrreyno...@gmail.com>
>> > wrote:
>> > >
>> > > Is it feasible to prevent the filter push down in cases like this (
>> > detect
>> > > potential division by zero) ? What other exceptions that could happen?
>> > >
>> > > Should it only push down some filters and preserve the complex filter
>> > above
>> > > the join?
>> > >
>> > > Regarding a Jira ticket, generating an exception when it should
>> produce
>> > > valid results is gotta be a Jira case IMHO.
>> > >
>> > >
>> > >
>> > >
>> > > On Thu, Feb 17, 2022, 7:17 AM Stamatis Zampetakis <zabe...@gmail.com>
>> > wrote:
>> > >
>> > >> Hi all,
>> > >>
>> > >> Till today, I believed that pushing a filter below an inner join is
>> not
>> > >> strictly related to the operators (equals, plus, minus, cast,
>> division)
>> > >> present in the filter.
>> > >>
>> > >> However, the query below will return some results if the filter is
>> > executed
>> > >> after the join or it will fail with an exception if the filter is
>> pushed
>> > >> below the join.
>> > >>
>> > >> EMP [EMPNO, NAME, DEPTNO]
>> > >> 0   | Alex | 0
>> > >> 10 | Bob | 1
>> > >>
>> > >> DEP [DEPTNO]
>> > >> 1
>> > >>
>> > >> SELECT e.name
>> > >> FROM emp e
>> > >> INNER JOIN dep d ON e.deptno = d.deptno
>> > >> WHERE (10 / e.empno) = 1
>> > >>
>> > >> It seems that SQL standard (Section 7.4 general rule 1) mandates that
>> > WHERE
>> > >> should be applied to the result of FROM so in the case above pushing
>> > >> filters below the join seems to violate the standard.
>> > >>
>> > >> Currently, in Calcite we are going to push the division, cast, and
>> > >> basically any kind of operator below the join. Many well-known
>> databases
>> > >> e.g., Postgres, do that as well.
>> > >>
>> > >> Should we log it as a bug and try to do something about it or let it
>> be
>> > as
>> > >> it is?
>> > >>
>> > >> Best,
>> > >> Stamatis
>> > >>
>> >
>> >
>>
>> --
>> This message contains confidential information and is intended only for
>> the
>> individuals named. If you are not the named addressee you should not
>> disseminate, distribute or copy this e-mail. Please notify the sender
>> immediately by e-mail if you have received this e-mail by mistake and
>> delete this e-mail from your system. E-mail transmission cannot be
>> guaranteed to be secure or error-free as information could be
>> intercepted,
>> corrupted, lost, destroyed, arrive late or incomplete, or contain
>> viruses.
>> The sender therefore does not accept liability for any errors or
>> omissions
>> in the contents of this message, which arise as a result of e-mail
>> transmission. If verification is required, please request a hard-copy
>> version. -Hazelcast
>>
>

Reply via email to