Thanks for logging this. I agree that it’s not high priority. I added some commentary about related bugs.
> On Oct 7, 2022, at 9:07 AM, Stamatis Zampetakis <zabe...@gmail.com> wrote: > > I just logged https://issues.apache.org/jira/browse/CALCITE-5315 but I am > not planning to work on this sometime soon. > > Best, > Stamatis > > On Fri, Feb 18, 2022 at 10:56 PM Stamatis Zampetakis <zabe...@gmail.com> > wrote: > >> Thank you all for the feedback. >> >> I will do a small research about the proper term to use for functions that >> might throw / never throw and log a JIRA to move the discussion further. >> >> @Viliam : Note that the problem you mention, although similar, is not >> exactly the same as the one I brought up now. >> >> The reordering of predicates in the WHERE clause has been brought up quite >> a few times in the dev list and our stance [1] is that since the standard >> leaves this decision to the implementor people should not rely on this (or >> put the appropriate guards). >> >> The case here is a bit different, at least the way I read the standard, >> cause it defines the following: >> >> "If all optional clauses are omitted, then the result of the <table >> expression> is the same as the result of the >> <from clause>. Otherwise, each specified clause is applied to the result >> of the previously specified clause >> and the result of the <table expression> is the result of the application >> of the last specified clause." >> >> and one of the optional clauses mentioned in the previous paragraph is the >> <where clause>. There seems to be a clearly defined order between the <from >> clause>, which includes inner joins, and the <where clause>. >> >> Best, >> Stamatis >> >> [1] https://lists.apache.org/thread/mq44cnrohz19hh10btms126vbcoxl50w >> >> On Fri, Feb 18, 2022 at 9:58 AM Viliam Durina <vil...@hazelcast.com> >> wrote: >> >>> I have observed this issue years ago in well-known databases. My case was >>> much simpler: >>> >>> data >>> recordType:int value:text >>> ----- ------ >>> 0 1 >>> 1 a >>> >>> SELECT * >>> FROM data >>> WHERE recordType='1' AND CAST(value AS INT)<10 >>> >>> >>> SQL is declarative, and unlike procedural languages, it doesn't prescribe >>> short-circuit evaluation of the WHERE clause, or any specific evaluation >>> order. If it was prescribed, the query would be perfectly safe. But >>> prescribing the evaluation order would rule out many optimizations, or >>> make >>> them much harder, such as this half-null-half-error value. >>> >>> For example, reordering additions might or might not lead to overflow: >>> TINYINTs `100 + 100 - 90`, evaluated in this order, overflow, but `100 - >>> 90 >>> + 100` don't - imagine each value comes from a different table and we >>> reorder the joins. Perhaps result of TINYINT addition can be SMALLINT, but >>> what if they are BIGINTs? >>> >>> My understanding was that any expression must not fail in any plausible >>> execution plan. Therefore if I parse the column `value`, it must succeed >>> for every possible scanned row. In my specific case I ended up >>> implementing >>> a custom TO_NUMBER function that returns null on parsing error, and that >>> null never appeared in the result because of the other condition. >>> >>> Viliam >>> >>> On Thu, 17 Feb 2022 at 20:08, Julian Hyde <jhyde.apa...@gmail.com> wrote: >>> >>>> You should definitely log a bug, even if there are no plans to fix it. >>> It >>>> is a violation of the standard, and therefore it is a bug, and >>> therefore we >>>> should document that it exists. >>>> >>>> Can you do some research on the right terminology. You use the term >>>> ‘unsafe’. I think the PL community uses the term ’strict’ for >>> expressions >>>> that must be evaluated (and therefore if they throw, the query is sure >>> to >>>> abort). We have other related concepts floating around in Calcite - >>> dynamic >>>> functions, deterministic operators - and standardizing terminology is >>> key >>>> if we are to keep the menagerie in order. >>>> >>>> There might be a way to have our cake and eat it too. We could push down >>>> the division, and if there is a division-by-zero we return a special >>> value >>>> such as NULL. If the special value is not eliminated by the join then >>> it is >>>> promoted to a throw. The “cake” here is the performance benefit of >>> pushing >>>> down a filter (that may never or rarely throw); the “eating it” is >>> safety >>>> on the occasion that the filter does throw. >>>> >>>> Even if that strategy doesn’t work, maybe we could have a flag that says >>>> whether to push down conditions that might throw. >>>> >>>> Julian >>>> >>>> >>>>> On Feb 17, 2022, at 8:07 AM, Scott Reynolds <sdrreyno...@gmail.com> >>>> wrote: >>>>> >>>>> Is it feasible to prevent the filter push down in cases like this ( >>>> detect >>>>> potential division by zero) ? What other exceptions that could happen? >>>>> >>>>> Should it only push down some filters and preserve the complex filter >>>> above >>>>> the join? >>>>> >>>>> Regarding a Jira ticket, generating an exception when it should >>> produce >>>>> valid results is gotta be a Jira case IMHO. >>>>> >>>>> >>>>> >>>>> >>>>> On Thu, Feb 17, 2022, 7:17 AM Stamatis Zampetakis <zabe...@gmail.com> >>>> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> Till today, I believed that pushing a filter below an inner join is >>> not >>>>>> strictly related to the operators (equals, plus, minus, cast, >>> division) >>>>>> present in the filter. >>>>>> >>>>>> However, the query below will return some results if the filter is >>>> executed >>>>>> after the join or it will fail with an exception if the filter is >>> pushed >>>>>> below the join. >>>>>> >>>>>> EMP [EMPNO, NAME, DEPTNO] >>>>>> 0 | Alex | 0 >>>>>> 10 | Bob | 1 >>>>>> >>>>>> DEP [DEPTNO] >>>>>> 1 >>>>>> >>>>>> SELECT e.name >>>>>> FROM emp e >>>>>> INNER JOIN dep d ON e.deptno = d.deptno >>>>>> WHERE (10 / e.empno) = 1 >>>>>> >>>>>> It seems that SQL standard (Section 7.4 general rule 1) mandates that >>>> WHERE >>>>>> should be applied to the result of FROM so in the case above pushing >>>>>> filters below the join seems to violate the standard. >>>>>> >>>>>> Currently, in Calcite we are going to push the division, cast, and >>>>>> basically any kind of operator below the join. Many well-known >>> databases >>>>>> e.g., Postgres, do that as well. >>>>>> >>>>>> Should we log it as a bug and try to do something about it or let it >>> be >>>> as >>>>>> it is? >>>>>> >>>>>> Best, >>>>>> Stamatis >>>>>> >>>> >>>> >>> >>> -- >>> This message contains confidential information and is intended only for >>> the >>> individuals named. If you are not the named addressee you should not >>> disseminate, distribute or copy this e-mail. Please notify the sender >>> immediately by e-mail if you have received this e-mail by mistake and >>> delete this e-mail from your system. E-mail transmission cannot be >>> guaranteed to be secure or error-free as information could be >>> intercepted, >>> corrupted, lost, destroyed, arrive late or incomplete, or contain >>> viruses. >>> The sender therefore does not accept liability for any errors or >>> omissions >>> in the contents of this message, which arise as a result of e-mail >>> transmission. If verification is required, please request a hard-copy >>> version. -Hazelcast >>> >>