[jira] [Commented] (CALCITE-5884) Type Inference rule for ARRAY_TO_STRING is incorrect

Mihai Budiu (Jira) Thu, 03 Aug 2023 09:12:05 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750815#comment-17750815
 ]


Mihai Budiu commented on CALCITE-5884:
--------------------------------------

Alright. Then I will close my existing PR and submit a new one which documents 
this behavior. There is no documentation or tests about this, and I assumed 
that this is the same as the Postgres definition.

I will also document the implementation of the arrayToString function, which 
looks like this:

{code:java}
  public static @Nullable String arrayToString(@Nullable List list, @Nullable 
String delimiter) {
    return arrayToString(list, delimiter, null);
  }
{code}

By reading this code you may think that 'null' *is* a legal third argument. 

[~asolimando]: two comments:
- the ticket was created when I thought that typing is incorrect. I discovered 
more issues while trying to implement the Postgres behavior of this function.
- I think this is actually a great example why the nullPolicy is extremely 
subtle. *The code you write in the implementation of a function is not the code 
that is run to evaluate the function during constant folding*. The function 
evaluation can be described roughly like this (but it is even more subtle in 
fact): the nullPolicy is evaluated first and can decide when the result is 
null. The Java function is evaluated afterwards, if it is still necessary. 

I think this is a dangerous pattern, because it is very difficult to understand 
the semantics of this layered execution. 

Moreover, the information about when a function may produce a null result is 
encoded in *3* different places:
- the function result type inference
- the nullPolicy
- the implementation itself

There is also no consistency check anywhere that these 3 do not give 
contradictory results. You can easily write a type inference rule that returns 
NULL when the second argument is NULL but a nullPolicy that doesn't.

(This is from the perspective of the RexToLixTranslator. Maybe there are other 
uses of the nullPolicy that I don't know about.)

I think that the nullPolicy could probably be removed, at least for this 
purpose, and this would make the code simpler to understand and debug. But it 
may require auditing the implementation of all the functions defined, some of 
which, like arrayToString above, do *not* implement the correct behavior 
without a nullPolicy, so it is not a small task.


> Type Inference rule for ARRAY_TO_STRING is incorrect
> ----------------------------------------------------
>
>                 Key: CALCITE-5884
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5884
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.35.0
>            Reporter: Mihai Budiu
>            Priority: Trivial
>              Labels: pull-request-available
>
> This is the current definition of the function ARRAY_TO_STRING in 
> SqlLibraryOperators:
> {code:java}
>   /** The "ARRAY_TO_STRING(array, delimiter [, nullText ])" function. */
>   @LibraryOperator(libraries = {BIG_QUERY})
>   public static final SqlFunction ARRAY_TO_STRING =
>       SqlBasicFunction.create(SqlKind.ARRAY_TO_STRING,
>           ReturnTypes.VARCHAR_NULLABLE,
>           OperandTypes.STRING_ARRAY_CHARACTER_OPTIONAL_CHARACTER);
> {code}
> So the result is nullable if any of the arguments is nullable. However, the 
> nullability of the last argument does not influence the result nullabillity: 
> a NULL value for the third optional argument will not cause a NULL value to 
> be output.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (CALCITE-5884) Type Inference rule for ARRAY_TO_STRING is incorrect

Reply via email to