Some conversions are widening. For example, if I convert a SMALLINT to an 
INTEGER, there’s no possibility of data loss because every valid SMALLINT value 
has a corresponding INTEGER value.

The existence of widening conversions means that means that we can get by with 
fewer functions and operators. For example, there is no need to write a 
SUBSTRING function that takes SMALLINT arguments, because the version that 
takes INTEGER arguments works fine.

If we need to draw a line somewhere, it might make sense to treat widening 
conversions differently.

Julian


> On Mar 10, 2025, at 10:01 AM, Alessandro Solimando 
> <[email protected]> wrote:
> 
> For the custom equality I'd expect a few concrete implementations out of
> all the cartesian product of all data types, that's why I was suggesting to
> make the typing explicit with some custom functions with the right
> overloaded signatures.
> 
> What I wanted to say is that I agree that we might be overdoing with
> syntactical transformations, but I am not sure we can or should avoid
> supporting implicit casts.
> 
> On Mon, 10 Mar 2025 at 17:19, Mihai Budiu <[email protected]> wrote:
> 
>> If you have 20 different base types and you allow any pairwise
>> comparisons, then you have to consider 400 functions just for equality,
>> which decide what happens for every pair of types. Since Calcite supports
>> parameterized types (e.g., TIME(3)) and generic types (like arrays or
>> maps), the number of types supported is really unbounded.
>> 
>> Coercion drastically reduces the complexity of the implementation. But
>> indeed, coercion is not necessarily required; it is just part of a hygienic
>> language design.
>> 
>> Calcite is statically typed; other SQL dialects, such as sqlite are
>> dynamically-typed, where the type of an expression is sometimes decided at
>> runtime, based on the value of operands.
>> 
>> Mihai
>> 
>> ________________________________
>> From: Alessandro Solimando <[email protected]>
>> Sent: Monday, March 10, 2025 9:03 AM
>> To: [email protected] <[email protected]>
>> Subject: Re: Questions Regarding Type Coercion and CAST Transformations in
>> SqlValidator and RexBuilder
>> 
>> The SQL standard hints into implicit casts into several places but it is
>> not 100% explicit about it (no pun intended).
>> 
>> For instance, "4.11 Data conversions" in Part 2 (2011 edition) states:
>> 
>>> Implicit type conversion can occur in expressions, fetch operations,
>>> single row select operations, inserts, deletes,
>>> and updates.
>> 
>> Explicit type conversions can be specified by the use of the CAST operator.
>>> A CAST operator defines how values of a source data type are converted
>>> into a value of a target data type according to the Syntax Rules and
>> General
>>> Rules of Subclause 6.13, “<cast specification>”.
>> 
>> 
>> If my understanding is correct, if you provide an ambiguous expression,
>> it's to be expected that the validator will have to resolve it to
>> accomplish its task, and I agree with Mihai on that part.
>> 
>> We have explicit CAST to force something different than what the validator
>> would do by itself, and that's generally enough.
>> 
>> For a system having an EQUALS function accepting parameters of different
>> types, I'd expect it to be a separate function from the "standard" one,
>> maybe registered as a user-defined function, so again there would be no
>> ambiguity at validation time?
>> 
>> Concerning the transformations like translating to CASE, I agree that it
>> should be configurable.
>> 
>> I think that historically we have those transformations in place to
>> restrict the syntactical variants on expressions, as we are finally
>> manipulating and matching expressions syntactically other than semantically
>> (similar considerations arise often for the IN -> OR translation and
>> similar), but it's true that for cases like that of Matthew we might be too
>> constraining.
>> 
>> Best regards,
>> Alessandro
>> 
>> On Sat, 8 Mar 2025 at 20:08, Steven Phillips <[email protected]>
>> wrote:
>> 
>>> I think it's reasonable for a system to not want these transformations. A
>>> given execution engine could choose to implement  an EQUALS function
>>> between the two different types, rather than using implicit cast. Also,
>> for
>>> use cases like Matthew's, it seems reasonable that we could configure
>>> SqlValidator and SqlToRel to not do any type coercion.
>>> 
>>> On Sat, Mar 8, 2025 at 10:18 AM Mihai Budiu <[email protected]> wrote:
>>> 
>>>> Yes, these transformations are essential for compiling programs,
>> because
>>>> they make the program's meaning unambiguous. Consider your example
>> where
>>>> you check equality between two columns of different types A and B.  The
>>>> meaning of the program is very different depending on what casts are
>>>> inserted:
>>>> 
>>>> a = b
>>>> 
>>>> Can be interpreted as
>>>> 
>>>> (B)a = b
>>>> 
>>>> or as
>>>> 
>>>> a = (A)b
>>>> 
>>>> These will give different results in some cases. The Validator makes it
>>>> clear what the choice of the compiler is.
>>>> 
>>>> Mihai
>>>> 
>>>> ________________________________
>>>> From: Matthew McMillian <[email protected]>
>>>> Sent: Friday, March 7, 2025 10:34 PM
>>>> To: [email protected] <[email protected]>
>>>> Subject: Questions Regarding Type Coercion and CAST Transformations in
>>>> SqlValidator and RexBuilder
>>>> 
>>>> Hi Calcite community,
>>>> 
>>>> I'm using Calcite to analyze SQL queries and have encountered some
>> issues
>>>> with transformations that interfere with my use case. I hope the
>>> community
>>>> can provide guidance.
>>>> 
>>>> My current workflow:
>>>> 
>>>>   1. Parse the query into a SqlNode.
>>>>   2. Validate the SqlNode using SqlValidator (callRewrite=false),
>>> applying
>>>>   type coercion and other transformations.
>>>>   3. Walk the validated SqlNode to perform custom validation, using
>>>>   SqlValidator and SqlValidatorScope for column resolution.
>>>>   4. Convert the SqlNode to a RelNode.
>>>>   5. Walk the RelNode for further custom validation.
>>>> 
>>>> For my use case, I care deeply about the original syntax of the query.
>> I
>>>> need to preserve the original query structure rather than an equivalent
>>>> transformation. Steps #2 and #4 introduce changes that create
>> challenges.
>>>> For example, a comparison between two columns of different types.
>>>> 
>>>>   - <bool_column> = <int_column> is rewritten in step #2 as
>>>> CAST(<bool_column>
>>>>   AS INT) = <int_column> (AbstractTypeCoercion
>>>>   <
>>>> 
>>> 
>> https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L157
>>>>> 
>>>>   ).
>>>>   - In step #4, it further transforms into CASE(IS NOT
>>>>   NULL(<bool_column>), CASE(<bool_column>, 1, 0), null) (RexBuilder
>>>>   <
>>>> 
>>> 
>> https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rex/RexBuilder.java#L940
>>>>> 
>>>>   ).
>>>> 
>>>> I have two questions:
>>>> 
>>>>   1. Are the aforementioned types of transformations essential for
>>>>   Calcite to compile queries? Is there a way to disable them while
>> still
>>>>   compiling queries?
>>>>   2. Is my current approach reasonable, or is there a better way to
>>>>   achieve my goal within Calcite?
>>>> 
>>>> 
>>>> Any insights would be greatly appreciated.
>>>> 
>>>> Thanks,
>>>> Matthew McMillian
>>>> 
>>> 
>> 

Reply via email to