Some conversions are widening. For example, if I convert a SMALLINT to an INTEGER, there’s no possibility of data loss because every valid SMALLINT value has a corresponding INTEGER value.
The existence of widening conversions means that means that we can get by with fewer functions and operators. For example, there is no need to write a SUBSTRING function that takes SMALLINT arguments, because the version that takes INTEGER arguments works fine. If we need to draw a line somewhere, it might make sense to treat widening conversions differently. Julian > On Mar 10, 2025, at 10:01 AM, Alessandro Solimando > <[email protected]> wrote: > > For the custom equality I'd expect a few concrete implementations out of > all the cartesian product of all data types, that's why I was suggesting to > make the typing explicit with some custom functions with the right > overloaded signatures. > > What I wanted to say is that I agree that we might be overdoing with > syntactical transformations, but I am not sure we can or should avoid > supporting implicit casts. > > On Mon, 10 Mar 2025 at 17:19, Mihai Budiu <[email protected]> wrote: > >> If you have 20 different base types and you allow any pairwise >> comparisons, then you have to consider 400 functions just for equality, >> which decide what happens for every pair of types. Since Calcite supports >> parameterized types (e.g., TIME(3)) and generic types (like arrays or >> maps), the number of types supported is really unbounded. >> >> Coercion drastically reduces the complexity of the implementation. But >> indeed, coercion is not necessarily required; it is just part of a hygienic >> language design. >> >> Calcite is statically typed; other SQL dialects, such as sqlite are >> dynamically-typed, where the type of an expression is sometimes decided at >> runtime, based on the value of operands. >> >> Mihai >> >> ________________________________ >> From: Alessandro Solimando <[email protected]> >> Sent: Monday, March 10, 2025 9:03 AM >> To: [email protected] <[email protected]> >> Subject: Re: Questions Regarding Type Coercion and CAST Transformations in >> SqlValidator and RexBuilder >> >> The SQL standard hints into implicit casts into several places but it is >> not 100% explicit about it (no pun intended). >> >> For instance, "4.11 Data conversions" in Part 2 (2011 edition) states: >> >>> Implicit type conversion can occur in expressions, fetch operations, >>> single row select operations, inserts, deletes, >>> and updates. >> >> Explicit type conversions can be specified by the use of the CAST operator. >>> A CAST operator defines how values of a source data type are converted >>> into a value of a target data type according to the Syntax Rules and >> General >>> Rules of Subclause 6.13, “<cast specification>”. >> >> >> If my understanding is correct, if you provide an ambiguous expression, >> it's to be expected that the validator will have to resolve it to >> accomplish its task, and I agree with Mihai on that part. >> >> We have explicit CAST to force something different than what the validator >> would do by itself, and that's generally enough. >> >> For a system having an EQUALS function accepting parameters of different >> types, I'd expect it to be a separate function from the "standard" one, >> maybe registered as a user-defined function, so again there would be no >> ambiguity at validation time? >> >> Concerning the transformations like translating to CASE, I agree that it >> should be configurable. >> >> I think that historically we have those transformations in place to >> restrict the syntactical variants on expressions, as we are finally >> manipulating and matching expressions syntactically other than semantically >> (similar considerations arise often for the IN -> OR translation and >> similar), but it's true that for cases like that of Matthew we might be too >> constraining. >> >> Best regards, >> Alessandro >> >> On Sat, 8 Mar 2025 at 20:08, Steven Phillips <[email protected]> >> wrote: >> >>> I think it's reasonable for a system to not want these transformations. A >>> given execution engine could choose to implement an EQUALS function >>> between the two different types, rather than using implicit cast. Also, >> for >>> use cases like Matthew's, it seems reasonable that we could configure >>> SqlValidator and SqlToRel to not do any type coercion. >>> >>> On Sat, Mar 8, 2025 at 10:18 AM Mihai Budiu <[email protected]> wrote: >>> >>>> Yes, these transformations are essential for compiling programs, >> because >>>> they make the program's meaning unambiguous. Consider your example >> where >>>> you check equality between two columns of different types A and B. The >>>> meaning of the program is very different depending on what casts are >>>> inserted: >>>> >>>> a = b >>>> >>>> Can be interpreted as >>>> >>>> (B)a = b >>>> >>>> or as >>>> >>>> a = (A)b >>>> >>>> These will give different results in some cases. The Validator makes it >>>> clear what the choice of the compiler is. >>>> >>>> Mihai >>>> >>>> ________________________________ >>>> From: Matthew McMillian <[email protected]> >>>> Sent: Friday, March 7, 2025 10:34 PM >>>> To: [email protected] <[email protected]> >>>> Subject: Questions Regarding Type Coercion and CAST Transformations in >>>> SqlValidator and RexBuilder >>>> >>>> Hi Calcite community, >>>> >>>> I'm using Calcite to analyze SQL queries and have encountered some >> issues >>>> with transformations that interfere with my use case. I hope the >>> community >>>> can provide guidance. >>>> >>>> My current workflow: >>>> >>>> 1. Parse the query into a SqlNode. >>>> 2. Validate the SqlNode using SqlValidator (callRewrite=false), >>> applying >>>> type coercion and other transformations. >>>> 3. Walk the validated SqlNode to perform custom validation, using >>>> SqlValidator and SqlValidatorScope for column resolution. >>>> 4. Convert the SqlNode to a RelNode. >>>> 5. Walk the RelNode for further custom validation. >>>> >>>> For my use case, I care deeply about the original syntax of the query. >> I >>>> need to preserve the original query structure rather than an equivalent >>>> transformation. Steps #2 and #4 introduce changes that create >> challenges. >>>> For example, a comparison between two columns of different types. >>>> >>>> - <bool_column> = <int_column> is rewritten in step #2 as >>>> CAST(<bool_column> >>>> AS INT) = <int_column> (AbstractTypeCoercion >>>> < >>>> >>> >> https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L157 >>>>> >>>> ). >>>> - In step #4, it further transforms into CASE(IS NOT >>>> NULL(<bool_column>), CASE(<bool_column>, 1, 0), null) (RexBuilder >>>> < >>>> >>> >> https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rex/RexBuilder.java#L940 >>>>> >>>> ). >>>> >>>> I have two questions: >>>> >>>> 1. Are the aforementioned types of transformations essential for >>>> Calcite to compile queries? Is there a way to disable them while >> still >>>> compiling queries? >>>> 2. Is my current approach reasonable, or is there a better way to >>>> achieve my goal within Calcite? >>>> >>>> >>>> Any insights would be greatly appreciated. >>>> >>>> Thanks, >>>> Matthew McMillian >>>> >>> >>
