I think we should start a new thread. Mihai
________________________________ From: Stephen Carlin <scar...@cloudera.com.INVALID> Sent: Tuesday, January 30, 2024 5:19 PM To: dev@calcite.apache.org <dev@calcite.apache.org> Subject: Re: Storing RexLiteral as BigDecimal values I don't know if this is hijacking or adding to the discussion, but figured I'd write here. I'm currently implementing Calcite on my database, but I'm actually running into a problem where my db would rather have exact types on some of the lower integers. LIke, if we have the value 38, I need this to be set as a TINYINT, not an INT. I guess Calcite doesn't really care if things are TINYINT or INT because of it treats NUMERIC classes the same, but my database does care. One example is that my "=" function needs both parameters to be TINYINT or INT (or the other integer values), and spits up otherwise. I currently have a crazy type coercion rule to get around this limitation. I know there are existing type coercions built in for the "=" function and others, but it doesn't cover all my cases and it seems too hacky and piecemeal to implement it that way. Would this be something I should add as some kind of Dialect? Or is the Calcite community against these type of additions. On Tue, Jan 30, 2024 at 4:13 PM Mihai Budiu <mbu...@gmail.com> wrote: > Highly relevant discussion: > https://issues.apache.org/jira/browse/CALCITE-2067 > > This change would indeed enable representing other DOUBLE values which > cannot be represented by BigDecimal, such as NaN, Infinities, and -0. It > looks like the evaluator does not reduce such expressions. > > That issue also had a PR, which was rejected at the time, but I think a > new PR would look relatively similar. Julian's objection at the time was > that this changes the SQL language; however, this doesn't change the syntax > of the language, just the representation that the evaluator uses > internally. This would enable changes in the syntax too, but they can be > done separately if desired. > > Mihai > > ________________________________ > From: Mihai Budiu <mbu...@gmail.com> > Sent: Tuesday, January 30, 2024 4:00 PM > To: dev@calcite.apache.org <dev@calcite.apache.org> > Subject: Re: Storing RexLiteral as BigDecimal values > > > * > Regarding [1], it looks like the power function implementations with > DECIMAL arguments return DOUBLE. Where the BigDecimal Java type has custom > functions that have potentially higher accuracy than FP versions, using > them could be beneficial. Related to this, Calcite uses POWER to represent > SQRT, but I noticed that SQRT gives more precise results. However, the Java > implementation of Math.pow is very careful, and seems to be optimized for > the case of integer arguments: > https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#pow-double-double- > > > * This is what POWER looks like in the SqlStdOperatorTable, which > suggests that even the type checker assumes that this function always > returns DOUBLE: > > /** The {@code POWER(numeric, numeric)} function. > * > * <p>The return type is always {@code DOUBLE} since we don't know > * what the result type will be by just looking at the operand types. For > * example {@code POWER(INTEGER, INTEGER)} can return a non-INTEGER if > the > * second operand is negative. > */ > public static final SqlBasicFunction POWER = > SqlBasicFunction.create("POWER", > ReturnTypes.DOUBLE_NULLABLE, > OperandTypes.NUMERIC_NUMERIC, > SqlFunctionCategory.NUMERIC); > > > * Using a string representation for double literals, as you propose, > still introduces rounding problems, when the Double value is converted to a > string. So I think this will just replace one problem with another. > > > * > Finally, regarding the subject of this email, the right solution seems to > allow RexLiteral with a DOUBLE type (and similarly for REAL) to store > values either as BigDecimal, which is suitable when the literals come from > the parser, or as Double, which is suitable when values come from the > evaluator. I plan to file a JIRA issue to enable this feature. > > Thank you, > Mihai > > ________________________________ > From: Julian Hyde <jhyde.apa...@gmail.com> > Sent: Tuesday, January 30, 2024 3:42 PM > To: dev@calcite.apache.org <dev@calcite.apache.org> > Subject: Re: Storing RexLiteral as BigDecimal values > > In the case of POWER, Postgres has both a decimal and double version[1], > and Calcite should do the same. > > However, I take your point that if the function returns a double, then > constant-reduction should create a double literal. > > I am concerned that developers will accidentally create double literals, > when double literals are not possible in SQL (until we implement DOUBLE > ‘3.14’ syntax, as discussed in [2]). Until then, is it better to store the > Rex equivalent of CAST(‘3.14’ AS DOUBLE)? > > Julian > > [1] https://www.postgresql.org/docs/8.2/functions-math.html > [2] https://issues.apache.org/jira/browse/CALCITE-6052 > > > On Jan 30, 2024, at 3:21 PM, Mihai Budiu <mbu...@gmail.com> wrote: > > > > Well, irrespective of the type of the arguments, the POWER function > returns a double value. > > > > These are the 4 implementations in SqlFunctions: > > > > public static double power(double b0, double b1) { > > return Math.pow(b0, b1); > > } > > > > public static double power(double b0, BigDecimal b1) { > > return Math.pow(b0, b1.doubleValue()); > > } > > > > public static double power(BigDecimal b0, double b1) { > > return Math.pow(b0.doubleValue(), b1); > > } > > > > public static double power(BigDecimal b0, BigDecimal b1) { > > return Math.pow(b0.doubleValue(), b1.doubleValue()); > > } > > > > The problem is that, for the query below, the optimizer will then > convert the double result of power into a BigDecimal, rounding it in the > process. > > > > Mihai > > ________________________________ > > From: Julian Hyde <jhyde.apa...@gmail.com> > > Sent: Tuesday, January 30, 2024 3:03 PM > > To: dev@calcite.apache.org <dev@calcite.apache.org> > > Subject: Re: Storing RexLiteral as BigDecimal values > > > > It’s surprising to me that 1004.3e0 should have type DOUBLE or REAL; I > would expect it to have type DECIMAL(5, 1), where it can be represented > exactly. > > > >> On Jan 30, 2024, at 2:53 PM, Mihai Budiu <mbu...@gmail.com> wrote: > >> > >> e0 shows that these are DOUBLE values. > >> Moreover, power returns a DOUBLE value. > >> In FP the result is the wrong one, but that's the semantics of the > power function in FP. > >> > >> Mihai > >> ________________________________ > >> From: Julian Hyde <jhyde.apa...@gmail.com> > >> Sent: Tuesday, January 30, 2024 2:50 PM > >> To: dev@calcite.apache.org <dev@calcite.apache.org> > >> Subject: Re: Storing RexLiteral as BigDecimal values > >> > >> The inputs are decimals, and the correct answer is 1008618.49, also a > decimal, and cannot be exactly represented as a binary floating point. I’m > not sure why in this case you want a binary floating point. > >> > >>> On Jan 30, 2024, at 2:46 PM, Mihai Budiu <mbu...@gmail.com> wrote: > >>> > >>> I am evaluating this expression: SELECT power(1004.3e0, 2e0) > >>> The result in Java, or Postgres, when formatted as a string, is > 1008618.4899999999 > >>> The result produced by the Calcite simplification code is 1008618.49 > >>> The simplification code can produce RexLiterals - that's where this > would be useful. > >>> This rounding error is not really necessary. > >>> > >>> Mihai > >>> ________________________________ > >>> From: Julian Hyde <jhyde.apa...@gmail.com> > >>> Sent: Tuesday, January 30, 2024 2:40 PM > >>> To: dev@calcite.apache.org <dev@calcite.apache.org> > >>> Subject: Re: Storing RexLiteral as BigDecimal values > >>> > >>> Can you give a scenario where a RexLiteral should have a double value? > >>> > >>>> On Jan 30, 2024, at 2:36 PM, Mihai Budiu <mbu...@gmail.com> wrote: > >>>> > >>>> Hello, > >>>> > >>>> I have a question about the representation of RexLiteral values. > >>>> Currently DOUBLE-valued literals are represented using a BigDecimal. > >>>> This causes small rounding errors, introduced in the > RexBuilder.clean() function. > >>>> This causes FP expressions that are evaluated at compilation-time to > produce results that are slightly off from the same expressions that may be > evaluated at runtime, for no real reason. For example, I am running some > Postgres tests for FP values, and they fail because of this small > difference. > >>>> > >>>> I know that FP values cannot be compared for equality, and tests are > supposed to have some slack, but I think that this particular rounding > error is not necessary. > >>>> > >>>> Why can't RexLiteral actually store a Double value internally when > the type is Double? > >>>> Is this a bug or is there a deeper reason for this representation? > >>>> If it's a bug I can file a JIRA issue and probably fix it. > >>>> > >>>> Thank you, > >>>> Mihai > >>> > >> > > > >