Re: Storing RexLiteral as BigDecimal values

Mihai Budiu Tue, 30 Jan 2024 17:28:24 -0800

I think we should start a new thread.

Mihai


________________________________
From: Stephen Carlin <scar...@cloudera.com.INVALID>
Sent: Tuesday, January 30, 2024 5:19 PM
To: dev@calcite.apache.org <dev@calcite.apache.org>
Subject: Re: Storing RexLiteral as BigDecimal values

I don't know if this is hijacking or adding to the discussion, but figured
I'd write here.

I'm currently implementing Calcite on my database, but I'm actually running
into a problem where my db would rather have exact types on some of the
lower integers.  LIke, if we have the value 38, I need this to be set as a
TINYINT, not an INT.

I guess Calcite doesn't really care if things are TINYINT or INT because of
it treats NUMERIC classes the same, but my database does care.  One example
is that my "=" function needs both parameters to be TINYINT or INT (or the
other integer values), and spits up otherwise.    I currently have a crazy
type coercion rule to get around this limitation.  I know there are
existing type coercions built in for the "=" function and others, but it
doesn't cover all my cases and it seems too hacky and piecemeal to
implement it that way.

Would this be something I should add as some kind of Dialect?  Or is the
Calcite community against these type of additions.

On Tue, Jan 30, 2024 at 4:13 PM Mihai Budiu <mbu...@gmail.com> wrote:

> Highly relevant discussion:
> https://issues.apache.org/jira/browse/CALCITE-2067
>
> This change would indeed enable representing other DOUBLE values which
> cannot be represented by BigDecimal, such as NaN, Infinities, and -0. It
> looks like the evaluator does not reduce such expressions.
>
> That issue also had a PR, which was rejected at the time, but I think a
> new PR would look relatively similar. Julian's objection at the time was
> that this changes the SQL language; however, this doesn't change the syntax
> of the language, just the representation that the evaluator uses
> internally. This would enable changes in the syntax too, but they can be
> done separately if desired.
>
> Mihai
>
> ________________________________
> From: Mihai Budiu <mbu...@gmail.com>
> Sent: Tuesday, January 30, 2024 4:00 PM
> To: dev@calcite.apache.org <dev@calcite.apache.org>
> Subject: Re: Storing RexLiteral as BigDecimal values
>
>
>   *
> Regarding [1], it looks like the power function implementations with
> DECIMAL arguments return DOUBLE. Where the BigDecimal Java type has custom
> functions that have potentially higher accuracy than FP versions, using
> them could be beneficial. Related to this, Calcite uses POWER to represent
> SQRT, but I noticed that SQRT gives more precise results. However, the Java
> implementation of Math.pow is very careful, and seems to be optimized for
> the case of integer arguments:
> https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#pow-double-double-
>
>
>   *   This is what POWER looks like in the SqlStdOperatorTable, which
> suggests that even the type checker assumes that this function always
> returns DOUBLE:
>
> /** The {@code POWER(numeric, numeric)} function.
>    *
>    * <p>The return type is always {@code DOUBLE} since we don't know
>    * what the result type will be by just looking at the operand types. For
>    * example {@code POWER(INTEGER, INTEGER)} can return a non-INTEGER if
> the
>    * second operand is negative.
>    */
>   public static final SqlBasicFunction POWER =
>       SqlBasicFunction.create("POWER",
>           ReturnTypes.DOUBLE_NULLABLE,
>           OperandTypes.NUMERIC_NUMERIC,
>           SqlFunctionCategory.NUMERIC);
>
>
>   *   Using a string representation for double literals, as you propose,
> still introduces rounding problems, when the Double value is converted to a
> string. So I think this will just replace one problem with another.
>
>
>   *
> Finally, regarding the subject of this email, the right solution seems to
> allow RexLiteral with a DOUBLE type (and similarly for REAL) to store
> values either as BigDecimal, which is suitable when the literals come from
> the parser, or as Double, which is suitable when values come from the
> evaluator. I plan to file a JIRA issue to enable this feature.
>
> Thank you,
> Mihai
>
> ________________________________
> From: Julian Hyde <jhyde.apa...@gmail.com>
> Sent: Tuesday, January 30, 2024 3:42 PM
> To: dev@calcite.apache.org <dev@calcite.apache.org>
> Subject: Re: Storing RexLiteral as BigDecimal values
>
> In the case of POWER, Postgres has both a decimal and double version[1],
> and Calcite should do the same.
>
> However, I take your point that if the function returns a double, then
> constant-reduction should create a double literal.
>
> I am concerned that developers will accidentally create double literals,
> when double literals are not possible in SQL (until we implement DOUBLE
> ‘3.14’ syntax, as discussed in [2]). Until then, is it better to store the
> Rex equivalent of CAST(‘3.14’ AS DOUBLE)?
>
> Julian
>
> [1] https://www.postgresql.org/docs/8.2/functions-math.html
> [2] https://issues.apache.org/jira/browse/CALCITE-6052
>
> > On Jan 30, 2024, at 3:21 PM, Mihai Budiu <mbu...@gmail.com> wrote:
> >
> > Well, irrespective of the type of the arguments, the POWER function
> returns a double value.
> >
> > These are the 4 implementations in SqlFunctions:
> >
> > public static double power(double b0, double b1) {
> >    return Math.pow(b0, b1);
> >  }
> >
> >  public static double power(double b0, BigDecimal b1) {
> >    return Math.pow(b0, b1.doubleValue());
> >  }
> >
> >  public static double power(BigDecimal b0, double b1) {
> >    return Math.pow(b0.doubleValue(), b1);
> >  }
> >
> >  public static double power(BigDecimal b0, BigDecimal b1) {
> >    return Math.pow(b0.doubleValue(), b1.doubleValue());
> >  }
> >
> > The problem is that, for the query below, the optimizer will then
> convert the double result of power into a BigDecimal, rounding it in the
> process.
> >
> > Mihai
> > ________________________________
> > From: Julian Hyde <jhyde.apa...@gmail.com>
> > Sent: Tuesday, January 30, 2024 3:03 PM
> > To: dev@calcite.apache.org <dev@calcite.apache.org>
> > Subject: Re: Storing RexLiteral as BigDecimal values
> >
> > It’s surprising to me that 1004.3e0 should have type DOUBLE or REAL; I
> would expect it to have type DECIMAL(5, 1), where it can be represented
> exactly.
> >
> >> On Jan 30, 2024, at 2:53 PM, Mihai Budiu <mbu...@gmail.com> wrote:
> >>
> >> e0 shows that these are DOUBLE values.
> >> Moreover, power returns a DOUBLE value.
> >> In FP the result is the wrong one, but that's the semantics of the
> power function in FP.
> >>
> >> Mihai
> >> ________________________________
> >> From: Julian Hyde <jhyde.apa...@gmail.com>
> >> Sent: Tuesday, January 30, 2024 2:50 PM
> >> To: dev@calcite.apache.org <dev@calcite.apache.org>
> >> Subject: Re: Storing RexLiteral as BigDecimal values
> >>
> >> The inputs are decimals, and the correct answer is 1008618.49, also a
> decimal, and cannot be exactly represented as a binary floating point. I’m
> not sure why in this case you want a binary floating point.
> >>
> >>> On Jan 30, 2024, at 2:46 PM, Mihai Budiu <mbu...@gmail.com> wrote:
> >>>
> >>> I am evaluating this expression: SELECT power(1004.3e0, 2e0)
> >>> The result in Java, or Postgres, when formatted as a string, is
> 1008618.4899999999
> >>> The result produced by the Calcite simplification code is 1008618.49
> >>> The simplification code can produce RexLiterals - that's where this
> would be useful.
> >>> This rounding error is not really necessary.
> >>>
> >>> Mihai
> >>> ________________________________
> >>> From: Julian Hyde <jhyde.apa...@gmail.com>
> >>> Sent: Tuesday, January 30, 2024 2:40 PM
> >>> To: dev@calcite.apache.org <dev@calcite.apache.org>
> >>> Subject: Re: Storing RexLiteral as BigDecimal values
> >>>
> >>> Can you give a scenario where a RexLiteral should have a double value?
> >>>
> >>>> On Jan 30, 2024, at 2:36 PM, Mihai Budiu <mbu...@gmail.com> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> I have a question about the representation of RexLiteral values.
> >>>> Currently DOUBLE-valued literals are represented using a BigDecimal.
> >>>> This causes small rounding errors, introduced in the
> RexBuilder.clean() function.
> >>>> This causes FP expressions that are evaluated at compilation-time to
> produce results that are slightly off from the same expressions that may be
> evaluated at runtime, for no real reason. For example, I am running some
> Postgres tests for FP values, and they fail because of this small
> difference.
> >>>>
> >>>> I know that FP values cannot be compared for equality, and tests are
> supposed to have some slack, but I think that this particular rounding
> error is not necessary.
> >>>>
> >>>> Why can't RexLiteral actually store a Double value internally when
> the type is Double?
> >>>> Is this a bug or is there a deeper reason for this representation?
> >>>> If it's a bug I can file a JIRA issue and probably fix it.
> >>>>
> >>>> Thank you,
> >>>> Mihai
> >>>
> >>
> >
>
>

Re: Storing RexLiteral as BigDecimal values

Reply via email to