[
https://issues.apache.org/jira/browse/KYLIN-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928354#comment-17928354
]
Guoliang Sun edited comment on KYLIN-6045 at 2/21/25 7:26 AM:
--------------------------------------------------------------
h3. Dev Design
# Expand Decimal Precision for RelNode SUM
## This approach resolves the issue of `SUM` on `Decimal` potentially
returning `null`. However, it has the following limitations:
### Due to the boundary at 19, it cannot resolve the division precision issue
mentioned before.
### The two semantic inconsistencies highlighted in the RC cannot be
addressed.
# Remove the 19 Boundary and Unify SUM Decimal Precision Semantics Across All
Layers
## Adjustments to KylinRelDataTypeSystem:
### Remove the boundary at 19. For the return type of `SUM` on `Decimal`,
uniformly increase the input column's `Decimal` precision by 10, capped at the
platform maximum of 38.
### This behavior aligns with the implementations in Spark and Hive.
By unifying the semantics across all layers, this approach ensures consistency
in `SUM` precision handling, resolving both the `null` result issue and the
semantic inconsistencies.
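The proposed rule can be sketched as a small derivation function. This is an illustrative sketch only; the class and method names are hypothetical and do not reflect the actual `KylinRelDataTypeSystem` API:

```java
// Hypothetical sketch of the proposed SUM-over-Decimal return-type rule:
// add 10 to the input precision and cap at 38, matching Spark and Hive.
public class SumDecimalPrecision {
    static final int MAX_PRECISION = 38;      // platform-wide Decimal cap
    static final int PRECISION_INCREMENT = 10; // headroom added for SUM

    /** Precision of SUM over a decimal(inputPrecision, scale) input; scale is unchanged. */
    static int deriveSumPrecision(int inputPrecision) {
        return Math.min(inputPrecision + PRECISION_INCREMENT, MAX_PRECISION);
    }

    public static void main(String[] args) {
        // decimal(19,6) input -> decimal(29,6) SUM result
        System.out.println(deriveSumPrecision(19)); // prints 29
        // near the cap: decimal(35,6) -> decimal(38,6)
        System.out.println(deriveSumPrecision(35)); // prints 38
    }
}
```

Because the same rule would apply in every layer, the type Calcite derives and the type Spark executes with can no longer diverge.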
> SUM Query Decimal Precision Anomaly
> -----------------------------------
>
> Key: KYLIN-6045
> URL: https://issues.apache.org/jira/browse/KYLIN-6045
> Project: Kylin
> Issue Type: Bug
> Affects Versions: 5.0.0
> Reporter: Guoliang Sun
> Priority: Major
>
> When generating the Spark plan for a query, `AggregatePlan.buildAgg` adds a
> `cast` around the `sum` aggregation. The cast uses the input column's type
> rather than the measure's wider type, so the cast's precision is too small
> and the query returns `null`.
> h3. Example
> - The column precision in the Hive table is `decimal(19,6)`.
> - The model measure precision is `decimal(29,6)`.
> - When querying, the result will be `null`.
> In the Spark event log for the query, the `cast` precision is
> `decimal(19,6)`. Directly retrieving data from the Parquet file yields the
> following:
> - When the `cast` precision is `DECIMAL(19,6)`, the result is `null`.
> - When the `cast` precision is `DECIMAL(29,6)`, the result is correct.
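The overflow behind the `null` result can be reproduced with plain `BigDecimal` arithmetic. The row value and row count below are illustrative, chosen only to make the aggregate exceed 19 total digits; the point is that the sum fits `decimal(29,6)` but not `decimal(19,6)`, and Spark (with ANSI mode off) turns such an overflowing cast into `null`:

```java
import java.math.BigDecimal;

public class CastOverflowDemo {
    /** True if v is representable as decimal(precision, scale). */
    static boolean fits(BigDecimal v, int precision, int scale) {
        // setScale throws if rounding would be needed; our values already have scale 6
        return v.setScale(scale).precision() <= precision;
    }

    public static void main(String[] args) {
        // Largest value a decimal(19,6) column can hold (13 integer digits)
        BigDecimal row = new BigDecimal("9999999999999.999999");
        // Summing 1000 such rows needs 16 integer digits -> 22 total digits
        BigDecimal sum = row.multiply(BigDecimal.valueOf(1000));

        System.out.println(fits(sum, 19, 6)); // prints false -> cast overflows, Spark yields null
        System.out.println(fits(sum, 29, 6)); // prints true  -> result is correct
    }
}
```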