[
https://issues.apache.org/jira/browse/KYLIN-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928354#comment-17928354
]
Guoliang Sun edited comment on KYLIN-6045 at 2/21/25 7:26 AM:
--------------------------------------------------------------
h3. Dev Design
# Expand Decimal Precision for RelNode SUM
## This approach resolves the issue of `SUM` on `Decimal` potentially
returning `null`. However, it has the following limitations:
### Due to the boundary at 19, it cannot resolve the division precision issue
mentioned before.
### The two semantic inconsistencies highlighted in the RC cannot be
addressed.
# Remove the 19 Boundary and Unify SUM Decimal Precision Semantics Across All
Layers
## Adjustments to KylinRelDataTypeSystem:
### Remove the boundary at 19. For the return type of `SUM` on `Decimal`,
uniformly increase the input column's `Decimal` precision by 10, capped at the
platform maximum of 38.
### This behavior aligns with the implementations in Spark and Hive.
By unifying the semantics across all layers, this approach ensures consistency
in `SUM` precision handling, resolving both the `null` result issue and the
semantic inconsistencies.
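The proposed rule can be sketched as a small derivation function. This is an illustrative sketch only; the class and method names are hypothetical and do not reflect the actual `KylinRelDataTypeSystem` API:

```java
// Hypothetical sketch of the proposed SUM-over-Decimal return-type rule:
// add 10 to the input precision and cap at 38, matching Spark and Hive.
public class SumDecimalPrecision {
    static final int MAX_PRECISION = 38;      // platform-wide Decimal cap
    static final int PRECISION_INCREMENT = 10; // headroom added for SUM

    /** Precision of SUM over a decimal(inputPrecision, scale) input; scale is unchanged. */
    static int deriveSumPrecision(int inputPrecision) {
        return Math.min(inputPrecision + PRECISION_INCREMENT, MAX_PRECISION);
    }

    public static void main(String[] args) {
        // decimal(19,6) input -> decimal(29,6) SUM result
        System.out.println(deriveSumPrecision(19)); // prints 29
        // near the cap: decimal(35,6) -> decimal(38,6)
        System.out.println(deriveSumPrecision(35)); // prints 38
    }
}
```

Because the same rule would apply in every layer, the type Calcite derives and the type Spark executes with can no longer diverge.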
> SUM Query Decimal Precision Anomaly
> -----------------------------------
>
> Key: KYLIN-6045
> URL: https://issues.apache.org/jira/browse/KYLIN-6045
> Project: Kylin
> Issue Type: Bug
> Affects Versions: 5.0.0
> Reporter: Guoliang Sun
> Priority: Major
>
> When generating the Spark plan for a query, `AggregatePlan.buildAgg` adds a
> `cast` around the `sum` aggregation. The cast uses the input column's type
> rather than the measure's wider type, so the cast's precision is too small
> and the query returns `null`.
> h3. Example
> - The column precision in the Hive table is `decimal(19,6)`.
> - The model measure precision is `decimal(29,6)`.
> - When querying, the result will be `null`.
> In the Spark event log for the query, the `cast` precision is
> `decimal(19,6)`. Directly retrieving data from the Parquet file yields the
> following:
> - When the `cast` precision is `DECIMAL(19,6)`, the result is `null`.
> - When the `cast` precision is `DECIMAL(29,6)`, the result is correct.
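The overflow behind the `null` result can be reproduced with plain `BigDecimal` arithmetic. The row value and row count below are illustrative, chosen only to make the aggregate exceed 19 total digits; the point is that the sum fits `decimal(29,6)` but not `decimal(19,6)`, and Spark (with ANSI mode off) turns such an overflowing cast into `null`:

```java
import java.math.BigDecimal;

public class CastOverflowDemo {
    /** True if v is representable as decimal(precision, scale). */
    static boolean fits(BigDecimal v, int precision, int scale) {
        // setScale throws if rounding would be needed; our values already have scale 6
        return v.setScale(scale).precision() <= precision;
    }

    public static void main(String[] args) {
        // Largest value a decimal(19,6) column can hold (13 integer digits)
        BigDecimal row = new BigDecimal("9999999999999.999999");
        // Summing 1000 such rows needs 16 integer digits -> 22 total digits
        BigDecimal sum = row.multiply(BigDecimal.valueOf(1000));

        System.out.println(fits(sum, 19, 6)); // prints false -> cast overflows, Spark yields null
        System.out.println(fits(sum, 29, 6)); // prints true  -> result is correct
    }
}
```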